Making OpenStack production ready with Kubernetes and OpenStack-Salt - Part 1

This tutorial introduces and explains how to build a workflow for life cycle management and operation of an enterprise OpenStack private cloud coupled with OpenContrail SDN running in Docker containers and Kubernetes.

This blog post is divided into five parts. The first explains the deployment journey toward a continuous DevOps workflow. The second offers steps on how to build containers and integrate them with your build pipeline. The third details the orchestration of containers with a walkthrough of Kubernetes architecture, including plugins and prepping OpenStack for decomposition. In the fourth, we introduce the tcp cloud theory of a “single source of truth” solution for central orchestration. In the fifth and final part, we bring it all together, demonstrating how to deploy and upgrade OpenStack with OpenContrail.

We decided to split the content across two blog posts for easier reading. This first post covers creating a continuous DevOps workflow and building containers.

OpenStack deployment evolution

At first glance you might ask, "Why would I add additional tooling on top of my existing deployment tools?" It’s important to explain why anyone should consider Docker and/or Kubernetes as tools for running OpenStack in production. At tcp cloud, we have deployed and now operate a growing number of enterprise private clouds in production on OpenStack. Each OpenStack deployment is different (storage, network, modules) and any number of service combinations exist, given the varying needs of customers. There is one thing all cloud deployments have in common, however: deployment phases and initial goals. These have become evident, and all customer journeys lead to an individual cloud maturity model. Let’s divide these phases of evolution into three sections.

RagTag

Characterized as untidy, disorganized, inharmonious. This is always the first step and sometimes the last for anyone who tries OpenStack. Every company or individual considering OpenStack as a private cloud solution has a common first goal: deploying OpenStack!

This journey typically starts on openstack.org and lands on deployment tools like Puppet, Chef, Ansible, TripleO, Helion, Fuel, etc. It is almost impossible to identify the right way to get OpenStack up and running without any previous experience. Even though all of them promise a simple and quick setup of the whole environment, you will probably end up with the following logical topology of a production environment. This diagram shows a typical production service-oriented architecture of a single region OpenStack deployment in High Availability. As you can see, this diagram is very complex.

[Figure: logical topology of a single-region, highly available OpenStack production deployment]

The next thing you’ll discover is that current deployment tools cannot cover a complete production environment (bonding, storage setups, service separation, etc.). This is when the cloud team starts to ask themselves, "Do we really need to set up an environment in one day, deploy in five minutes on a single machine or through a nice clickable graphical user interface (GUI)?" and "Are these really the key decision points that determine the right choice for our future production environment?" Standing up a stack is easy, but are deployment tools one-offs that cannot be run twice, or are they repeatable? What about life cycle issues like patching, upgrades and configuration changes?
This brings us back to the statement that no one can choose the right solution without the experience of “day two operations.”

Ops

The second phase is called Ops or, as we mentioned, “day two operations.” Here’s a typical example: OpenStack is up and running, then you get an email from your security team that says, “Please upgrade or reconfigure RabbitMQ to address a security vulnerability.” How can you do it with confidence? Your deployment tool cannot be used again. Now you’re dealing with day-to-day operations, which are more difficult than the deployment itself.

This led us to define criteria for everyday operations such as patching, upgrades, business continuity, disaster recovery, automatic monitoring, documentation, backups and recovery. The general expectation is that Ops can be managed by the deployment tool. However, the reality is that the deployment tool does not do Ops.

As already mentioned, deployment tools are one-offs. Therefore, you start to build ad hoc Ops tooling: random scripts (to restart a service or change configuration), manual hacks and tribal knowledge (production depends on specific people who know how to manage it).

The ideal workflow needs to include concepts like repeatable patterns, a "single source of truth" (infrastructure as code), best practices, rigid-to-agile, declarative (desired state) configuration and a personalized cloud experience.
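To make the declarative (desired state) idea concrete, here is a minimal sketch in Python. It is not how OpenStack-Salt is implemented; the function name, keys and version strings are all hypothetical. It only illustrates why a desired-state run is repeatable where a one-off script is not: the second run finds nothing left to change.

```python
"""Sketch of a declarative, idempotent "desired state" run (hypothetical names)."""


def converge(current, desired):
    """Return (new_state, changes) needed to move current to desired.

    Only keys whose values differ are reported, so a repeated run reports nothing.
    """
    changes = {
        key: (current.get(key), value)
        for key, value in desired.items()
        if current.get(key) != value
    }
    new_state = dict(current)
    new_state.update(desired)
    return new_state, changes


# Hypothetical desired state kept in version control (the single source of truth).
desired = {"rabbitmq_version": "1.1", "rabbitmq_ssl": True}
# Hypothetical state discovered on a running node.
current = {"rabbitmq_version": "1.0", "rabbitmq_ssl": True}

state, first_run = converge(current, desired)
state, second_run = converge(state, desired)

print(first_run)   # the single difference found on the first run
print(second_run)  # empty: nothing left to change
```

The same description can be applied on day one and again on day two, which is the property the deployment tools discussed above tend to lack.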

We did not want to develop this workflow by ourselves, and we found OpenStack-Salt to be an optimal tool. It’s been an official project under the big tent since May 2016. Its service-oriented approach covers almost all of the above-mentioned parameters of an ideal workflow. It offers a production-ready, proven architecture managed as code.


However, our production infrastructure still looks like the figure below. It consists of about 23 virtual machines on at least three physical KVM nodes just for the cloud control plane. We have to upgrade, patch and maintain 23 operating systems to provide flexibility and a service-oriented architecture.

[Figure: production cloud control plane of about 23 virtual machines on three physical KVM nodes]

DevOps

Based on the previous ideas, we asked the question, “What about treating infrastructure as a microservice?” This brings us from Ops to DevOps, which really means treating OpenStack as a set of applications.


Our defined criteria are that it must be composable, modular, repeatable and immutable, and that it must split applications from infrastructure. It has to break monolithic VMs into containers and micro-services.

We also decided that we didn’t want to reinvent the wheel by creating a new project, but rather to reuse the existing knowledge invested in the OpenStack-Salt project.

These steps depict the evolution of OpenStack deployment over the last two to three years. Now let’s take a look at how to build containers and micro-services instead of the monolithic deployments of the past. The following sections explain the DevOps workflow.

How to build containers

The configuration management era started a couple of years ago, when tools like Fabric, Puppet, Chef and later SaltStack or Ansible changed the approach to deploying applications and infrastructure in companies. These tools ended the era of bash scripting and brought repeatable, idempotent and reusable patterns with serialized knowledge. Companies invested huge efforts into this approach, and the community now offers a way to deploy OpenStack with almost every configuration management tool.

Recently the era of micro-services (accelerated by Docker containers) dawned and, as we described in the DevOps workflow, containers should encapsulate services to help operate and treat infrastructure as micro-service applications. Unfortunately, Docker pushes configuration management tools and serialized knowledge off to the side. Some experts even predict the end of configuration management with Docker. If you think about it, you realize that Docker brings dockerfiles and entry points, which evoke the déjà vu of bash scripting again. So why did we invest so much into a single source of truth (infrastructure as code), only to start from scratch? This is the question we had on our minds before we started working on the concept for the containerization of OpenStack.
The first requirement was building Docker containers in a more effective way than just bashing everything. We took a look at OpenStack Kolla, CoreOS and other projects that provide an approach for getting OpenStack into containers.

Kolla uses Ansible for container builds and Jinja for the parametrization of dockerfiles. The concept is very interesting and promising for the future. However, it is a completely new way of serializing knowledge and operating OpenStack in production. Kolla tries to be universal for Docker container builds. What is missing, however, is orchestration or a recommended workflow for running in production beyond a single machine with host networking. The Kolla-kubernetes project started almost a month ago, but it is still too early to run it in an enterprise environment. A lot of work must be done to bring a more operational approach. Basically, we want to reuse what we have in OpenStack-Salt as much as possible, without starting a new project or maintaining two solutions.

We defined two criteria for running OpenStack services in containers:

Use configuration management for building Docker containers as well as for a standard OS.
Reuse the existing solution: do not start from scratch, rewrite all knowledge into another tool just for container builds, and then maintain two worlds.

We created a simple repository, Docker-Salt, which builds containers with exactly the same Salt formulas used for dozens of production deployments. This enables knowledge reuse: for example, when someone patches configuration in a standard OpenStack deployment, a new version of the Docker container is automatically built as well. It provides the opportunity to use a single tool for virtual machine OpenStack deployments as well as micro-services. We can mix VM and Docker environments and operate the whole environment from one place without a combination of two or three tools.

Build the pipeline

The following diagram shows the build pipeline for Docker images. This pipeline is completely automated by Jenkins CI.

The reclass metadata model is deployment-specific and is the single source of truth (described above). It contains configuration such as the Neutron plugin, Cinder backends, Nova CPU allocation ratio, etc. It’s a Git repository with a simple YAML structure.
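The layering idea behind such a metadata model can be sketched in a few lines of Python. This is a simplified illustration, not reclass's actual merge algorithm, and every key and value below is a hypothetical stand-in for what would normally live in YAML files in the Git repository.

```python
"""Sketch of layered metadata merging, in the spirit of a reclass-style model."""


def deep_merge(base, override):
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical shared defaults.
defaults = {
    "neutron": {"plugin": "ml2"},
    "cinder": {"backends": ["lvm"]},
    "nova": {"cpu_allocation_ratio": 16.0},
}

# Hypothetical deployment-specific overrides.
deployment = {
    "neutron": {"plugin": "contrail"},
    "nova": {"cpu_allocation_ratio": 8.0},
}

model = deep_merge(defaults, deployment)
print(model)
```

Keeping only the differences in the deployment-specific layer is what lets one set of formulas serve many differently configured clouds.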

OpenStack-Salt formulas are currently used for tasks such as installing a package, configuring and starting a service, setting up users or permissions, and many other common tasks.

Docker Salt provides scripts to build, test and upload Docker images. It contains dockerfile definitions for the base image and for all OpenStack support and core micro-services.

The build process downloads the metadata repository and all Salt formulas to build the salt-base image with a specific tag. The tag can be an OpenStack release version or any other internal versioning. Every configuration change in OpenStack requires a rebuild of this base image. The base image is used to build all other images. These images are uploaded to a private Docker registry.
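The ordering described above (base image first, then every service image, each pushed to the registry) can be sketched as a small planning function. The registry host, the service list and the function name are hypothetical; only the salt-base image name and the idea of a release-based tag come from the text, and the actual Docker Salt scripts may be organized differently.

```python
"""Sketch of the image build order: base image first, then service images."""


def build_plan(registry, tag, services):
    """Return the ordered list of (action, image) steps for one build run."""
    base = f"{registry}/salt-base:{tag}"
    steps = [("build", base), ("push", base)]
    for service in services:
        image = f"{registry}/{service}:{tag}"
        steps.append(("build", image))
        steps.append(("push", image))
    return steps


# Hypothetical registry and services; the tag is an OpenStack release name.
plan = build_plan("registry.example.local", "mitaka", ["glance", "keystone"])
for action, image in plan:
    print(action, image)
```

Because every image shares the tag of the base image it was built from, a configuration change results in a complete, consistently versioned set of images.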

[Figure: build pipeline for Docker images]

The Docker Salt repository contains compose files for local testing and development. OpenStack can be run locally without Kubernetes or any other orchestration tool. Docker Compose will be part of functional testing during the CI process.

Changes in current formulas

The following review shows the changes required for salt-formula-glance. Basically, we had to prevent Glance services from starting and sync_db operations from running during the container build. Then we had to add entrypoint.sh, which, instead of being a huge bash script that replaces environment variables with specific values, simply runs a Salt highstate. The highstate reconfigures the config files and runs sync_db.
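As a rough illustration of such a thin entry point, the sketch below builds a masterless salt-call command and hands runtime values to Salt as pillar data instead of substituting them into config files by hand. It is written in Python rather than shell, and it is an assumption about how values could be passed, not the contents of the actual entrypoint.sh; the OS_CONF_ prefix and the variable names are hypothetical.

```python
"""Sketch of a thin container entrypoint that hands configuration to Salt."""
import json


def highstate_command(environ, prefix="OS_CONF_"):
    """Build a masterless salt-call command, passing prefixed env vars as pillar data."""
    pillar = {
        key[len(prefix):].lower(): value
        for key, value in sorted(environ.items())
        if key.startswith(prefix)
    }
    return ["salt-call", "--local", "state.highstate", "pillar=" + json.dumps(pillar)]


# Hypothetical environment as it might be injected into a Glance container.
cmd = highstate_command({"OS_CONF_DB_HOST": "10.0.0.5", "PATH": "/usr/bin"})
print(cmd)
# A real entrypoint would then execute it, e.g. subprocess.run(cmd, check=True).
```

All of the templating logic stays in the formulas, so the entry point itself contains almost nothing that needs maintaining.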

You might notice that Salt is not uninstalled from the container. We wanted to know the size difference between a container with and without Salt. The Glance container with Salt is about 20MB larger than Glance by itself. The reason is that both are written in Python and use the same libraries.

The second post will offer more information on container orchestration and live upgrades.

This post first appeared on tcp cloud’s blog. Superuser is always interested in community content, email: [email protected]
