This article originally appeared on Matt Fischer’s blog. Matt Fischer is a principal engineer at Time Warner Cable. He’s also a brewer of beer and a hiker of mountains in his spare time.
I learned a bunch of things in Paris: [French food is amazing](http://www.yelp.com/biz/le-volant-basque-paris), and you should always upgrade OVS. But the most important thing I learned in Paris?
> __All operators have the same problems.__
Every operator session was a revelation for me. As it turns out, I’m not the only one writing Icinga check jobs, maintaining Ansible scripts, trying to figure out how to push out updates, or fighting with OVS. These sessions not only provided validation that we’re not doing stuff totally wrong, but also allowed everyone to share solutions. Below are some of the themes I found particularly interesting or relevant to this premise.
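As a concrete example of the kind of thing every operator seems to reinvent independently, here is a minimal sketch of an Icinga/Nagios-style check for an OpenStack API endpoint. The URL, port, and thresholds are placeholders, not anything from a real deployment; the only real contract is the standard Nagios exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) that Icinga plugins follow:

```python
"""Minimal Icinga/Nagios-style check for an OpenStack API endpoint.

A sketch only: the endpoint below is a placeholder. Icinga plugins
report status via the standard Nagios exit codes and a one-line
human-readable message on stdout.
"""
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_endpoint(url, timeout=5):
    """Return a (status, message) tuple for the given API endpoint."""
    try:
        resp = urllib.request.urlopen(url, timeout=timeout)
        code = resp.getcode()
    except urllib.error.URLError as exc:
        # Connection refused, DNS failure, timeout, etc.
        return CRITICAL, "CRITICAL: %s unreachable (%s)" % (url, exc.reason)
    except Exception as exc:
        return UNKNOWN, "UNKNOWN: %s (%s)" % (url, exc)
    if 200 <= code < 400:
        return OK, "OK: %s returned %d" % (url, code)
    return WARNING, "WARNING: %s returned %d" % (url, code)


def main(url):
    """Plugin entry point: print the message, return the exit code."""
    status, message = check_endpoint(url)
    print(message)
    return status
```

Wired into Icinga as a check command, the script would end with something like `sys.exit(main("http://controller:8774/"))`, where `controller:8774` stands in for your real nova-api endpoint.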
###Upgrades
One topic in which all operators share an interest is upgrades. Only a few operators have real experience with them yet, but upgrades have already been a pain point for many. Some have resorted to fork-lift style, “bring up a new cluster” upgrades, which are hard on customers and require extra hardware. How do we solve this? Some projects have upgrade guides, but documentation on a holistic upgrade is difficult to find, especially information on ordering (Cinder before Nova, for a contrived example). Issues specifically called out in the session were config changes (deprecations and new additions) and database migrations (including rollback). Rollback is especially worrying because it is not well tested. There was no solution here beyond a resolve to share information on upgrading via the [Operators Mailing List](http://lists.openstack.org/pipermail/openstack-operators/).
See notes from the rest of the Upgrades session [here](https://etherpad.openstack.org/p/kilo-summit-ops-upgrades).
###CI/CD & Packaging
Another issue operators face after an OpenStack deployment is how to get fixes and new code out to the nodes. An upstream bug fix might take a few days, plus a week for a backport, plus a month for the distro to pick it up. This means that even if operators fix a bug themselves, there is still a delay in getting the fix released, and during that delay you may be impacting customers who are not that patient. The solution for many is a custom CI/CD system that builds “packages” of some sort, whether distro packages or custom-built virtualenvs. It was interesting to hear the myriad of solutions people have here. We use a toolchain quite similar to the upstream OpenStack toolchain that outputs Ubuntu packages, though even with this method we still rely on dependencies and libraries provided by the Ubuntu Cloud Archive as much as possible.
There were a bunch of talks on this subject, and not just operator talks. Here are links to the ones I attended:
- CI/CD Pipeline to Deploy and Maintain an OpenStack Iaas Cloud (HP)
- CI/CD in Practice (Comcast)
- Building the RackStack (Rackspace)
If you know of other talks I should go back and watch, please comment.
###Puppet
We use Puppet to configure and manage our OpenStack deployment, so this was an area of particular interest to me. It was great to see a completely full room, with everyone focused on improving how OpenStack is configured with Puppet. From discussions of new check jobs to a conversation about how to better handle HA, the session showed a real sense of community around Puppet.
You can see the notes from the session [here](https://etherpad.openstack.org/p/puppet-openstack-paris-agenda).
The second puppet session was more of a “how are you solving X” and “how could it be better.” This was also a great session with some [interesting notes](http://kilodesignsummit.sched.org/event/61f2d6c7c34993193223c1f9b0c5e343#.VHk5rqTF-38).
###Towards a Project?
Finally, one of the more interesting things happened later in the week, when Michael Chapman and Dan Bode grabbed me in the lobby and asked me to preview an email that was about to go out. In brief, they were proposing an operations project, born of the same realization I’d made: we’re all solving the same issues. The email led to an impromptu meeting of about 30 people in the lobby of the Meridien hotel and the beginnings of an [Operations Project](https://wiki.openstack.org/wiki/Operations).
It was not the ideal venue, but we still had a great discussion on a few topics. The first: can we have a framework that lets us share code? We all agreed that this project’s purpose is not to bless specific tools (Icinga, for example) but to let us share Icinga scripts alongside scripts for other monitoring tools. This had been tried before with a GitHub repo, which attracted only a couple of contributions, so hopefully this effort will fare better. We then dove into the different tools people use for packaging and whether any could be adapted for general-purpose use. A good tool for building Ubuntu/Debian packages seemed to be in great demand, and adding Debian support to Anvil seemed worth investigating further.
Other topics of discussion included:
- log aggregation tools and filters
- ops dashboarding tools
- Ansible playbooks
You can read the [Etherpad notes](https://etherpad.openstack.org/p/kilo-summit-ops-opsprogram) to see which options were discussed for each, but the best idea is to catch up on the mailing list threads, which are still ongoing, and contribute if you have anything to share.