Nova Updates - Kilo Edition

Welcome to the PTL overview series, where we will highlight each of the projects and the upcoming features that will be in the next OpenStack release: Kilo. These updates are posted on the OpenStack Foundation YouTube channel, and each PTL is available for questions on IRC.

The Kilo design summit was in Paris, and according to Nova PTL Michael Still, it felt very productive. In his opinion, Nova has an increasingly clear vision of what users need and what the project needs to do.

“We understand that there are big architectural improvements we need to make. People signed on to those improvements and we have a plan. We know where we’re going and what we need to do. We just need to execute on it.”

The special thing about Nova is that it is part of the majority of OpenStack deployments. While Nova provides access to compute resources, it also ties into other OpenStack projects. For example, if you make an API request to OpenStack for a virtual machine, Nova builds the virtual machine and gives you access, often by orchestrating with other systems.

New Process Brings in Operators

One of the biggest changes in the project has been a new specification process intended to encourage operators to participate in reviews. Operator feedback is critical, and the Nova project has learned that tweaking things early with the input of operators is much less expensive than having to do it later in the process.

The new process begins with a formal design document, called a specification, that defines what is being implemented (every OpenStack project has them). These documents are reviewed separately from the code, and they are reviewed before the code.

Now, in order to make any change in the project beyond a bug fix, you need to propose a specification. The specification is reviewed, tweaks are suggested, and only then is the code worked on.

During the review of the specification, operators then share their perspective on the proposed change or addition. Operators often respond in the specification review by saying: “Hey this is a cool feature but you need to think about these things that would hurt me.”

This gives users the opportunity to comment, either on specific lines or more generally, and to participate in the interactive review process.

“Another advantage of this new process is that we can say, ‘We’ve released this feature, and this is exactly how we intend it to work.’ This makes the features more useful to operators and deployers.”

Looking for more information about specifications?

Juno Specifications

Still was positive about the progress made during Juno, and the success of changes made to the way blueprints and specifications are handled.

First, the new specification process slowed down progress — which was a good thing. Still was careful to note that this was done on purpose in order to give the proper attention to the higher quality features in development. An added benefit to this slower pace is that the team is rewriting features less than in the past.

Second, the team implemented fast-track approval for special-case specifications. For example, if a specification didn’t get into the previous project cycle, fast-track approval helps people pick up the specification work in the current cycle. The team doesn’t have to pause to get paperwork done, and work begins right away.

Third, the concept of trivial blueprints is helping to unblock work. In Juno, everything had to have a blueprint specification. In Kilo, that isn’t the case. With trivial blueprints, one can write a blueprint of just a few sentences in Launchpad and take it to the weekly Nova meeting. At the meeting, the author makes the case that the blueprint is trivial, and if the team agrees, it is approved on the spot.

Fourth, the addition of backlog blueprints has increased the number of people creating and working on blueprints. With backlog blueprints, the Nova core team and the user have a more efficient way to talk about a feature and what it would look like. Still has noted that large deployers love backlog blueprints, especially if they don’t have a lot of software engineers. They can write a short user story, describe the use cases or write about the problems they are having, and send it off for review and approval into the backlog. An added benefit is that new developers who are looking for opportunities to get involved now have a well-defined set of features they can work on right away.

Check out the summary of Blueprints implemented in Nova during Juno

Priorities for Kilo

Still then spent the rest of his webinar discussing the priorities that address the architectural problems we see in Nova. Here are a few highlights:

Cells v2

In order to really scale Nova, say from 500 hypervisor nodes up to 50,000, you’d break your Nova deployment into a series of sub-Novas, called cells. With growing numbers of very large deployments of Nova, users want to use Cells v2, but it isn’t yet feature complete. Rackspace, CERN and NeCTAR in Australia have all deployed cells for use in production. The goal of the cells v2 effort in Kilo is to produce a replacement for the current cells model, so that cells become a first-class Nova citizen.

Cells v2 etherpad

Continued object transition

Still emphasized that while this was internal, and decidedly not ‘sexy work,’ it is very important to the project. Moving to objects will make the code easier to read and more maintainable, and will lead to online database migrations. This work has been ongoing for a couple of releases now, but Still believes that we are getting close to the end.

Objects etherpad

Scheduler

The Nova scheduler isn’t flexible enough. While the ultimate plan is to pull the scheduler out into a separate OpenStack project, the first step is to make it more flexible in order to implement core features.

Scheduler etherpad

v2.1 API

In some deployments, operators don’t want to upgrade client libraries each month. The v2.1 API is an effort to strongly version the API in order to improve the user experience by allowing deployments to upgrade only once a year or every few years. That way, when a client connects to OpenStack, it can tell the server which version it understands, and the server will degrade its responses to that version.
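The negotiation described above can be pictured with a small Python sketch. It is a toy model written for this article, not Nova’s actual implementation: the function names, the supported version range, and the versions at which the `locked` and `tags` fields appear are all assumptions made up for illustration.

```python
# Hypothetical sketch of API version negotiation; not Nova's actual code.
# The version range and the per-version fields below are invented.
SUPPORTED_MIN = (2, 1)
SUPPORTED_MAX = (2, 3)


def parse_version(text):
    """Turn a string such as '2.1' into a comparable tuple (2, 1)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def render_server(server, requested):
    """Return the server record as a client at `requested` version expects it."""
    version = parse_version(requested)
    if not SUPPORTED_MIN <= version <= SUPPORTED_MAX:
        raise ValueError(f"version {requested} is not supported")
    body = {"id": server["id"], "name": server["name"]}
    # Fields added in later versions are hidden from clients that
    # only understand an earlier version.
    if version >= (2, 2):
        body["locked"] = server["locked"]
    if version >= (2, 3):
        body["tags"] = list(server["tags"])
    return body


server = {"id": "abc", "name": "web-1", "locked": False, "tags": ["prod"]}
old_view = render_server(server, "2.1")  # oldest client: base fields only
new_view = render_server(server, "2.3")  # newest client: all fields
```

The point of the design is that an old client keeps receiving exactly the response shape it was written against, even after the server has been upgraded.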

Functional testing

Nova has had unit tests and integration testing, but no functional testing. The testing work in Kilo will make it easier to test and debug Nova race conditions and edge cases.

Functional testing etherpad

No downtime DB upgrades

Operators say that one of their biggest problems when upgrading is running the database migrations. Currently, whenever a change is made to the schema, adding and deleting columns happens at the same time, which forces a database and API outage. In Kilo, we’re attempting to separate the changes you can make live, such as adding columns, from the contraction of the database, which removes the bits that aren’t used anymore. This way, an operator can move to a newer schema in a live environment without affecting users, and later take a planned outage to delete all the columns that are no longer used.
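As a rough illustration of that split, here is a minimal Python sketch using the standard-library `sqlite3` module. It is not Nova’s migration tooling: the `instances` table, the `host` to `hostname` change, and the function names are hypothetical, chosen only to show the additive step and the destructive step as separate operations.

```python
import sqlite3

# Hypothetical schema for illustration; not Nova's real tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, host TEXT)")
conn.execute("INSERT INTO instances (host) VALUES ('node-1')")


def expand(conn):
    """Additive change: old code that still reads `host` keeps working."""
    conn.execute("ALTER TABLE instances ADD COLUMN hostname TEXT")


def migrate_data(conn):
    """Copy values into the new column while the service stays up."""
    conn.execute("UPDATE instances SET hostname = host WHERE hostname IS NULL")


def contract(conn):
    """Destructive change, deferred to a planned outage: drop `host`."""
    # Rebuild the table instead of using DROP COLUMN, which older
    # SQLite releases do not support.
    conn.execute(
        "CREATE TABLE instances_new (id INTEGER PRIMARY KEY, hostname TEXT)"
    )
    conn.execute("INSERT INTO instances_new SELECT id, hostname FROM instances")
    conn.execute("DROP TABLE instances")
    conn.execute("ALTER TABLE instances_new RENAME TO instances")


def columns(conn):
    """List the column names of the instances table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(instances)")]


expand(conn)
migrate_data(conn)
live_columns = columns(conn)   # old and new columns coexist here
contract(conn)
final_columns = columns(conn)  # only the new column remains
```

Between `expand` and `contract`, both columns exist, which is the window in which old and new code can run side by side.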

Continuous Integration

There are some aspects of Nova that are not continuously integrated. Still plans to continue to expand CI coverage so that users have the best possible quality of software.

Other approved specifications

Enforcing unique instance UUIDs in the SQL database
API improvements (get instance lock status for example, better result pagination, tagging for instances)
Hyper-V and libvirt SMBFS support
Ironic config drive support
Continued work on NFV support in libvirt (NUMA, CPU pinning, etc.)
VMware support for OVA, ephemeral disks, SPBM and vSAN

There are many other specifications proposed (both approved and in review) as well, and you can see them on Still’s blog here.

And be sure to watch Michael Still’s webinar in its entirety: