It’s time for the community to help determine the winner of the 2020 Open Infrastructure Summit Superuser Awards. The Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner after the community has had a chance to review and rate nominees.
Now, it’s your turn.
The Workday Private Cloud (WPC) Team is one of eight nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees, and rate them before the deadline of September 28 at 11:59 p.m. Pacific Daylight Time.
Rate them here!
Who is the nominee?
Workday Private Cloud (WPC) Team
How has open infrastructure transformed the organization’s business?
- The adoption of open infrastructure allowed Workday to become more agile and gave developers a faster time to market.
- Migrating applications from bare metal to open infrastructure allowed Workday to apply critical OS security updates in a matter of days, a process that used to take months. Workday’s growth and agile upgrade model required a highly scalable, API-driven infrastructure management framework.
- With open infrastructure and a good CI/CD process, Workday is able to create and delete over 30,000 VMs within a maintenance window of less than 45 minutes. The success rate for these operations is over 99%.
- Lastly, the open infrastructure platform accelerated the deployment of Kubernetes clusters in Workday data centers. The scalability and reliability of the platform are helping Workday meet its SLA.
How has the organization participated in or contributed to an open source project?
Workday has been actively involved in open infrastructure projects, participating in every Open Infrastructure Summit since the inception of its private cloud team. The team has presented Workday’s stories on scalability, deployment, performance, and operational challenges at the past six OpenStack Summits.
Workday engineering recently added support for encryption at rest on Ceph. It has also contributed to the Chef cookbooks used for deploying open infrastructure, submitted bug fixes, and participated in code reviews.
Workday has also actively participated in several operators events and meetups. In 2018, Workday organized several open infrastructure meetup events in the East Bay Area.
What open source technologies does the organization use in its open infrastructure environment?
The organization relies heavily on open source technologies. In our open infrastructure environment, we currently use Keystone, Nova, Heat, Glance, Neutron, Kolla, and Ceph.
What is the scale of your open infrastructure environment?
WPC currently runs 43 open infrastructure clusters across five data centers in the U.S. and Europe. Together they provide 422,000 cores and host 30,000 virtual machines in production, along with 70 Kubernetes clusters.
What kind of operational challenges have you overcome during your experience with open infrastructure?
- We have overcome several performance, scaling, and operational challenges. Workday’s application deployment and upgrade model puts a significant load on the open infrastructure controllers.
- We worked with the community to improve concurrent VM boot time by contributing several features and code changes to Nova. These changes were presented in detail at the Berlin Summit in 2018.
- Managing deployments of thousands of servers is challenging for a very small operations team. Our team built a number of observability tools that allow it to monitor events and identify performance bottlenecks promptly.
- To make our deployments scalable, we iteratively changed our architecture and adopted a design that is horizontally scalable.
How is this team innovating with open infrastructure?
Workday is innovating its data center architecture. Some of the biggest architectural changes are in network and storage solutions.
These changes are driven by business needs to scale Workday’s data centers to hundreds of thousands of servers.
The Workday Private Cloud (WPC) team is working on solutions that make its open infrastructure platform more scalable, more manageable, and easier to upgrade. One use case requires us to support Border Gateway Protocol (BGP) with Neutron, a solution that is not yet present in the latest release. The team is working on developing a blueprint.