Verizon’s cloud platform, built on an OpenStack infrastructure distributed across 32 global nodes, is used by the networking team to provide network services to commercial customers. Because these applications are commercial products such as firewalls, SD-WAN and routing functions provided by third-party companies rather than owned by Verizon, they sometimes had odd interactions with the underlying infrastructure and with each other.
Issue: Uneven vendor application performance
The product engineering team realized that the applications owned by Verizon’s partners were each configured differently, and that those configuration differences had a significant effect on how the applications behaved in the environment.
Variance in SDN vendor performance became a pressing issue that significantly affected throughput in the field. For example, it was discovered that in many cases, turning on encryption cut throughput in half. With traffic moving through multiple systems, it became difficult to determine the cause (or in some cases, causes) of problems and the fixes needed. Vendors also varied dramatically in their ability to take full advantage of virtualized applications and infrastructure and to optimize their applications for OpenStack, which became a major challenge.
Solution: Create platform and processes to address issues
Verizon tackled this inconsistency by building a production engineering lab with full testing capabilities. This lab environment, used for product development, production support, and troubleshooting customer configurations, provides a clear and efficient feedback loop that informs product managers, sales teams and customers with real-world results. For instance, when a customer decides to run voice traffic through a firewall (not a common configuration), Verizon can use the lab to analyze all the nuances of that configuration. The lab is also used to work closely with vendors to optimize their virtualized applications. It can test both data center environments and edge devices.
As a consequence of developing the production engineering lab, Verizon can now insist on thorough and consistent testing of each vendor’s application. Verizon is able to take customers’ production traffic and run it through the lab, making it possible to reproduce customers’ issues in the lab environment. By verifying each application, testing its performance against factors like encryption, and making full performance testing of all integrated service chains automated and mandatory, Verizon is able to provide a much higher level of value to its customers and prevent potentially unpleasant surprises.
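To make the idea of an automated, mandatory performance check concrete, here is a minimal sketch of a gate that compares each application's throughput with and without encryption. The article does not describe Verizon's actual tooling, so the `Measurement` type, the function names, the 25% threshold and the sample numbers are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One lab run for one vendor application (all names are hypothetical)."""
    app: str
    encrypted: bool
    throughput_mbps: float


def encryption_penalty(measurements, app):
    """Return the fraction of throughput lost when encryption is enabled."""
    plain = [m.throughput_mbps for m in measurements
             if m.app == app and not m.encrypted]
    enc = [m.throughput_mbps for m in measurements
           if m.app == app and m.encrypted]
    if not plain or not enc:
        raise ValueError(f"need encrypted and unencrypted runs for {app}")
    baseline = sum(plain) / len(plain)
    if baseline <= 0:
        raise ValueError(f"non-positive baseline throughput for {app}")
    return 1.0 - (sum(enc) / len(enc)) / baseline


def failing_apps(measurements, max_penalty):
    """List applications whose encryption penalty exceeds max_penalty."""
    apps = sorted({m.app for m in measurements})
    return [a for a in apps
            if encryption_penalty(measurements, a) > max_penalty]


# Made-up sample data, NOT measured results from the article.
SAMPLE = [
    Measurement("firewall-a", encrypted=False, throughput_mbps=1000.0),
    Measurement("firewall-a", encrypted=True, throughput_mbps=500.0),
    Measurement("sdwan-b", encrypted=False, throughput_mbps=800.0),
    Measurement("sdwan-b", encrypted=True, throughput_mbps=720.0),
]

if __name__ == "__main__":
    print(failing_apps(SAMPLE, max_penalty=0.25))
```

A check like this could run for every vendor application and service chain before release, so that a large encryption penalty is caught in the lab rather than in the field.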
User Story: Financial Services firm with high security and performance requirements
Customers look to Verizon with high expectations, and Verizon makes sure to work with each of its vendors to provide the testing and support that they need most to meet their SLAs.
One of Verizon’s customers began to experience problems with low bandwidth and application microfreezing. This was a big problem for their security application. Testing soon showed that this behavior was common to all security applications running in the virtualized environment. The team at Verizon immediately began to change how the VMs were stood up in the environment, without needing to change any aspect of the underlying hosted infrastructure itself.
Because the results of this case study affected nearly every one of Verizon’s vendor applications, particularly where customers had latency-sensitive deployments and transports larger than 100 Mbps, the company has now developed new standards to support customer configurations. All VMs for future applications are now automatically pinned to resources to avoid resource contention, vendors are required to support SR-IOV networking deployment, and customers are cautioned about throughput behavior if they choose to turn on traffic encryption.
Customers want reliable performance, but they often put unexpected demands on their services. By building a full lab and testing center, Verizon was able to test customer configurations and troubleshoot issues down to the individual feature level. As a result, Verizon as a large operator now has even more capacity to cooperate with all of its vendors: ensuring the integrated service chains perform as expected, solving issues uncovered during development or production, and quickly addressing any customer-related issues.
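The three standards above can be sketched as a simple compliance check. The article does not say which settings Verizon applies, so this is an assumption: the sketch uses the Nova flavor extra spec `hw:cpu_policy=dedicated` as one common way to pin vCPUs in OpenStack, and the Neutron port attribute `binding:vnic_type=direct` as one of the vNIC types used for SR-IOV. The `check_deployment` function and its message strings are hypothetical.

```python
# Assumed OpenStack settings; the article does not confirm the exact values.
PINNED_EXTRA_SPECS = {"hw:cpu_policy": "dedicated"}  # Nova flavor extra spec
SRIOV_VNIC_TYPE = "direct"  # Neutron binding:vnic_type for an SR-IOV VF


def check_deployment(flavor_extra_specs, port_vnic_types, encryption_enabled):
    """Return a list of findings for one VM deployment (hypothetical helper).

    flavor_extra_specs: dict of the flavor's extra specs
    port_vnic_types:    list of binding:vnic_type values, one per port
    encryption_enabled: whether the customer turned on traffic encryption
    """
    findings = []

    # Standard 1: VMs are pinned to resources to avoid contention.
    if flavor_extra_specs.get("hw:cpu_policy") != "dedicated":
        findings.append("error: flavor does not pin vCPUs")

    # Standard 2: vendors must support SR-IOV networking.
    for index, vnic_type in enumerate(port_vnic_types):
        if vnic_type != SRIOV_VNIC_TYPE:
            findings.append(f"error: port {index} is not an SR-IOV port")

    # Standard 3: customers are cautioned about encryption and throughput.
    if encryption_enabled:
        findings.append("warning: encryption may reduce throughput")

    return findings


if __name__ == "__main__":
    for line in check_deployment({}, ["normal"], encryption_enabled=True):
        print(line)
```

Treating encryption as a warning rather than an error mirrors the text: pinning and SR-IOV are mandated, while encryption remains the customer's choice.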
The 2020 OpenDev event series was held virtually. The photo is from OpenDev 2017, which focused on edge computing and where Beth Cohen also presented a session.
- Verizon’s Optimum Performance Relies on Owning the Testing Process - August 6, 2020