Global research organizations gathered at OpenStack Day CERN to share how OpenStack and open collaboration support scientists answering the universe’s big questions.

GENEVA—You never know who you might run into at the world’s largest particle physics laboratory. The hallways of the European Organization for Nuclear Research (CERN) campus are full of scientists, heads down, studying particle collisions. During my first visit, I walked past a couple who have been at CERN for over 60 years working the kinks out of the synchrocyclotron, spotted Belmiro Moreira showing off the rack of CERN’s first OpenStack cloud, and watched engineers climbing through the Large Hadron Collider’s ALICE experiment.

Jonathan Bryce with Belmiro Moreira, computing engineer at CERN, and the rack of servers that has been in place since CERN first deployed OpenStack in 2011.

Each hallway collision produces an unmistakable wave of energy.

In the auditorium where the discovery of the Higgs boson was announced, the first OpenStack Day CERN gathered 200 people in person and an additional 180 people from 30 countries via livestream. The two-day event offered one day of talks, ranging from “Supporting DNA Sequencing at Scale with OpenStack” to “The Cookbook of Distributed Tracing for OpenStack,” plus another day of unforgettable site visits to the ATLAS and ALICE experiments, as well as CERN’s onsite data center. (More on these to come!)

The crowd included researchers seeking answers to questions like: What is the composition of the universe, and how do galaxies form? How is the world’s climate evolving? What is the DNA sequence of common cancers? The rest of the auditorium was filled with software engineers building and operating the infrastructure to power this research.

Finding the answers to these questions requires years (or decades) of research that produces enormous volumes of data.

Here are a few of the organizations that presented not only their research use cases but also how OpenStack powers their workloads, including high performance computing (HPC):

  • The Large Hadron Collider (LHC) at CERN is a new frontier in energy and data volume, with its experiments generating up to 88 petabytes per year in Run 2. More than 12,000 physicists stationed all over the world use the LHC to answer the big questions about the evolution of the universe and rely on its computing infrastructure for this data analysis. This includes an OpenStack deployment of almost 300,000 cores across three data centers, which uses projects like Ironic for managing hardware and Magnum for managing its growing Kubernetes environment.
  • The Square Kilometre Array (SKA) is a collaborative effort among organizations from 13 countries to design the world’s largest radio telescope, one that can measure phenomena like how galaxies merge to create stars and changes in the space-time continuum by tracking stars. Spanning a total of 50 years, the SKA project has a significant data challenge, as it’s ingesting over 700 gigabytes of data per second. Stig Telfer, CTO of StackHPC, discussed how OpenStack could be used to handle this HPC use case, emphasizing the importance of collaborative efforts like the Scientific SIG to advance this research.
  • The NASA Center for Climate Simulation provides computing resources for NASA-sponsored scientists and engineers. Projects supported by its OpenStack environment include the Arctic Boreal Vulnerability Experiment (ABoVE), High Mountain Asia Terrain (HiMAT), and the Laser Communications Relay Demonstration (LCRD) Project. Benefits of the OpenStack private cloud deployment supporting this research include data locality, a better platform for lifting and shifting traditional science codes, and OpenStack APIs that provide a unifying vision for how to manage datacenter infrastructure, so the team can avoid creating unicorn environments.
  • With goals like transforming the research landscape for a wide range of diseases, sequencing the genomes of 25 unsequenced UK organisms, and sequencing the DNA of all life on Earth in 10 years, the Wellcome Sanger Institute supports projects with data-intensive requirements. Its open infrastructure environment integrates OpenStack, Ceph, and Ansible to address its HPC use case.

“The many OpenStack deployments across multiple scientific disciplines demonstrates the results of shared design, development and collaboration,” said Tim Bell, compute and monitoring group leader in the IT department at CERN, adding that the ties between open source and open science, both built on large international communities, were a recurring theme at the event.

In the afternoon, it was time to talk vGPU and FPGA support, OpenStack and Kubernetes integration, and distributed tracing. These sessions provided a different perspective, with OpenStack contributors sharing how the community is evolving the software to improve support for such data-intensive use cases.

The use cases and technical talks illustrated the need for cross-community, open collaboration without boundaries, the theme of the welcoming keynote by Jonathan Bryce, executive director of the OpenStack Foundation. Despite the complex nature of the questions that this brain trust is trying to answer, the infrastructure challenges they face are similar to those of organizations in telecom, finance or retail.

“It’s pretty interesting how even though it’s a wild use case and with all of the crazy science they do, they face the same challenges like shared file systems, scaling and OpenStack upgrades,” said Mohammed Naser, CEO of VEXXHOST, who presented on OpenStack vGPU support.

“We have a lot of work to do to move [research] forward, but working together will make it easier,” Bell said.

Maybe you’ll even start to find answers while walking through the hallways of the world’s largest particle physics laboratory.

Cover photo: © 2019 CERN