At CERN, storage is the key to the universe

It takes a lot of space to unravel the mysteries of the universe. Just ask the operations team at the European particle physics laboratory CERN, who face the weighty task of storing the data produced at the Large Hadron Collider (LHC).

After a two-year hiatus, scientists recently fired up the 17-mile circular collision course that lies under a bucolic patch of the French-Swiss border near Geneva. It’s shaping up to be a great run: they’ve already broken a speed record. They’ll be investigating such lofty subjects as the early universe, dark matter and the Higgs boson or “God particle,” but there’s a reason the real action at CERN was dubbed “turning on the data tap.”

New event display from yesterday's #13TeV collisions with stable beams. For more information: https://t.co/o3aft94xxX pic.twitter.com/Wmm7CM5sGv

— ATLAS Experiment (@ATLASexperiment) June 4, 2015

The data storage required to keep pace with the world’s top physicists puts the “ginormous” in big data. For starters, the data gets crunched before protons ever whirl around the circuit. Researchers at four particle detectors (ATLAS, CMS, LHCb, ALICE) first simulate what should happen according to the standard model of physics, then fire up the collider, observe the smashups and compare results from both.

“If it’s different, that means there’s some new physics there, some new particle or something that we didn’t understand,” said Dan van der Ster of the CERN IT data and storage group. “Getting significance in these two steps requires a lot of data. In the simulation step there are CPUs churning out petabytes of Monte Carlo data, but then the real data is also on a petabyte scale.”

Yesterday, a team at #CERN built, for the first time, a new, medium-sized design of the ATLAS detector in #LEGO! pic.twitter.com/jvQylAVo8l

— ATLAS Experiment (@ATLASexperiment) June 18, 2015

At the OpenStack Summit Vancouver, he gave a rundown of how CERN is using Ceph, the distributed object store and file system. The talk covered Linux tuning tips, thread-caching malloc latency issues, VM boot failures and a mysterious router failure that, fortunately, led to no reported data corruption and no data scrub inconsistencies. (Whew!) A video of his 40-minute talk is available on YouTube.

First, the creation story: CERN has used OpenStack in production since summer 2013, and last year it moved all IT infrastructure into virtual machines on OpenStack. Currently, all the IT core services are on OpenStack and Ceph, and most of the research services are, too. (One exception: CERN’s big batch farms are still not virtual.) The storage engineer also provided some mind-expanding numbers: they’ve currently got close to 5,000 hypervisors, 11,000 instances, 1,800 tenants and roughly that number of users.

After evaluating Ceph in 2013, the research center deployed a 3-petabyte cluster for Cinder/Glance. CERN has shared some of its OpenStack set-up in other talks: “Deep dive into CERN cloud infrastructure,” “Accelerating science with Puppet” and “Running Compute and Data at Scale.”

“We picked Ceph because it had a good design on paper, it looked like the best option for building block storage for OpenStack,” he said. “We called Ceph our organic storage platform, because you could add remote servers with no downtime, ever.” They ran a 150-terabyte test that passed with flying colors, so they went ahead and deployed it. They initially deployed it with Puppet using eNovance’s Ceph module, but today they use the Ceph disk deployment tool, so it’s “kind of customized,” he added.

Upgraded a @ceph cluster running #CephFS with 80TB real data to Hammer without downtime! 720 OSDs, 3 MONs and 2 MDS (Active/Standby).

— Wido (@widodh) April 30, 2015

Everything at CERN is super-sized. “Our requirements are to grow by 20 to 30 petabytes per year,” he said. One of the challenges they faced is that because it’s a research lab, everything’s effectively “free” for the users. That means that scientists are constantly pushing for more input/output operations per second (IOPS). “We had no objective way to decide if the user gets more IOPS,” he said, so users now have to prove they have a performance problem before that capacity gets cranked up. To push boundaries even further, van der Ster borrowed 150 servers of 192 terabytes each for a couple of weeks, and it worked, “with some tuning,” he said.
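
For a sense of scale, the figures quoted above (150 borrowed servers at 192 terabytes each, and growth of 20 to 30 petabytes per year) can be turned into a quick back-of-the-envelope calculation. This is only an illustrative sketch, not anything shown in the talk: the function names are made up for this example, it assumes decimal units (1 petabyte = 1,000 terabytes), and it counts raw capacity only, ignoring any replication overhead.

```python
# Back-of-the-envelope capacity arithmetic from the figures quoted in the article.
# Assumptions (not from the talk): decimal units, raw capacity, no replication.

TB_PER_PB = 1000  # assumed decimal conversion


def raw_capacity_pb(servers: int, tb_per_server: float) -> float:
    """Total raw capacity in petabytes for a set of identical servers."""
    return servers * tb_per_server / TB_PER_PB


def years_of_growth(capacity_pb: float, growth_pb_per_year: float) -> float:
    """How many years of growth a given capacity would absorb."""
    return capacity_pb / growth_pb_per_year


if __name__ == "__main__":
    test_cluster_pb = raw_capacity_pb(150, 192)
    print(f"Borrowed test cluster: {test_cluster_pb:.1f} PB raw")
    for growth in (20, 30):
        years = years_of_growth(test_cluster_pb, growth)
        print(f"At {growth} PB/year, that covers about {years:.2f} years of growth")
```

By this rough measure, the borrowed hardware held about as much raw capacity as CERN expects to add in a single year.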

“For less than 10-petabyte clusters, Ceph just works, you don’t have to do much,” van der Ster said.

You can keep up with the latest on OpenStack at CERN through this blog: http://openstack-in-production.blogspot.com/

Cover Photo by Marceline Smith // CC BY NC