“We started using OpenStack because we saw the value of segregating hardware resources and the opportunity to isolate and simplify the infrastructure configuration,” Manuel Sopena Ballesteros says.

image

SYDNEY — The Garvan Institute of Medical Research adopted OpenStack to better serve its users, data scientists and researchers. OpenStack and scientists have always gone together, from analyzing far away galaxies or exploring the very basic pieces of time and space.

At Garvan, the science is genomics, a multi-disciplinary branch of molecular biology.  Genomics also involves the sequencing and analysis of genomes — and that’s where OpenStack comes into play.

“We started using OpenStack because we saw the value of segregating hardware resources and the opportunity to isolate and simplify the infrastructure configuration to fit the needs of each individual project,” Manuel Sopena Ballesteros, a big data engineer at the Institute, says.

OpenStack is helping Garvan, one of Australia’s largest medical research institutions and in the top five worldwide for genome research, provide innovative solutions for analyzing genomic data to both its genome sequencing center and its 80+ data scientists. One of the most effective slides shown by Sopena Ballesteros at the Summit charted the cost of genome sequencing and Moore’s law.

The cost of genome sequencing plummeted from $100 in 2001 to around $1,000 in 2016. The result: a huge amount of data is  now generated at a much lower cost—but the cost of processing power is not decreasing as quickly.

The Garvan Institute has been using an HPC cluster for its genomics applications. However, this legacy infrastructure is not ready to keep up with the increasingly large data sets. The HPC system, while providing high performance and hardware efficiency, is unfortunately fairly rigid: not all the tools can be adapted to the HPC environment. The Data Intensive Computer Engineering (DICE) group at Garvan implemented OpenStack using Kolla-Ansible to deploy hyper-converged environments based on Docker containers. The new cloud-based environment reduced the time and the number of servers needed for computing and storage. OpenStack allowed decoupling hardware from software to get granular resource allocation, use bare metal, VM, containers, shared storage and commodity hardware. Ultimately, the group hopes to reduce cost and eventually migrate the HPC system to OpenStack, too.

You can check out the entire 35-minute presentation below.

 

Cover Photo // CC BY NC

Superuser