Life Sciences

Increasingly, computational horsepower is a key component of any life science research project. Genomic sequencing, computational chemistry, molecular dynamics, clinical trial analysis, and many other disciplines need to simulate experiments, process data, and analyze results quickly and easily. Getting access to the right kind of resources at the right time and at the right cost can be a huge challenge.

Cycle Computing software combined with cloud-based compute and storage gives researchers and analysts the power needed – simply, efficiently, and effectively. Cycle Computing’s CycleCloud software suite is the leading cloud orchestration, provisioning, and data management platform for life science computing applications running on any cloud or internal environment.

Life science projects often require multiple applications and multiple data sets functioning together as one or more complete workflows. Running workflows in the cloud without tools can quickly become overwhelming. Starting and stopping applications and clusters, placing data, ensuring security, managing budgets, leveraging market pricing, and more involve many details and manual coordination. The CycleCloud tool suite handles these tasks and more for users, system administrators, and management, while delivering the maximum value out of any cloud workflow.

Broad Institute Example:

The Broad Institute’s Cancer Program has data sets that include hundreds of cancer cell lines, information on the genetic mutations present in each cell line, gene expression data showing which genes are more or less active under various conditions, and information about how various small molecules interact with the cell lines at both large and small scales.

[Figure: Broad overview slide]

One of the Cancer Program’s goals is to intelligently direct future research using these data sets. This particular workload used machine learning techniques to infer relationships among and between these cell line and gene/expression data sets.

These machine learning algorithms require a lot of compute power. Building this map for only several hundred samples on a single CPU would have required decades of computing. Even with the extensive but finite resources at the Institute, the computational effort was daunting enough that researchers found themselves holding back from running certain calculations, since prioritizing and scheduling such an effort would have required coordination across many groups.

Using CycleCloud software and Google Cloud Platform (GCP), the team automated the creation of the cluster environment and submitted the workloads to an autoscaling 51,200-core cluster. CycleCloud executed 3 decades' worth of cancer research computation in an afternoon on a petascale computer, for less than the cost of a single server, instead of months on local computers. The cluster ran the 340,891 jobs using Ubuntu images, a shared file system, and the Univa Grid Engine scheduler. Learn more about this effort.
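As a rough sanity check on the figures above, the short Python sketch below works through the arithmetic. It rests on assumptions that are not stated in the original: that "3 decades" means about 30 years of single-core compute time, that the work scales perfectly across all 51,200 cores, and that startup, scheduling, and data transfer overhead are negligible. The function names are illustrative only and are not part of CycleCloud.

```python
# Back-of-the-envelope check of the Broad Institute run.
# Assumptions (not from the original text): "3 decades" is ~30 years of
# single-core time, and the workload scales perfectly across all cores.

HOURS_PER_YEAR = 365.25 * 24  # average year length, including leap years


def ideal_wall_clock_hours(core_years: float, cores: int) -> float:
    """Wall-clock hours if the work is spread evenly over `cores` cores."""
    return core_years * HOURS_PER_YEAR / cores


def mean_core_minutes_per_job(core_years: float, jobs: int) -> float:
    """Average single-core minutes each job would consume."""
    return core_years * HOURS_PER_YEAR * 60 / jobs


CORE_YEARS = 30    # assumed reading of "3 decades"
CORES = 51_200     # cluster size quoted in the text
JOBS = 340_891     # job count quoted in the text

wall_hours = ideal_wall_clock_hours(CORE_YEARS, CORES)
job_minutes = mean_core_minutes_per_job(CORE_YEARS, JOBS)

print(f"Ideal wall-clock time: {wall_hours:.1f} hours")
print(f"Mean core time per job: {job_minutes:.0f} minutes")
```

Under these idealized assumptions the run comes out to roughly five hours of wall-clock time, which is consistent with the "afternoon" described above, and each job averages a bit over three quarters of an hour on one core. A real run would take somewhat longer because of cluster ramp-up and uneven job lengths.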

Beyond the above, Cycle Computing has helped life science users leverage cloud resources effectively across a wide range of disciplines including:

  • Computational chemistry
  • Quantum chemistry
  • Bioinformatics
  • Genomics
  • Proteomics
  • Molecular dynamics
  • Clinical trial simulations
  • PK/PD (NONMEM)
  • Data conversion
  • RNA sequencing
  • Genome-wide association studies (GWAS)
  • Machine learning

Learn more about Cycle Computing’s CycleCloud software suite
