CycleCloud 6 feature: MPI optimizations

This post is one of several in a series describing features introduced in CycleCloud 6, which we released on November 8.

Batch workloads have long been a natural fit for cloud environments. Tightly coupled workloads such as MPI jobs are a harder fit, because they are sensitive to bandwidth, latency, and abruptly terminated instances. MPI workloads can certainly run in the cloud, but they need guardrails. CycleCloud 6 adds several new features that make the cloud even better for MPI jobs.

MPI jobs can’t make use of a subset of the cores they ask for; allocation has to be all-or-nothing. CycleCloud now considers the minimum core count necessary for the job and sets the minimum request size accordingly. In other words, if the provider cannot fulfill the entire request, it won’t provision any nodes. Similarly, CycleCloud 6 adds support for Amazon’s Launch Group feature, which provides all-or-nothing allocation for spot instances. This opens the spot market to MPI jobs, which can mean significant per-hour savings.
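As a rough illustration of the all-or-nothing rule, the sketch below works out how many whole nodes a job needs and then returns either that full count or zero. The function names and the idea of passing the available node count in directly are assumptions made for this example; it is not CycleCloud's actual implementation.

```python
def nodes_required(job_cores: int, cores_per_node: int) -> int:
    """Smallest number of whole nodes that covers the job's core count."""
    if job_cores <= 0 or cores_per_node <= 0:
        raise ValueError("core counts must be positive")
    # Integer ceiling division: round up to a whole node.
    return (job_cores + cores_per_node - 1) // cores_per_node


def nodes_to_provision(job_cores: int, cores_per_node: int, available_nodes: int) -> int:
    """All-or-nothing: return the full node count, or 0 if it cannot be met.

    `available_nodes` is a hypothetical input standing in for whatever
    capacity the provider can actually deliver.
    """
    needed = nodes_required(job_cores, cores_per_node)
    return needed if available_nodes >= needed else 0
```

For example, a 64-core job on 16-core nodes needs four nodes. If only three can be delivered, `nodes_to_provision(64, 16, 3)` returns 0 rather than a partial allocation the job could not use.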

To address the latency concern, CycleCloud now dynamically creates AWS Placement Groups for MPI jobs. A placement group keeps a job's instances close together within a single Availability Zone, minimizing network latency between them.
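To make the idea concrete, here is a minimal sketch of what a per-job placement group request might look like. It only builds the parameter dictionaries and makes no AWS call. The keys follow the EC2 CreatePlacementGroup and RunInstances request fields as exposed by boto3; the `mpi-<job id>` naming scheme, the function names, and the image ID in the usage note are hypothetical, and none of this is taken from CycleCloud itself.

```python
def placement_group_params(job_id: str) -> dict:
    """Parameters for creating one cluster placement group per MPI job."""
    return {
        "GroupName": f"mpi-{job_id}",  # hypothetical naming scheme
        "Strategy": "cluster",         # pack instances close together
    }


def launch_params(job_id: str, node_count: int, instance_type: str, image_id: str) -> dict:
    """Parameters for launching every node of the job into its placement group."""
    group = placement_group_params(job_id)
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        # Equal MinCount and MaxCount asks EC2 for all nodes or none.
        "MinCount": node_count,
        "MaxCount": node_count,
        "Placement": {"GroupName": group["GroupName"]},
    }
```

With boto3, these dictionaries could be passed as `ec2.create_placement_group(**placement_group_params("42"))` followed by `ec2.run_instances(**launch_params("42", 4, "c4.8xlarge", "ami-EXAMPLE"))`, where `ami-EXAMPLE` is a placeholder rather than a real image.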

At SC16? Stop by booth #3621 for a demo!
