Despite significant improvements over the years, the same criticisms still color people’s opinion of using cloud environments for high performance computing (HPC). One of the most common things to hear when talking about using Amazon’s Elastic Compute Cloud (EC2) for HPC is “Sure, Amazon will work fine for pleasantly parallel workloads, but it won’t work for MPI (Message Passing Interface) applications.” While that statement is true for very large MPI workloads, we have seen comparable performance up to 256 cores for most workloads, and even up to 1,024 cores for certain workloads that aren’t as tightly coupled. Achieving that performance just requires careful selection of MPI versions and EC2 compute nodes, along with a little network tuning.
Note: While it is possible to run MPI applications in Windows on EC2, these recommendations focus on Linux.
The most important factor in running an MPI workload in EC2 is using an instance type that supports Enhanced Networking (SR-IOV). With a traditional virtualized network interface, the hypervisor has to route packets to specific guest VMs and copy those packets into each VM’s memory before the guest can process the data. SR-IOV reduces network latency to the guest OS by making the physical NIC directly available to the VM, essentially circumventing the hypervisor.
Fortunately, all of Amazon’s compute-optimized C3 and C4 instance types support SR-IOV as long as they’re launched in a Virtual Private Cloud (VPC). For specific instructions on enabling SR-IOV on Linux instances, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html
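Once an instance is up, it’s worth sanity-checking that enhanced networking is actually active. A quick way to do that (assuming the interface is eth0; the instance ID below is a placeholder) is:

```shell
# Check which driver backs eth0. With enhanced networking enabled on
# C3/C4 instances, this should report the Intel VF driver "ixgbevf"
# rather than a paravirtual driver.
ethtool -i eth0

# Alternatively, query the attribute through the AWS CLI
# (replace the instance ID with your own).
aws ec2 describe-instance-attribute \
    --instance-id i-0123456789abcdef0 \
    --attribute sriovNetSupport
```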
Use of Placement Groups
Another important factor in running MPI workloads on EC2 is the use of placement groups. When instances are launched into a common placement group, they’re placed on the same low-latency, 10 Gbps network to improve bandwidth and latency between instances within that group.
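As a sketch of how that looks with the AWS CLI (the group name, AMI, subnet, and key name here are all illustrative placeholders), you create the placement group once and then launch your compute nodes into it:

```shell
# Create a placement group using the "cluster" strategy, which packs
# instances close together on the network.
aws ec2 create-placement-group \
    --group-name mpi-cluster \
    --strategy cluster

# Launch C4 instances into that placement group, inside a VPC subnet
# so that enhanced networking is available.
aws ec2 run-instances \
    --image-id ami-12345678 \
    --instance-type c4.8xlarge \
    --count 8 \
    --placement GroupName=mpi-cluster \
    --subnet-id subnet-12345678 \
    --key-name my-key
```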
There are a wide variety of MPI libraries available, and each is best suited to particular combinations of compute hardware, operating system, and network interconnect, and sometimes even to specific applications. Between MPICH2, MVAPICH, OpenMPI, Intel MPI, MS-MPI, LAM/MPI, and others, it’s sometimes hard to choose the appropriate library to use.
After extensive testing, we recommend two MPI libraries for use on EC2: Intel MPI for those who already have a license and OpenMPI for those looking for a free alternative. In general, these two implementations work best on EC2, but you may find your specific application runs better with another library.
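As a minimal sketch of running a job under OpenMPI on a pair of EC2 instances (the host IPs and application name are placeholders), you can pin OpenMPI to its TCP transport, which is what’s available on EC2 without InfiniBand:

```shell
# hosts: one line per instance, with the number of MPI slots (ranks)
# each instance provides.
cat > hosts <<EOF
10.0.0.10 slots=16
10.0.0.11 slots=16
EOF

# Run 32 ranks across the two instances. "--mca btl tcp,self" restricts
# OpenMPI to the TCP byte-transfer layer plus loopback.
mpirun -np 32 --hostfile hosts --mca btl tcp,self ./my_mpi_app
```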
The final piece of advice for running MPI applications in AWS is to ensure your network interfaces and TCP parameters are taking advantage of the higher bandwidth available with enhanced networking. As a good starting point, we typically run the following short script to increase the packet size (MTU) and the TCP buffer memory available to the network stack:
#!/bin/bash
# Enable 9000 MTU
/sbin/ifconfig eth0 mtu 9000

# IBM MPI TCP tuning parameters
# http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5086630
/sbin/sysctl -q -w net.ipv4.tcp_timestamps=0
/sbin/sysctl -q -w net.ipv4.tcp_sack=0
/sbin/sysctl -q -w net.core.netdev_max_backlog=250000
/sbin/sysctl -q -w net.core.rmem_max=16777216
/sbin/sysctl -q -w net.core.wmem_max=16777216
/sbin/sysctl -q -w net.core.rmem_default=16777216
/sbin/sysctl -q -w net.core.wmem_default=16777216
/sbin/sysctl -q -w net.core.optmem_max=16777216
/sbin/sysctl -q -w net.ipv4.tcp_mem="16777216 16777216 16777216"
/sbin/sysctl -q -w net.ipv4.tcp_rmem="4096 87380 16777216"
/sbin/sysctl -q -w net.ipv4.tcp_wmem="4096 65536 16777216"
These are just a few steps you can take to improve the performance of your MPI applications in EC2. While the cloud may not be suited to all parallel codes, the cost savings can be worth the effort of testing and benchmarking. Not only have we proven that MPI can be run successfully in the cloud, but also that, with some forethought and the right tools, it can achieve the performance many of our customers need.