CycleCloud Achieves Ludicrous Speed! (Utility Supercomputing with 50,000 Cores)

Update: Since publishing this blog entry, our 50,000-core CycleCloud utility supercomputer has gotten great coverage from BusinessWeek, The Register, the NY Times, the Wall Street Journal's CIO Report, Ars Technica, The Verge, and many others. And it would now run for $750/hr at AWS spot pricing as of 6/22/2012! Click here to contact us for more information…

By now, we've shown that our software is capable of spinning up cloud computing environments that run at massive scale and produce real scientific results. After some of our previous efforts, we realized we were onto something with the CycleCloud Cloud HPC and Utility Supercomputing concept. However, even we underestimated the scales researchers would want to use and the scope of the research this would impact.

Among the requests were some from a leader in computational chemistry research, Schrodinger. In collaboration with Nimbus Discovery, they needed to virtually screen 21 million molecule conformations, more than ever before, against one possible cancer target using their leading docking application, Glide. And they wanted to do it using a higher-accuracy mode early in the process, which hadn't been possible before because it is so compute-intensive! This is exactly what we did with our latest 50,000-core utility supercomputer, code-named Naga, which CycleCloud provisioned on Amazon Web Services. And Schrodinger/Nimbus got useful results they wouldn't have seen without utility supercomputing. We will describe how we accomplished this below and in future articles and blog posts.

From a scale perspective, the most revolutionary concept implemented for Naga was scaling out all the components of an HPC environment. In our previous megaclusters, we performed a great deal of optimization...

Mad Scientist could win CycleCloud BigScience Challenge…

Just kidding, he's just a potential finalist! 😉 As some of you may know, Cycle wants to help scientists answer big research questions that might help humanity by donating compute time using our utility supercomputing software. But amid the overwhelming response to the CycleCloud BigScience Challenge we announced last week, we keep getting the question, "What kind of research benefits humanity?" And the answer isn't Dr. Evil researching "sharks with frickin' laser beams"!

Let's highlight a couple of the entries already received that might move us forward: There is the researcher doing quantum mechanics simulations for materials science to improve solar panel efficiency, which might help "electrify 2.5 Billion people" with greener energy. Or the computational biologist who wants to use meta-genomics analysis to create a knowledgebase indexing system for stem cells and their derivatives, helping us "speed development of personalized cell-based therapies". Very exciting!

Maybe you analyze public government data to provide clarity. Or you research science that might help in the race to treat Alzheimer's, cancer, and diabetes. Or you're simulating ways to more efficiently distribute food in places that need it. There are plenty of utility supercomputing applications ahead of us that could benefit humanity, and now is your chance to start.

Remember, entries are due November 7th. So come join us. There are just four questions between you and the equivalent of 8 hours on a 30,000-core cluster. So submit early, submit often, and let's change the speed at which BigScience gets done!

Jason Stowe
CEO, Cycle...

Fast and Cheap, pick two: Real data for Multi-threaded S3 Transfers

Gentlemen, start your uploads! They're free now, but how fast can we do them? Lately we've been working with clients solving big scientific problems with Big Data (Next Generation Sequencing analysis is one example), so we've been working hard to transfer large files into and out of the cloud as efficiently as possible. We're optimizing two costs here: money and time.

Lucky for us, Amazon Web Services continues to drive down the costs of data transfer. We were excited to see that all data transfer into AWS will be free as of July 1st! They're also reducing the cost to transfer data out of AWS. Less money, more science, yes!

We still need to optimize for time, however. The scalability of the Elastic Compute Cloud (EC2) means we can throw as many cores at a scientific problem as we can afford in a very short time. But what if our input or result data is so large that the time to transfer it far outweighs the time to analyze it? Our previous work has shown that file transfers often do not fill the pipe to capacity, and are often limited by disk I/O and other factors. Therefore, we can speed up transfers by using multiple threads to fill the pipe.

As shown above, this work involved moving data directly to a file system using rsync. But since that time, we've begun to rely upon the Simple Storage Service (S3) as both a staging area and long-term storage solution for input and result data. S3's availability and scalability are far superior to even striped Elastic Block Store volumes running on...
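To make the multi-threaded idea concrete, here is a minimal sketch of a concurrent, multipart S3 upload. It uses boto3, which postdates this post, and the file, bucket, and key names are placeholders; this is an illustration of the approach, not the transfer tooling described in the post.

```python
# Sketch only: boto3 postdates this post; names below are hypothetical.
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Split large objects into parts and upload several parts concurrently,
# so a stall on one thread (disk or network) doesn't leave the pipe idle.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # use multipart for files > 64 MB
    multipart_chunksize=64 * 1024 * 1024,   # 64 MB parts
    max_concurrency=16,                     # 16 upload threads
    use_threads=True,
)

s3.upload_file(
    Filename="sample_reads.fastq.gz",        # hypothetical sequencing input
    Bucket="example-genomics-staging",       # hypothetical bucket
    Key="runs/2011-06/sample_reads.fastq.gz",
    Config=config,
)
```

Tuning the part size and thread count against the instance's disk throughput is where most of the gains come from; the numbers above are just reasonable starting points.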

Single click starts a 10,000-core CycleCloud cluster for $1060/hr

Update: This cluster received great coverage, including Amazon CTO Werner Vogels' kind tweet, customer commentary on this Life Science cloud HPC project, & results from our EC2 HPC Cluster.

Meet our latest CycleCloud cluster type, Tanuki. Created with the push of a button, he weighs in at a hefty 10,000 cores. Yes, you read that right: 10,000 cores. Tanuki approximates #114 on the last 2010 Top 500 supercomputer list in size, and cost $1060/hr to operate, including all AWS and CycleCloud charges, with no upfront costs. Yes, you read that right: 10,000 cores for $1060/hr. Here are some statistics on the cluster:

Scientific Need = 80,000 compute hours
Cluster Scale = 10,000 cores, 1,250 servers
Run-time = 8 hours
User effort to start = Push a button
Provisioning Time = First 2,000 cores in 15 minutes, all cores in 45 minutes
Upfront investment = $0
Total Cost (IaaS & CycleCloud) = $1060/hr

This historic supercomputer, built completely in the cloud, drew its first breath minutes after the push of a button. Tanuki started operations through a completely automated launch using our CycleCloud℠ service. It ran for 8 hours before the job workflow ended and the cluster was shut down. The 8-hour run-time across 10,000 cores yielded a treasure trove of scientific results for one of our large life science clients. The ability to run a cluster of this size for $1060/hr, including AWS and CycleCloud charges, is mind-boggling, even to those of us who have been in the cloud HPC business for a while. When Tanuki was first mentioned within Cycle, its scale was thrown out partly as a...
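As a quick back-of-the-envelope check, the effective rates follow directly from the figures quoted above. This Python sketch uses only those numbers; the derived per-core-hour and per-server-hour rates are our arithmetic, not figures quoted in the post.

```python
# Sanity check using only the cluster statistics listed above.
cores = 10_000
servers = 1_250
hours = 8
cost_per_hour = 1_060                        # USD/hr, AWS + CycleCloud combined

total_cost = cost_per_hour * hours           # $8,480 for the whole run
core_hours = cores * hours                   # 80,000 compute hours delivered

print(f"total run cost:       ${total_cost:,}")
print(f"cost per core-hour:   ${total_cost / core_hours:.3f}")    # ~$0.106
print(f"cost per server-hour: ${cost_per_hour / servers:.2f}")    # ~$0.85
```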

Lessons learned building a 4096-core Cloud HPC Supercomputer for $418/hr

The Challenge: A 4096-core Cluster

Back in December 2010, we discussed running a 2048-core cluster using CycleCloud, which was in effect renting a circa-2005 Top 20 supercomputer for two hours. After that run, we were given a use case from a client that required us to push the boundary even further with CycleCloud. The challenge at hand was running a large workflow on a 4096-core cluster; could our software start a cluster of that size and resolve the issues in getting it up and running? Cycle engineers accepted the challenge and built a new cluster we'll call "Oni".

The mission of CycleCloud is to make running large computational clusters in the cloud as easy as possible. There is a lot of work that must happen behind the scenes to provision clusters both at this scale and on-demand. What kinds of issues did we run into as we prepared to scale out the CycleCloud service from building a 2048-core cluster up to a whopping 4096-core Oni cluster? This post covers these questions:

Can we get 4096 cores from EC2 reliably?
Can the configuration management software keep up?
Can the scheduler scale?
How much does a 4096-core cluster cost on CycleCloud?

Question 1: Can We Get 4096 Cores from EC2 Reliably?

We needed 512 c1.xlarge instances (each with 8 virtual cores) in EC2's us-east region for this workload. This is a lot of instances! First, we requested that our client's EC2 instance limit be increased. This is a manual process, but Cycle Computing has a great relationship with AWS and we secured the limit increase without issue. However, an increased instance...
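For illustration only, here is a minimal sketch of the kind of batched request-and-retry loop that question implies: ask EC2 for instances in manageable chunks and back off when a limit or capacity error comes back. It assumes boto3 (which postdates this post), a placeholder AMI ID, and an arbitrary batch size; it is not CycleCloud's actual provisioning code.

```python
# Sketch only: boto3 postdates this post; AMI ID and batch size are placeholders.
import time
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-east-1")

TOTAL_INSTANCES = 512      # 512 x c1.xlarge (8 virtual cores each) = 4096 cores
BATCH_SIZE = 64            # smaller requests surface capacity errors early

def launch_batch(count):
    """Request up to `count` instances; EC2 may grant fewer (MinCount=1)."""
    resp = ec2.run_instances(
        ImageId="ami-xxxxxxxx",     # placeholder AMI
        InstanceType="c1.xlarge",
        MinCount=1,
        MaxCount=count,
    )
    return [i["InstanceId"] for i in resp["Instances"]]

launched = []
while len(launched) < TOTAL_INSTANCES:
    want = min(BATCH_SIZE, TOTAL_INSTANCES - len(launched))
    try:
        launched.extend(launch_batch(want))
    except ClientError as err:
        # e.g. InstanceLimitExceeded or InsufficientInstanceCapacity: wait and retry
        print(f"EC2 refused the request ({err.response['Error']['Code']}); retrying")
        time.sleep(30)

print(f"Launched {len(launched)} instances")
```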

HowTo: Save a $million on HPC for a Fortune100 Bank

In any large, modern organization there exists a considerable deployment of desktop-based compute power. Those bland, beige boxes used to piece together slide presentations, surf the web, and send out reminders about cake in the lunch room are turned on at 8am and off at 5pm, left to collect dust after hours. Especially with modern virtual desktop initiatives (VDI), thin clients running Linux are left unused, despite the value they hold from a compute perspective.

Fortune 100 Bank Harvesting Cycles

Today we want to explain how big financial services companies use desktops of any type to perform high-throughput pricing and risk calculations. The example we want to draw on is from a Fortune 100 company, let's call them ExampleBank, that runs a constant stream of moderate-data, heavy-CPU computations on their dedicated grid. As an alternative to dedicated server resources, running jobs on desktops was estimated to save them millions in server equipment, power, other operating costs, and London/UK data center space, thanks to open source software that has no license costs associated with it! Cycle engineers worked with their desktop management IT team to deploy Condor on thousands of their desktops, all managed by our CycleServer product. Once deployed, Condor falls under the control of CycleServer, and job execution policies are crafted to allow latent desktop cycles to be used for quantitative finance jobs.

Configuring Condor

Condor is a highly flexible job execution engine that can fit very comfortably into a desktop compute environment, offering up spare cycles to grid jobs when the desktop machine is not being used for its primary role. Our...
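To give a flavor of what such a policy looks like, here is a minimal sketch of a desktop cycle-harvesting policy written with standard Condor configuration expressions. The thresholds are illustrative assumptions, not ExampleBank's actual settings.

```
# Sketch of a Condor desktop-harvesting policy (thresholds are illustrative).
MINUTE = 60

# Only start grid jobs after 15 minutes of keyboard idleness and low machine load
START    = (KeyboardIdle > 15 * $(MINUTE)) && (LoadAvg < 0.3)

# Suspend the job the moment the user returns to the keyboard
SUSPEND  = (KeyboardIdle < $(MINUTE))

# Resume once the desktop has been idle again for 5 minutes
CONTINUE = (KeyboardIdle > 5 * $(MINUTE)) && (LoadAvg < 0.3)

# Evict jobs that have been suspended for more than 10 minutes
PREEMPT  = (Activity == "Suspended") && \
           ((CurrentTime - EnteredCurrentActivity) > 10 * $(MINUTE))
```

Policies like this are what keep cycle harvesting invisible to the desktop's primary user: the grid only ever gets the machine when nobody is sitting at it.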