How do you download files from cloud storage in your custom cookbooks?

CycleCloud™’s Cluster-Init technology is a great way to copy static files onto a cloud instance. Upload your files with the CycleCloud project feature and the automation will copy them down when your instance boots. But what happens if you need some flexibility? Perhaps you want to make the download conditional on a particular node attribute or state. Or maybe you want to use different versions of a package without having to maintain separate versions of your project. Using a custom Chef cookbook is a great way to handle it. But how do you download the file?

Our Chef cookbooks include a lightweight resource provider (LWRP) called “Thunderball” that handles this. With Thunderball, you can use your credentials to access a cloud storage endpoint that has your packages. Thunderball provides reliability by ensuring the file is fully downloaded before completing.

To use Thunderball in your cookbook, you first set up the configuration. This example reads Amazon S3 credentials from a data bag named “s3_creds”.

    # Include Cycle Computing's Thunderball recipe
    include_recipe "thunderball"

    # Read credentials from the data bag
    creds = data_bag_item('s3_creds', 'creds')
    # AWS Access Key
    access_key = creds['access_key']
    # AWS Secret Key
    secret_key = creds['secret_key']
    # S3 bucket for packages
    package_bucket = creds['package_bucket']
    node.default[:package_bucket] = package_bucket

    # Create a Thunderball config named "my_thunderball"
    thunderball_config "my_thunderball" do
      base "s3://#{package_bucket}"
      username access_key
      password secret_key
    end

Now that you have created the configuration, you can use it to download packages elsewhere in your recipe. For example, if you have a Debian package of a custom Python build you want to install:

    # Set the custom python version from the node attribute example.python.version
    python_version...
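The idea of picking a package version from a node attribute, and downloading only when a node asks for it, can be sketched in plain Ruby without touching Chef or Thunderball. This is a minimal illustration under stated assumptions, not the cookbook's actual code: the helper names, the "install" flag, the default version, and the package file name pattern are all hypothetical, and a plain Hash stands in for Chef's node attributes.

```ruby
# Hypothetical helpers for illustration only; none of these names come
# from Thunderball or the CycleCloud cookbooks.

# Assumed fallback version when the node attribute is not set.
DEFAULT_PYTHON_VERSION = "2.7.13"

# Build the (assumed) Debian package file name for a given version.
def python_package_name(version)
  "custom-python_#{version}_amd64.deb"
end

# Decide what to download from a plain Hash standing in for node attributes.
# Reads example.python.version (named in the post) and an assumed
# example.python.install flag. Returns nil when the node has not opted in.
def python_package_url(attrs, bucket)
  python = attrs.fetch("example", {}).fetch("python", {})
  return nil unless python.fetch("install", false)

  version = python.fetch("version", DEFAULT_PYTHON_VERSION)
  "s3://#{bucket}/#{python_package_name(version)}"
end
```

In a recipe, a helper like this might be used to skip the download step entirely when it returns nil, so one project can serve nodes that want different versions or no custom build at all.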

Cloud HPC won’t steal your job

The cloud is not coming to steal your job or the jobs of your team. But then again, what is your job? Here’s a hint: it’s furthering a particular business goal, not performing a specific and unchanging set of tasks. In our experience working with HPC users and admins from a variety of industries, there’s always more work to be done. There are new ways to compute, storage challenges, workflow challenges, and scaling challenges. While our CycleCloud™ software makes simple, managed access to cloud HPC easier to provide, we’ve never seen anyone’s position get eliminated. Instead, staff are freed from mundane infrastructure labor and management tasks to do organization-specific work that adds value. The first question people ask when a new initiative comes along is “how will this affect my job?” so there’s always some trepidation about broaching the topic. But avoiding it does our customers a disservice. Reducing the effort necessary to provide a cloud HPC environment to users is important for two reasons. The first is that the market for HPC experience generally, and cloud HPC experience specifically, is tight. It can be hard to find and retain talented employees. We see two types of situations. In the first case, the staff tasked with running the cloud HPC environment do not have HPC experience. They may be talented sysadmins or user support staff who have been thrown to the wolves, so to speak. They need help getting started, managing the differences (and similarities) between cloud and in-house environments, understanding what to monitor, knowing how to help users route jobs to the best...

Cycle Computing at Bio-IT World

We’re excited to be back at Bio-IT World & Expo this week. Before the show opens, I wanted to share some details about how you can find us. First, you can win an awesome prize! Tweet a selfie from our booth or with other signage around the conference and include @cyclecomputing and #bioit17 to be entered into the drawing. We’ll have the prize on hand in booth #361. While you’re in our booth, you can also get your picture taken in our photo booth. We’ll have several fun props. And of course, we’re there to do work, too. We just released the latest version of our CycleCloud™ software suite for providing simple, managed access to big compute and cloud HPC. As we wrote earlier this month, this release includes improved monitoring for GPU-powered instances. GPUs provide a great boost to many scientific workloads. Benchmarking done at the University of Illinois showed a many-fold increase in NAMD performance on GPUs. Stop by booth #361 to learn how CycleCloud makes it easy to get the GPU or CPU resources you want with the control you need. Jason Stowe, our CEO, will be delivering the keynote introduction at 8 am on Wednesday and presenting “How cloud has changed life sciences” in the Cloud Computing track at noon on...
Cloud-Agnostic Glossary

No two cloud service providers are the same. This applies not only to the services they provide, but also to what they call those services. At Cycle Computing, we spend a lot of time working with multiple cloud service providers; being able to abstract away small differences between providers is one of the compelling features of CycleCloud. Over the years, I’ve kept notes for translating concepts across the providers. I’d always assumed someone had put together a more thorough Rosetta Stone, but when I went looking for one recently, I couldn’t find it. Some websites compare two providers, but nobody has put the three major providers on the same page. I figured that if I was looking, others were looking, too. I decided to take my notes and fill them out with some additional details. The result is our new Cloud-Agnostic Glossary. Of course, there’s only so much that can fit on two pages. I intentionally left out a lot of features because including every possible feature would fill a book. This glossary focuses on the features that are most relevant to provisioning big compute infrastructure, and each of these services has far more detail than a glossary can hold. Feel free to download it and share it with anyone you think can use it. For more information about what the cloud service providers offer, see their documentation:

* Amazon Web Services: https://aws.amazon.com/documentation/
* Google Cloud Platform: https://cloud.google.com/docs/
* Microsoft Azure: ...
Monitoring cloud GPUs with CycleCloud

Graphics Processing Units (GPUs) provide a great boost for high performance computing, but they’re expensive and take time to purchase and install. With our CycleCloud software, you can get immediate access to just the right amount of cloud GPU time from Microsoft Azure, Google Cloud, and Amazon Web Services. GPU-enabled instances in CycleCloud enjoy the same features that traditional compute instances do: cost control, monitoring, and dynamic scaling. In our upcoming release, we’ve improved the monitoring experience, making it easier than ever to manage your cloud GPU instances. CycleCloud configures the monitoring automatically for GPU-enabled instances with drivers installed. You don’t need to do any of the setup yourself. When you click Show Detail on a cloud node in the CycleCloud interface, you can now see performance graphs and statistics alongside the other node information. When the node has GPUs, this includes the GPU usage and memory. The detail window also includes a Metrics tab, which shows all of the raw performance metrics reported by the Ganglia system monitoring platform. If you’re interested in learning more, stop by booth #530 at the GPU Technology Conference this week for a demo, or contact...