Google has officially thrown its hat into the IaaS cloud computing ring by opening up access to the Google Compute Engine (GCE) service to the general public. One of the differentiating features touted by Google is the performance of its networking infrastructure.

We decided to take the service for a quick spin to see what the interconnect performance was like within the context of the HPC application domain. In particular, we were interested in measuring the latency between two machines in an MPI cluster.

For our test, we spun up two instances, set up an OpenMPI cluster, and then ran the osu_latency benchmark from the OSU Micro-Benchmarks suite to measure the time it takes to send a 0-byte message between nodes in a ping-pong fashion. The numbers reported below are one-way latencies averaged over 3 trials; a new pair of machines was launched for each trial.

Instance Type    Trial #1    Trial #2    Trial #3    Average
n1-standard-1    183.12      172.57      169.90      175.20
n1-standard-2    192.27      202.51      196.20      196.99
n1-standard-4    169.97      170.96      177.03      172.65
n1-highcpu-2     176.34      210.81      192.04      193.06
n1-highcpu-4     205.00      176.11      159.95      180.35
n1-highmem-2     176.80      177.73      189.72      181.42
n1-highmem-4     173.78      175.94      185.85      178.52

*All latencies are measured in microseconds.
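
For context, osu_latency times a large number of small ping-pong exchanges and reports half of the average round-trip time as the one-way latency. The sketch below is a minimal MPI ping-pong in C written purely for illustration; it is not the benchmark’s actual source, and the iteration count is arbitrary:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 10000;   /* number of ping-pong exchanges (arbitrary) */
    char buf = 0;              /* buffer address for the 0-byte message */

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&buf, 0, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 0, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&buf, 0, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&buf, 0, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)  /* one-way latency = half of the average round-trip time */
        printf("one-way latency: %.2f us\n", (t1 - t0) * 1e6 / (2.0 * iters));

    MPI_Finalize();
    return 0;
}
```

Running it with one rank on each of the two instances (for example, mpirun -np 2 with a hostfile listing both nodes) forces the messages over the network rather than through shared memory.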

The reported latency numbers are roughly the same for all of the instance types we tested. The variance between tests is likely due to contention from other tenants on the machine. Benchmarking cloud compute instances is a notoriously tricky problem. In the future, we’ll look at running a more exhaustive test across more instances and over different time periods.

As a point of comparison, we see latencies between 70 and 90 microseconds when running the same test on Amazon EC2 instances. It is important to point out that this is not a true apples-to-apples comparison: Amazon offers special cluster compute instance types as well as placement groups, the latter of which provide higher bandwidth and lower latencies between machines in the same group. The GCE latency numbers appear to be closer to what Edward Walker reported for non-cluster compute instances on EC2. It seems likely that Google is concentrating on the more typical workload of hosting web services for now and will eventually turn its attention toward tuning its infrastructure for other domains such as HPC. At the moment, GCE appears to be better suited for workloads that are more “embarrassingly parallel” in nature.

It should be noted that these types of micro-benchmarks do not necessarily represent the performance that will be seen when running real-world applications. We encourage users to perform macro-level, application-specific testing to get a true sense of the expected performance. There are several ways to mitigate latency penalties:

  • For certain classes of simulation problems, it may be possible to decompose models into separate pieces that can be evaluated in parallel. The public cloud calls for a shift in thinking: rather than relying on a single on-premise cluster, it is possible to launch many smaller clusters that operate on the decomposed pieces at the same time.
  • Leverage hybrid OpenMP/MPI applications when possible. Reducing the amount of chattiness between cluster nodes is an excellent way of avoiding latency costs altogether; a minimal sketch of the hybrid approach follows this list.
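
To make the hybrid approach concrete, here is a small MPI+OpenMP sketch of our own; it is not tied to any particular solver, and the problem size and dummy computation are placeholders. Each node runs a single MPI rank, OpenMP threads use the cores within that node, and only one small message per node crosses the network:

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    /* One MPI rank per node; only the main thread makes MPI calls. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Illustrative work: each rank owns a contiguous chunk of the problem. */
    const long n = 10000000;              /* problem size (placeholder) */
    long chunk = n / nranks;
    long start = rank * chunk;
    long end   = (rank == nranks - 1) ? n : start + chunk;

    /* OpenMP threads split the rank's chunk across the cores of the node. */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (long i = start; i < end; i++) {
        local += 1.0 / (double)(i + 1);   /* placeholder computation */
    }

    /* Only one small message per node crosses the network. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d threads/rank=%d sum=%f\n",
               nranks, omp_get_max_threads(), total);

    MPI_Finalize();
    return 0;
}
```

Launched with one MPI rank per node and OMP_NUM_THREADS set to the number of cores on each node, intra-node parallelism stays on shared memory and only the final reduction touches the interconnect.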

We look forward to seeing the continued arms race amongst the various cloud providers, and we expect that HPC performance will continue to improve. As an example, Microsoft has recently announced a new HPC offering for Azure that promises InfiniBand connectivity between instances. As is usually the case, competition between large cloud computing providers is very good for the end customer. At Rescale, we are excited about the opportunity to continue providing our customers with the best possible performance.

This article was written by Ryan Kaneshiro.


This is a guest post by a Rescale customer, Eric Lee, discussing his first Rescale experience.

Hello everyone! My name is Eric Lee, and I am a graduate student at the University of Kansas. At KU, my research focuses on nanofluidics, specifically nanowetting problems. Nanofluidics is the study of nanoscale flows in and around nano-sized objects. At the nanoscale, physical behaviors that are not observed in larger structures, such as dispersion forces, thermal fluctuations, and hydrodynamic slip, become very important. Nanofluidics is the basis for miniaturizing microfluidic devices down to the nanoscale, for example in lab-on-a-chip devices for PCR and related techniques.

As part of my research, I use molecular dynamics codes such as LAMMPS to run computational analyses. LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a classical molecular dynamics code distributed by Sandia National Laboratories. It runs on single processors or in parallel using message-passing techniques and a spatial decomposition of the simulation domain, and the code is designed to be easy to modify or extend with new functionality.

In order to take full advantage of the scalability of LAMMPS, so that I could run more analyses faster and cycle through various results quickly, I needed access to a large computing cluster that could work immediately with LAMMPS. I heard about Rescale through a friend, who recommended that I try using it for my experiments.

I noticed that Rescale supported LAMMPS and many other software codes across domains, including any custom scripts that users wanted to upload. In addition, their hardware appeared to be state-of-the-art, and I wondered how much faster my LAMMPS jobs could run on Rescale.

I had some initial questions about how Rescale worked, and whether it could support my specific version of LAMMPS. Initially, Rescale didn’t include some custom LAMMPS packages necessary for my jobs, but their engineers got the packages running very quickly. I appreciated their quick responses to all my concerns – Rescale support usually got back to me within a few hours. They also sent me some very useful screenshots showing how I could set up my own LAMMPS job.


Screenshot of LAMMPS MPI workflow on Rescale

The process to run a job on Rescale is really simple. The main steps are:

(1) Give the simulation a name
(2) Choose the number of cores based on the estimated simulation steps (more cores usually means less simulation time)
(3) Upload the input file and associated data files
(4) Choose a serial or parallel analysis type
(5) Submit the job!

I was very impressed with the runtime performance. My LAMMPS simulations running on my local machines usually take about 4 days to finish. On Rescale, the same analyses finished in less than 12 hours. This massive productivity gain helped my research efforts tremendously. I was really happy with this, and I am definitely looking forward to running more jobs on Rescale. Until next time!

Eric Lee is a graduate student at the University of Kansas.

This article was written by Eric Lee.


Visualization of air velocity around aircraft landing gear

What is ‘live tailing’? Why did you build it?

The solvers in modern simulation codes for applications such as CFD, FEA, and molecular dynamics are becoming more sophisticated by the day. While they take advantage of (i) new hardware technologies and (ii) advances in numerical methods, many of these solvers require close monitoring to ensure they converge to a useful and correct solution. It is important to know when a simulation reaches an undesired state so that it can be stopped and the problem diagnosed.

At Rescale, we heard consistent feedback from our customers that they wanted to track the status of their jobs in real time. In response, we have recently added a powerful new feature to the platform that enables comprehensive monitoring in an easy and efficient way.

We call this feature ‘live tailing’.

Live tailing allows Rescale customers to monitor any output file for jobs running on the cluster with just one click. This feature replaces the often painful process of dealing with SSH keys, logging into the cluster, and deciphering where the relevant files are located on the server. Rescale’s live tailing is intuitive, user-friendly, highly secure, and much more efficient than traditional monitoring.
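
For readers unfamiliar with the term, “tailing” a file simply means watching it for newly appended lines as the solver writes them. The traditional manual approach boils down to a follow loop like the one sketched below; this is a generic illustration of the concept in C, not Rescale’s implementation:

```c
#include <stdio.h>
#include <unistd.h>

/* Minimal "tail -f"-style follow loop: print new lines as they are appended. */
int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <logfile>\n", argv[0]);
        return 1;
    }

    FILE *fp = fopen(argv[1], "r");
    if (!fp) {
        perror("fopen");
        return 1;
    }

    fseek(fp, 0, SEEK_END);          /* start at the current end of the file */
    char line[4096];
    for (;;) {
        if (fgets(line, sizeof line, fp)) {
            fputs(line, stdout);     /* new output from the solver */
            fflush(stdout);
        } else {
            clearerr(fp);            /* clear EOF so the next read can succeed */
            sleep(1);                /* wait for the solver to write more */
        }
    }
}
```

With live tailing, the same view is available directly in the browser without any of this plumbing.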

How does it work?

Once a customer submits a job, they can go to the Status page, where a list of active runs is displayed. Clicking on one of these runs will display all the files related to that particular job. Customers can scroll through the list or even text-search for a specific file. Clicking on the name of the desired file will display the user-specified number of lines for that particular file.


Live tailing section in relation to the Status Page

Why is it useful?

As engineers, we recognize how important it is to track the status of any analysis at any time. Here are some examples of useful applications for live tailing:

  • Monitor the progress of a simulation, either to extrapolate the total expected runtime or to ensure that the simulation doesn’t enter an undesired state.
  • View output plots to quickly analyze important trends and metrics of the simulation.
  • Monitor load balancing for parallelized simulations to diagnose inefficient behavior and to help choose the correct number of processors.
  • Monitor time-step conditions, such as the CFL number or adaptive grid criteria, to ensure that the simulation doesn’t “blow up.” Simulations that creep along or grow out of control in time or size can now be stopped quickly.

Does live tailing work with image files as well?

Yes. Some simulation codes can generate image files such as meshes, graphs, or surface plots. These files can be live tailed as well. Clicking on a JPG, PNG, or GIF file will display the image right inside the browser. Check out this aircraft landing gear example using Gerris (http://gfs.sourceforge.net/wiki), an open-source CFD code, with data provided by the AIAA.


Live tailing can also display analysis-generated images

How can I try it?

Contact us at support@rescale.com – we can share existing jobs with you so you can see how it works.

This article was written by Mulyanto Poort.