Getting started with MS-MPI on Azure


It has been about a year and a half since we released a reusable Azure Cloud Service for provisioning a simple Windows MS-MPI cluster without having to install HPC Pack. Azure has undergone a lot of changes since that time and we thought it would be worth revisiting this topic to see what the current landscape looks like for running Windows MPI applications in the cloud.

First, Cloud Services and the so-called “IaaS v1 Virtual Machines” have been relegated to “Classic” status in the Azure portal. Microsoft now recommends that all new deployments use Azure Resource Manager (ARM). Azure Resource Manager allows clients to submit a declarative template written in json that defines all of the cloud resources such as VMs, load balancers, and network interfaces, that need to be created as part of an application or cluster. Dependencies can be defined between resources and the Resource Manager is smart enough to parallelize resource deployment where it can. This can make deploying a new cluster or application much faster than the old model. Azure Resource Manager is essentially the equivalent of CloudFormation on AWS. There are some additional niceties here though such as being able to specify loops in the template. Dealing with conditional resource deployment however, is clunkier in ARM templates than in CloudFormation. Both services suffer from trying to support programming logic from within json. All in all however, ARM deployments are much easier to manage than Classic ones.

The Azure Quickstart Templates project on Github is a great resource for finding ARM templates. Deploying an application is literally as simple as clicking a Deploy to Azure button and filling in a few template parameter values. On the HPC front, there is a handy HPC Pack example available that can be used to provision and setup the scheduler.

However, as we touched on in our original blog post, using HPC Pack may not be the best choice if you are getting started with MPI and simply want to spin up a new MPI cluster, test your application, and then shut everything back down again. While HPC Pack provides the capabilities of a full blown HPC scheduler, this additional power comes at the cost of some resource overhead on the submit node (setting up Active Directory, installing SQL Server, etc). This can be overkill if you just want a one-off cluster to run a MPI application.

Another potentially lighter weight option for running Windows MPI applications in the cloud is the Azure Batch service. Recently, Microsoft announced support for running multi-instance MPI tasks on a pool of VMs. This looks to be a useful option for those who interested in automating the execution of MPI jobs however it does require some investment of developer resources to become familiar with the service before MPI jobs can be run.

We feel there is a still room for an Azure Resource Manager template that 1) launches a bare-bones Windows MPI cluster without the overhead of HPC Pack and 2) allows MPI jobs to be run from the command line or a batch script from any operating system.

On that second point above, another interesting development since our original post is that Microsoft has decided to officially support SSH for remote access. Since that announcement, the pre-release version of the code has been made available on GitHub.

So, given those pieces we decided to put together a simple ARM template to accomplish both of those goals. For someone getting started with MS-MPI, we feel this is a simpler option to getting your code running on a Windows cluster in Azure.

Here is a basic usage example:

  1. Click the Deploy To Azure button from the Github project. Fill in the template parameters. Here, a 2-node Standard_D2 cluster is being provisioned:

  2. Make a note of the public IP address assigned to the cluster when the deployment completes.
  3. The template will enable SSH and SFTP on all of the nodes. Upload your application to the first VM in the cluster (N0).  Here we are using the hello world application from this blog post.
  4.  SSH into N0, copy the MPI binary into the shared SMB directory (C:\shared), and run it. Enter your password as the
    argument to the -pwd switch (redacted below). The -savecreds command line argument will securely save your credentials on
    the compute nodes so you don’t have to specify the password in future mpiexec calls. See
    here for more details.

And that’s it! For those that are more GUI-inclined, RDP is also opened up to all of the instances in the MPI cluster. Head on over to the Github project page for more details.

This article was written by Ryan Kaneshiro.