Azure MS-MPI in a Box

One of the things we touched upon in an earlier blog post is the relative difficulty of setting up a Microsoft MPI cluster on Windows Azure, because the official documentation recommends installing HPC Pack. As we discovered, it is possible to manually install and configure a Microsoft MPI cluster without HPC Pack, but this process is not well documented.

Today, we are happy to release a self-contained Cloud Service package that installs and configures an InfiniBand Microsoft MPI cluster for Azure PaaS. We feel that this “MPI-in-a-box” functionality makes it much easier to spin up a cluster for one-off purposes without needing to make the investment in installing and maintaining a full HPC Pack deployment.

To spin up a connected cluster of A9 instances, you can simply download and deploy the pre-built package, MPIClusterService.zip, and tweak a few settings in the accompanying .cscfg file. The package contains a startup script that installs and configures Microsoft MPI on the nodes. The startup script takes care of opening the firewall ports for inter-node communication and building a basic machinefile that can be used in mpiexec calls. In addition, the script installs a Cygwin OpenSSH server on every role instance so that you can access the cluster remotely.

You will need to configure a few values in the .cscfg file:

First, make sure to specify the number of A9 instances that should be launched in the Instances element. Next, at a minimum, you’ll need to provide values for the adminuser.publickey and jobuser.publickey settings so you can log in to the machines after they boot. The different ConfigurationSettings are listed below:

adminuser: The name of the user that will be created in the Administrators group.
adminuser.publickey: The SSH public key for the adminuser. Added to the ~/.ssh/authorized_keys list.
jobuser: The name of the less-privileged user that will be created in the Users group. This is the user that should run mpiexec.
jobuser.publickey: The SSH public key for the jobuser. Added to the ~/.ssh/authorized_keys list.
blob.storageurl: The location that the startup script downloads programs from when booting up. The MS-MPI and Cygwin distributables are located here. Rescale hosts the necessary files so you shouldn’t need to modify this.
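
As a rough illustration, the relevant part of a .cscfg file might look like the fragment below. The Instances element and the setting names come from the list above. The service name, role name, user names and key values are placeholders invented for this sketch, so keep whatever the shipped file already contains for those, and leave blob.storageurl at its shipped value.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: serviceName and the Role name are placeholders,
     not the names used in the actual package. -->
<ServiceConfiguration serviceName="MyMpiCluster"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="MpiNode">
    <!-- Number of A9 instances to launch -->
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="adminuser" value="clusteradmin" />
      <Setting name="adminuser.publickey" value="ssh-rsa AAAA... admin@example" />
      <Setting name="jobuser" value="jobuser" />
      <Setting name="jobuser.publickey" value="ssh-rsa AAAA... job@example" />
      <!-- Placeholder: keep the value that ships with the package -->
      <Setting name="blob.storageurl" value="(value from the shipped .cscfg)" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```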

Once you’ve filled out the values in the .cscfg file, you can deploy the service through the Azure Management web page or script it out with the Management API.

After the instances are up and running, you can use SSH to connect to each of the role instances. The Cloud Service is set up to use Instance Input Endpoints to allow clients to connect to individual role instances through the load balancer. The OpenSSH server running on port 22 on the first role instance is mapped to the external port 10106. The OpenSSH server on the second role instance is mapped to 10107, the third to 10108 and so on.

So, if you deployed a cloud service called foobar.cloudapp.net, you’ll want to use a command like the following to log in to the first role instance in your cluster:

ssh -i [jobuser-private-key-file] -p 10106 jobuser@foobar.cloudapp.net
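
Because the external port is just 10106 plus the position of the instance, the login command is easy to generate for any node. Here is a minimal sketch, assuming instances are counted from zero (index 0 is the first role instance) and reusing the foobar.cloudapp.net example. The function names and the key file name are made up for illustration.

```shell
# Sketch: map a zero-based role instance index to its external SSH port.
# Assumes the mapping described above: first instance -> 10106, second -> 10107, ...
instance_ssh_port() {
  local index="$1"
  echo $((10106 + index))
}

# Print (not run) the ssh command for every instance in a cluster.
# Arguments: service host name, number of instances, private key file.
print_ssh_commands() {
  local host="$1" count="$2" key="$3" i=0
  while [ "$i" -lt "$count" ]; do
    echo "ssh -i $key -p $(instance_ssh_port "$i") jobuser@$host"
    i=$((i + 1))
  done
}

# Example: list the login commands for a three-node cluster.
print_ssh_commands foobar.cloudapp.net 3 jobuser_key
```

Printing the commands rather than running them keeps the sketch harmless; copy the line for the node you want and run it by hand.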

SCP can be used to transfer files into the cluster (though note that you’ll need to use -P instead of -p to specify the custom SSH port).
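
As an illustration, a small wrapper along these lines could copy a file to a given node. It is only a sketch: the function name, the SCP_BIN override, the key file name and the file names are hypothetical, the host and user reuse the foobar.cloudapp.net example, and the instance index is assumed to start at zero.

```shell
# Sketch: copy a local file to one role instance over its custom SSH port.
# Set SCP_BIN=echo to print the arguments instead of running scp (dry run).
scp_to_instance() {
  local index="$1" key="$2" src="$3" dest="$4"
  local port=$((10106 + index))
  # scp takes the port with -P (uppercase); lowercase -p preserves
  # file times and modes instead.
  "${SCP_BIN:-scp}" -i "$key" -P "$port" "$src" "jobuser@foobar.cloudapp.net:$dest"
}

# Dry run: show the arguments scp would receive for the second instance (index 1).
SCP_BIN=echo scp_to_instance 1 jobuser_key input.dat work/shared/
```

Dropping the SCP_BIN=echo prefix runs the real transfer.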

The startup script will launch the SMPD process on all of the machines as the user that is specified in the jobuser setting. This means that you will need to make sure to log in as this user in order to run mpiexec.

A machinefile is written out to the jobuser’s home directory, which can be used in the mpiexec call. For example, after SSHing into the first role instance as the jobuser, the following command will dump the hostnames of each machine in the cluster:

$ mpiexec -machinefile machinefile hostname
RD00155DC0E6D8
RD00155DC0BEB3
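
Before launching a real job, it can be handy to confirm that the machinefile covers every node. The sketch below counts the hosts in a machinefile, assuming one host per line with blank lines and lines starting with # ignored. The exact format written by the startup script is not described here, so treat that as an assumption; the function name is made up.

```shell
# Sketch: count the hosts listed in a machinefile.
# Assumes one host per line; blank lines and lines starting with '#' are skipped.
count_machinefile_hosts() {
  awk 'NF > 0 && $1 !~ /^#/ { n++ } END { print n + 0 }' "$1"
}

# Demo with a throwaway file holding the two hostnames from the output above.
demo_file="$(mktemp)"
printf 'RD00155DC0E6D8\nRD00155DC0BEB3\n' > "$demo_file"
count_machinefile_hosts "$demo_file"
rm -f "$demo_file"
```

On a cluster node you would call count_machinefile_hosts ~/machinefile and compare the result with the instance count set in the .cscfg file.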

Finally, the startup script will also configure a basic Windows SMB file share amongst all the nodes in the cluster. The jobuser can access this folder from the ~/work/shared path. This is an easy way to distribute files amongst the nodes in the cluster. Note, however, that you will likely see better performance if you use something like azcopy to have each node download its input files from blob storage instead.

The source code for the Cloud Service is available on GitHub. Feel free to fork and contribute.

MPIClusterService.zip

This article was written by Ryan Kaneshiro.