The Fédération Internationale de l’Automobile (FIA) released the latest version of the sporting and technical regulations for the 2017 Formula One season on April 29, 2016.  This document is the rulebook that all Formula One teams must follow in 2017.  The restrictions on CFD simulations are defined in Appendix 8 (Aerodynamic Testing Restrictions), section 2, of the sporting regulations.

Computational Fluid Dynamics (CFD), a widely accepted methodology in automotive aerodynamics R&D, has proven effective at shortening design turnaround time. Its biggest upside is that it involves no part manufacturing: all proofs of concept (PoC) can be done on computers.  In the high-end segment of the industry, such as sports and racing vehicle makers, CFD is used even more intensively. In this blog post, I’ll illustrate why Formula One teams should leverage the cloud to advance their CFD designs and why the FIA, as the governing body of the sport, would also benefit from pushing it forward.


This article was written by Irwen Song.


Three years ago we looked at Google’s IaaS service – Google Compute Engine (GCE) – for its networking performance, and Ryan posted the results in his blog post. Back then, the conclusion was that GCE instances were better suited to typical web-hosting workloads, with performance-tuning room left before they would suit HPC applications. Recently, we revisited GCE with its latest instance offerings.

Benchmark Tools
To keep the results comparable with the old ones, we are still using the OSU Micro-Benchmarks, now at the latest version, 5.3.2. Among all the benchmarks offered, we picked the two most relevant: osu_latency for the latency test and osu_bibw for the bidirectional bandwidth test.
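For reference, the two tests run across a pair of VMs roughly like this (the hostnames and build path below are placeholders, not the actual instance names we used; with MPICH, mpirun dispatches one rank to each host):

```shell
# Run the latency test and the bidirectional bandwidth test
# between two GCE instances (hostnames are hypothetical).
mpirun -hosts gce-node-1,gce-node-2 -n 2 \
    ./osu-micro-benchmarks-5.3.2/mpi/pt2pt/osu_latency
mpirun -hosts gce-node-1,gce-node-2 -n 2 \
    ./osu-micro-benchmarks-5.3.2/mpi/pt2pt/osu_bibw
```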

Test Environment
Operating System: Debian GNU/Linux 8 (jessie)

MPI Flavor: MPICH3

Test Instances
Since we are testing the interconnect performance between VM instances, we want to make sure the instances we launch actually sit on different physical hosts, so that the traffic goes through the underlying network rather than through the host machine’s memory.

So we picked the biggest instance of each series:

n1-standard-32, n1-highmem-32 and n1-highcpu-32

Test Results
For latency (in microseconds):

Instance Type    Trial #1   Trial #2   Trial #3   Average
n1-standard-32   45.68      47.03      48.46      47.06
n1-highmem-32    43.17      43.08      36.87      41.04
n1-highcpu-32    47.11      48.51      48.17      47.93

(message size: 0 bytes)

For bidirectional bandwidth: (MB/s)

Instance Type    Trial #1    Trial #2    Trial #3    Average
n1-standard-32   808.28      864.91      872.36      848.52
n1-highmem-32    1096.35     1077.33     1055.20     1076.29
n1-highcpu-32    847.68      791.16      900.32      846.39

(message size: 1,048,576 bytes)

Summary of Results
For network latency, the averages fall around 41–48 microseconds, roughly 4x faster than the previous result of around 180 microseconds. The new latency is also fairly consistent across the smaller instance types.

For bandwidth, we don’t have a previous result to compare against, but among the GCE instance types tested, we found that n1-highmem-32 has the best performance, averaging around 1,076 MB/s. This result aligns with GCE’s official documentation.



Google released TensorFlow, an open source machine learning library, last November, and it attracted huge attention in the field of AI. TensorFlow has been called “Machine Learning for Everyone” since it is relatively easy to get hands-on with, even for those who don’t have much experience in machine learning.  Today we are excited to announce that TensorFlow is now available on Rescale’s platform.  This means you can create and train your machine learning models with TensorFlow using just a web browser.  I’ll walk you through how in this blog post.

Let’s Start With a Simple Case

We’ll start with the first official TensorFlow tutorial: MNIST for ML Beginners.  It introduces what the MNIST dataset is and how to model and train it in TensorFlow with softmax regression, a basic machine learning method.  Here we’ll focus on how to set the job up and run it on the Rescale platform.
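As a refresher, softmax regression itself can be sketched without any framework at all. Below is a toy, dependency-free version of the idea; the made-up 2-D points stand in for MNIST images, and the tutorial of course implements this with TensorFlow instead:

```python
import math

def softmax(scores):
    # Shift by the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(xs, ys, num_classes, lr=0.5, steps=200):
    # One weight vector (plus a bias) per class; the prediction is
    # argmax of softmax(Wx + b).
    n = len(xs[0])
    W = [[0.0] * n for _ in range(num_classes)]
    b = [0.0] * num_classes
    for _ in range(steps):
        for x, y in zip(xs, ys):
            logits = [sum(wi * xi for wi, xi in zip(W[c], x)) + b[c]
                      for c in range(num_classes)]
            probs = softmax(logits)
            for c in range(num_classes):
                # d(cross-entropy)/d(logit_c) = prob_c - 1{c == y}
                grad = probs[c] - (1.0 if c == y else 0.0)
                for i in range(n):
                    W[c][i] -= lr * grad * x[i]
                b[c] -= lr * grad
    return W, b

def predict(W, b, x):
    logits = [sum(wi * xi for wi, xi in zip(W[c], x)) + b[c]
              for c in range(len(W))]
    return logits.index(max(logits))

# Toy separable data: class 0 near (0, 0), class 1 near (1, 1).
xs = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
ys = [0, 0, 1, 1]
W, b = train(xs, ys, num_classes=2)
print([predict(W, b, x) for x in xs])  # [0, 0, 1, 1]
```

The gradient step is the same cross-entropy update the tutorial derives; TensorFlow just computes it automatically and in bulk.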

You can assemble the Python script from the tutorial in a local editor; the full script simply puts all of the tutorial’s snippets together.  Now, we need to run it on Rescale’s GPU hardware.

First, you need an account; if you don’t have one yet, click here to create one.

If you want to skip the hassle of setting up the job step-by-step, you can also click here to view the tutorial job and clone it into your own account.

After registering, log in to Rescale and click the “+ New Job” button on the top left to create a new job.


Click “upload from this computer” and upload your python script to Rescale.


Click “Next” to go to the Software Settings page and choose TensorFlow from the software list.  Currently 0.7.1 is the only version supported on Rescale, so choose it, and type “python ./”, followed by your script’s filename, in the Command field.  Select “Next” to go to the Hardware Settings page.


In Hardware Settings, choose core type Jade and select 4 cores.  This job is not very compute intensive, so we choose the minimum valid number of cores.  We can skip the post-processing for this example, and click “Submit” on the Review page to submit the job.


It will take 4–5 minutes to launch the server and about 1 minute to run the job.  While the job is running, you can use Rescale’s live tailing feature to monitor the files in the working directory.

After the job finishes, you can view the files on the results page.  Let’s take a look at process_output.log, the output of the Python script we uploaded.  On the third line from the bottom, we can verify that the accuracy is 91.45%.


A More Advanced Model

In the second TensorFlow tutorial, a more advanced model is built with a multilayer convolutional network to increase the accuracy to 99.32%.

To run this advanced model on Rescale, you can simply repeat the process of the first one and replace the python script with the new model from the tutorial.  You can also view and clone an existing job from here.

Single GPU vs. Multiple GPU Performance Speedup Test

If you have more than one GPU in your machine, TensorFlow can utilize all of them for better performance.  In this section, we benchmark a machine with a single K520 GPU against one with four K520 GPUs and measure the speedup.

We used the CIFAR10 convolutional neural network example as our benchmarking job.  From the results, we can see that with 4 times the number of GPUs, throughput (examples processed per second) is only 2.37 times the single-GPU figure.
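In other words, the scaling is well short of linear; a quick back-of-the-envelope check using the measured 2.37x figure:

```python
gpus = 4
speedup = 2.37  # measured throughput ratio vs. a single K520
efficiency = speedup / gpus
print("scaling efficiency: {:.0%}".format(efficiency))  # scaling efficiency: 59%
```

The remaining ~40% is overhead, largely from moving data and gradients between GPUs on every step.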


Work Ahead

TensorFlow released a distributed version (v0.8) on 4/13/2016, which can spread the workload across the GPUs of multiple machines.  It would be very interesting to see its performance on a multi-node, multi-GPU cluster.  Before that, we’ll make the process of launching a multi-node, multi-GPU cluster with TensorFlow support on Rescale as simple as possible.


Import this job to your account



When we start prototyping our first web application with Django, we tend to create one Django app and put all the models into it.  The reason is simple: there are not that many models, and the business logic is straightforward.  But as the business grows, more models and business logic get added, and one day we may find our application in an awkward position: it is harder and harder to locate a bug, and it takes longer and longer to add new features, even simple ones.  In this blog post we’ll talk about how to use separate Django apps to reorganize models and business logic so that the code base scales with the growth of the business.  We will also illustrate the flow of the change with a simple case study.

Prototyping stage – a simple case study

We start with a simple application called “Weblog,” which allows users to create and publish blog posts. We create an app called weblog, with the following models.

In weblog/
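The original snippet isn’t preserved in this archive, but a minimal sketch of the models might look like this (field names are illustrative, in the Django 1.x style of the time):

```python
from django.contrib.auth.models import User
from django.db import models


class Blog(models.Model):
    author = models.ForeignKey(User)
    title = models.CharField(max_length=200)
    content = models.TextField()
    is_published = models.BooleanField(default=False)
    created_at = models.DateTimeField(auto_now_add=True)
```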

Now assume the rest of the application is completed based on the models above. Users can log in, create, and publish their blog posts using our application.

Evolving approach I – keep adding new business logic into the same app

Say we have a new requirement: in order to attract more authors to create content on our application, we’ll pay authors based on view counts. The price is $10 for every 1,000 views, and the payout is sent once a month.

Since the new requirement sounds pretty simple, the usual approach of putting the new models and logic into the existing app seems good enough. First, we add a new model in the “weblog” app:

In weblog/
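Again, the original snippet isn’t preserved; a sketch of the new model (with illustrative field names) might be:

```python
from django.db import models


class MonthlyViewCount(models.Model):
    blog = models.ForeignKey('Blog')
    year = models.IntegerField()
    month = models.IntegerField()
    count = models.IntegerField(default=0)

    class Meta:
        unique_together = ('blog', 'year', 'month')
```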

For each new view of a blog post, we increase the count for the current month, or create a new MonthlyViewCount record if this is the first view of the month. The code looks like:
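The original code isn’t preserved here, but with the ORM details stripped away the logic is essentially “get or create this month’s record, then increment it.” A dependency-free sketch, where a dict stands in for the MonthlyViewCount table:

```python
import datetime

# (blog_id, year, month) -> view count; stands in for MonthlyViewCount rows.
monthly_counts = {}

def increase_view_count(blog_id, today=None):
    today = today or datetime.date.today()
    key = (blog_id, today.year, today.month)
    # The first view of the month creates the record; later views increment it.
    monthly_counts[key] = monthly_counts.get(key, 0) + 1

increase_view_count(7, datetime.date(2016, 4, 1))
increase_view_count(7, datetime.date(2016, 4, 2))
print(monthly_counts)  # {(7, 2016, 4): 2}
```

In the real app this maps to a MonthlyViewCount.objects.get_or_create() call followed by an increment on the returned record.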

At the end of each month, we run a cron task that aggregates all the view counts for each author and sends the payments accordingly. Here’s the pseudo code:

In weblog/
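The pseudo code isn’t preserved either, but the aggregation itself boils down to the following (the author names and records below are made up; the real cron task would pass each author’s total to whatever payment integration the app uses):

```python
def monthly_payouts(view_records, rate_per_1000=10.0):
    """view_records: iterable of (author, views) pairs for the past month."""
    totals = {}
    for author, views in view_records:
        totals[author] = totals.get(author, 0) + views
    # $10 for every 1,000 views.
    return {author: views / 1000.0 * rate_per_1000
            for author, views in totals.items()}

records = [("alice", 2500), ("bob", 800), ("alice", 500)]
print(monthly_payouts(records))  # {'alice': 30.0, 'bob': 8.0}
```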

The approach above seems fine for handling a simple change request like this. But there will always be new requirements coming in.

Evolving approach II – organize the business logic into different Django apps

Now we have another new requirement. To encourage authors to create content in certain categories, the business team wants a category-based award strategy: each category has a different award price. Say the award price table looks like the following:

Category   Price (per 1,000 views)
Tech       $15
Sports     $10
Fashion    $5

This new requirement also looks simple enough: we could just create a new model in the existing app to store the per-category price and update the cron task to look up the category-based price when aggregating the totals. The whole change would take less than 30 minutes, and everything would be good to go.
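The lookup itself really is that small. A sketch using the rates from the table above (the default rate for uncategorized posts is an assumption):

```python
# Per-category rates per 1,000 views, mirroring the table above.
CATEGORY_RATES = {"Tech": 15.0, "Sports": 10.0, "Fashion": 5.0}
DEFAULT_RATE = 10.0  # assumed fallback for uncategorized posts

def payout(category, views):
    rate = CATEGORY_RATES.get(category, DEFAULT_RATE)
    return views / 1000.0 * rate

print(payout("Tech", 2000))    # 30.0
print(payout("Fashion", 500))  # 2.5
```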

But there are two major problems with cramming more and more new models and business logic into the main “weblog” app:

  1. The main app becomes responsible for business logic from several different domains. The class files grow bigger and less maintainable.
  2. Agility is compromised: it is harder to debug an issue, and adding new features is slower.

In Django, we can use separate apps to organize the business logic of different domains and use signals to handle the communication between apps. In our example, we’ll move all the billing-related models and methods into a new app called “billing.”

First we move all the billing-related models into the new billing app.

In billing/

Now, for each new view of any blog post, the billing app needs to be informed so that it can record the view accordingly. To do so, we can define a signal in the “weblog” app and create a signal handler in the “billing” app to process it.

We move Blog.increase_view_count() into billing/ as a signal handler:

Then a new signal is created in weblog/

And we also need to inject a signal-sending snippet into one of the view methods in weblog/

Finally, we can move the billing-related cron task send_viewcount_payment_to_authors from weblog/ to billing/ and add new logic to handle the category-based pricing.
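Putting the pieces together, the wiring looks roughly like this. To keep the sketch self-contained and runnable, it uses a minimal stand-in for django.dispatch.Signal that exposes the same connect()/send() shape; the module comments and handler names are illustrative:

```python
class Signal:
    """Minimal stand-in for django.dispatch.Signal."""
    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        return [(r, r(sender, **kwargs)) for r in self._receivers]


# weblog side: declare the signal (in the real app, in the weblog app).
blog_viewed = Signal()

# billing side: the handler that replaces Blog.increase_view_count().
view_counts = {}

def record_view(sender, blog_id, **kwargs):
    view_counts[blog_id] = view_counts.get(blog_id, 0) + 1

blog_viewed.connect(record_view)

# weblog side: fire the signal from the blog-detail view.
blog_viewed.send(sender="weblog.views", blog_id=42)
blog_viewed.send(sender="weblog.views", blog_id=42)
print(view_counts)  # {42: 2}
```

With real Django signals, the only differences are importing Signal from django.dispatch and making sure the billing handlers module is imported at startup so connect() actually runs.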

Compared with the regular approach of simply putting everything new into the main app, this approach requires more code changes and refactoring, but several merits make it worthwhile:

  1. The business logic of a specific domain is segregated from the other domains, which makes the code base easier to maintain.
  2. If an issue occurs at runtime, the cause can be promptly narrowed to a single app based on the symptom, which shortens debugging time.
  3. When a new developer onboards, they can start by working on a single app, which eases the learning curve.
  4. If we decide to deprecate the whole set of business logic in a specific domain (e.g., the billing features are no longer needed), we can simply remove that app, and everything else should continue to run normally.


A lot of startups use Django to prototype their product or service, and Django can handle the growth of their business pretty well.  An important practice is to rethink and reorganize the business logic into different apps from time to time, keeping the responsibility of each app as simple as possible.
