While still in its infancy, deep learning has already significantly advanced fields such as autonomous driving, robotics control, machine translation, and facial recognition. By abstracting large volumes of data and recognizing, modeling, and classifying the patterns within them, deep learning looks poised to drive disruption and innovation in the years ahead. So what questions should you ask to evaluate whether deep learning and deep neural networks are the right solution for your complex task?

Do I have enough data? There are a variety of deep learning model types and architectures, but the common theme across all of them is a deep, layered structure. Deep means many interdependent model parameters. In order for your optimizer to find good values for all of these parameters, it needs many training examples of the task you want the model to perform. With some exceptions, notably transfer learning, if you do not have a large quantity of data or cannot generate a large number of examples quickly, you are better off training a smaller, “shallow” model.
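To make the parameter-to-data tradeoff concrete, you can simply count the weights and biases in a fully connected network. The layer sizes below are hypothetical, chosen only to contrast a "deep" stack against a "shallow" one:

```python
def mlp_param_count(layer_sizes):
    """Count weights + biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# A small "deep" network: 784 inputs, three hidden layers, 10 outputs.
deep = mlp_param_count([784, 512, 256, 128, 10])

# A "shallow" alternative with a single small hidden layer.
shallow = mlp_param_count([784, 32, 10])

print(deep, shallow)  # the deep net has over 20x more parameters to fit
```

Even this modest deep network has over half a million parameters, which is why it demands a correspondingly large number of training examples.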

Supervised, semi-supervised, or unsupervised? Is the data you have collected labeled with the desired result of the task or not? If the data is all labeled, then you can take advantage of supervised learning techniques. This is the “traditional” application for deep neural networks, and is used for tasks like image recognition, natural language translation, and voice recognition. Convolutional networks are typically used for image-based tasks whereas recurrent networks are used for language-based tasks.
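The supervised setting can be sketched in a few lines. This is a deliberately tiny stand-in, logistic regression on made-up labeled points rather than a deep network, but the loop is the same shape: predict, compare against labels, follow the gradient:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled dataset: points labeled by which side of a line they fall on.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.5

for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)         # gradient of the log loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean(((X @ w + b) > 0) == (y == 1))
print(f"training accuracy: {accuracy:.2f}")
```

A deep supervised model replaces the single linear layer with a stack of convolutional or recurrent layers, but it is trained by the same label-driven gradient descent.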

If none of your data is labeled, you can still take advantage of unsupervised learning to learn hidden features and structure within your data. Denoising autoencoders are an example of an unsupervised deep learning model.
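A denoising autoencoder can be sketched with nothing but NumPy. This toy version (the synthetic 8-dimensional data and single tanh hidden layer are illustrative assumptions, not a production architecture) learns to map noisy inputs back to clean ones without any labels:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy unlabeled data: 500 samples lying near a 2-D subspace of 8-D space.
basis = rng.normal(size=(2, 8))
clean = rng.normal(size=(500, 2)) @ basis
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

W1 = rng.normal(scale=0.2, size=(8, 2))  # encoder weights
W2 = rng.normal(scale=0.2, size=(2, 8))  # decoder weights
lr = 0.05

def forward(x):
    h = np.tanh(x @ W1)  # 2-D hidden code
    return h, h @ W2     # linear reconstruction

_, recon0 = forward(noisy)
err_before = np.mean((recon0 - clean) ** 2)

for _ in range(300):
    h, recon = forward(noisy)
    d_out = 2 * (recon - clean) / len(clean)  # d(MSE)/d(recon)
    dW2 = h.T @ d_out
    d_h = d_out @ W2.T * (1 - h ** 2)         # backprop through tanh
    dW1 = noisy.T @ d_h
    W1 -= lr * dW1
    W2 -= lr * dW2

_, recon = forward(noisy)
err_after = np.mean((recon - clean) ** 2)
print(f"reconstruction MSE: {err_before:.3f} -> {err_after:.3f}")
```

The network is only ever shown corrupted inputs and clean targets derived from the data itself, so the hidden structure (here, the 2-D subspace) is learned without any human labeling.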

The final category, semi-supervised or reinforcement learning, is the newest in the space, pioneered by the work at DeepMind. In this case, your data is sparsely labeled, and your model may be able to test new inputs to the system to get feedback. The defining example was learning to play Atari games, but applications have since been made to robotics and autonomous vehicles.
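The "sparse feedback" idea can be illustrated with tabular Q-learning on a made-up toy environment (deep reinforcement learning replaces the table with a neural network, but the update rule is the same). The agent only ever sees a reward at the goal, yet learns a policy for every state:

```python
import random

random.seed(0)

# A tiny deterministic corridor: states 0..3.
# Action 0 moves left, action 1 moves right; reaching state 3 pays +1.
N_STATES, N_ACTIONS, GOAL = 4, 2, 3

def step(state, action):
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.2

for episode in range(200):
    s, done = 0, False
    while not done:
        if random.random() < eps:                          # explore
            a = random.randrange(N_ACTIONS)
        else:                                              # exploit
            a = max(range(N_ACTIONS), key=lambda a: Q[s][a])
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (target - Q[s][a])  # TD update from sparse reward
        s = s2

policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy)  # the learned policy moves right, toward the goal
```

No state except the goal is ever labeled with a reward; the discounted updates propagate that single signal backward until every state knows which way to move.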

Does your system have to explain its decisions? Deep learning models have traditionally been considered black boxes with respect to the predictions they make. Given the number of parameters trained in a deep model, it is generally impossible to reconstruct the “reasoning” behind the answer a model gives. If you need to provide a “why?” along with a “what?”, you are better off choosing a model like decision trees or random forests, which expose the set of decisions made to arrive at a particular answer.
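The contrast is easy to see with the smallest possible tree: a depth-one decision stump found by brute force over a made-up one-feature dataset. Its entire "reasoning" is a single human-readable threshold:

```python
# Toy labeled data: (feature value, label).
data = [(1.0, 0), (2.0, 0), (2.5, 0), (3.5, 1), (4.0, 1), (5.0, 1)]

def best_stump(points):
    """Brute-force the threshold that best separates the two classes."""
    best = None
    xs = sorted(x for x, _ in points)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate split between adjacent values
        correct = sum((x > t) == (label == 1) for x, label in points)
        if best is None or correct > best[1]:
            best = (t, correct)
    return best

threshold, n_correct = best_stump(data)
print(f"rule: predict 1 if x > {threshold}  ({n_correct}/{len(data)} correct)")
```

The "why" behind any prediction is the rule itself, something no multi-million-parameter network can offer directly.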

If deep learning models are the right choice, you now face a host of new challenges:

  • Deep learning requires specialized GPU hardware and a lot of it. Many IT organizations new to machine learning just have not yet made the capital investment to have appropriate GPU resources on-premise.
  • Deep learning libraries are evolving very quickly, resulting in the need for frequent updates to stay current. Your deep learning pipeline can quickly become a stack of interdependent software packages that are difficult to keep in sync.
  • How do you manage your large datasets? Where does all that data live?

Rescale’s Deep Learning Cloud, in partnership with IBM, provides an integrated platform to solve the above problems. Leveraging IBM Cloud’s bare metal NVIDIA K80 servers, Rescale’s interactive Desktops provide you with powerful hardware to visualize and explore large datasets and design deep neural network models. When you are ready to scale up and train on large datasets, you can get instant access to GPU compute clusters and only pay for what you use with hourly pricing.

Deep Learning Cloud comes configured with the latest versions of popular deep learning software like TensorFlow and Torch, as well as IBM licensed analytics products such as SPSS. All software is already configured to take full advantage of NVIDIA GPUs via CUDA and cuDNN.

Finally, Rescale’s workflow management and collaboration tools combined with IBM storage and data transfer technology ease the burdens of migrating large datasets to the cloud and managing that data once it is there.

So what does running a deep learning task on Rescale look like? Here are the steps taken by a user to train a new deep neural network from scratch:

Dataset: Upload your image dataset using our optimized data transfer tools, or if your data is already hosted in IBM cloud, you can attach it directly.

Configuration: Set up a cluster through the Rescale web interface, configure the number of IBM Cloud GPUs you want to train on, the deep learning software you want to use, and the training script you want to run.

Launch: Within 30 minutes, your training cluster will be available and running your training script.

Monitor: View training progress via the web or direct SSH access, connect to GUIs such as TensorBoard (part of TensorFlow), and stop your training cluster whenever you want.

Review: Training results are automatically synced back to persistent storage. You can review results from the Rescale portal, download models to use, sync back to your own IBM Cloud storage account, or just use Rescale to run further inference and training on the existing model.

Try Rescale powered by IBM Cloud for free today at

This article was written by Mark Whitney.



Background and Challenge
Adaptive Corporation is a leading Digital to Physical Product Lifecycle Company that helps streamline business processes, reduce costs and improve efficiencies for customers that need to bring new products to market. Our CAE team runs explicit crash/impact simulations and large non-linear analysis for structures and components for our customers, who include leading manufacturers in Industrial Equipment, Aerospace, Auto, and Life Sciences. For these simulations, we use Abaqus, Nastran, Adams, Isight, fe-safe, and Tosca.

Our workload for our CAE consulting is project-based and varies quite a bit over the course of the year. Prior to using Rescale, we custom-built or purchased our own HPC systems and were resource-constrained: we either had underutilized computing capacity between projects or not enough capacity when multiple projects were underway at the same time. Too often, we experienced bottlenecks when we needed to solve multiple jobs simultaneously because we were limited by our smaller HPC system. Projects would run over schedule as a result.

Our mission is to craft tailored solutions that help our customers shorten development and production cycles throughout product planning, development, manufacturing, and aftermarket service processes. The simulation and analysis bottlenecks were undermining our ability to do that, so we needed to look for a solution. In addition, our on-premise hardware was out-of-date within 2 or 3 years, so we were spending a lot of time and effort every few years to upgrade our hardware and we needed an IT support team to maintain the servers and licenses.

The Rescale Solution
Two years ago, as bottlenecks became more frequent, we started to look to the cloud for bursting. We chose Rescale due to competitive pricing and their ability to work with us to customize the solution to our IT needs. We ditched our existing HPC system and now have a “reverse hybrid” cloud computing environment, in which we mostly rely on Rescale’s unlimited, on-demand capacity to meet the demands of our simulation engineers but occasionally run jobs on our desktops as well.

We have been running Abaqus on Rescale since 2014 and since we do so much computing on Rescale, we actually host our license on their server as well. On occasion, when we need to run simulations on our desktop computers, we’ll tap into our license server on Rescale. It’s a seamless process that saves us the headache of license management.

Results and Benefits
With instant access to more CPUs and faster hardware on Rescale, optimized to our simulation needs, our CAE consulting business has been able to solve jobs significantly faster than we could before: more than 10x faster! And with the availability of on-demand resources on the cloud, we now have access to hardware that can respond to our constantly changing compute needs. We’ve found it to be much more cost-effective to use the cloud, paying only for the hardware that we actually use, and to always have access to the current hardware offerings.

We’ve also been able to significantly reduce our IT overhead. We’ve eliminated the need for license server maintenance completely. We’ve reduced IT spending dramatically, never have underutilized compute capacity, and our employees can focus on engineering. We can now focus our resources on the value-added business activities, not on managing hardware. Rescale has helped us improve our operational efficiency.

About Adaptive Corporation
Adaptive Corporation is the leading Digital to Physical Product Lifecycle Company that helps streamline business processes, reduce costs, and improve efficiencies for customers who need to bring new products to market. Adaptive’s growth is powered by working closely with its 500+ customers to overcome the challenges around product development. Our customer base includes leading manufacturers of industrial and consumer products and the suppliers that provide the underlying sub-systems. Adaptive’s unique “Digital to Physical” product portfolio includes CAD/CAM, CAE, PLM, business analytics, metrology, and 3D printing solutions from leading IT providers. Using a combination of these offerings, the Adaptive team crafts tailored solutions that help our customers shorten development and production cycles throughout product planning, development, manufacturing, and aftermarket service processes. For more information, visit

This article was written by Adaptive Corporation.