Software’s Key Role in Getting the Most Out of Cloud’s Computational Power

Much of the discussion about parallel scalability for massive engineering problems focuses on hardware. When a large partitioned CFD or FEA model is distributed over many nodes, a low-latency network becomes increasingly important as the number of nodes grows. However, for today’s post, I would like to shift attention away from the hardware and take a short look at what some independent software vendors (ISVs) do on the software side to make sure that computational cores sit idle as little as possible.

Convergent Science’s CONVERGE CFD code takes a less traditional approach, implementing Adaptive Mesh Refinement (AMR) in its solver. AMR is not unique to CONVERGE (it has been implemented in many software packages, including open-source codes like Gerris), but one of CONVERGE’s main applications is the internal combustion engine, where moving boundaries and high gradients make AMR a natural fit.

In the internal combustion engine, locations of high gradient may move from the valves to the injector tip to the spray region. To obtain the transient solution and maintain high fidelity in these high-gradient locations, the orthogonal mesh elements may be subdivided into 8 smaller elements, and this subdivision can be applied multiple times.
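To get a feel for how quickly repeated subdivision grows the mesh, here is a minimal sketch of the arithmetic. It assumes each refinement level halves a cell along all three axes, which is what splitting into 8 implies for an orthogonal cell. The function names and the 4 mm base cell size are hypothetical and are not CONVERGE settings.

```python
def refined_cell_count(levels: int) -> int:
    """Cells produced from ONE base cell after `levels` rounds of 8-way splitting."""
    return 8 ** levels


def refined_edge_length(base_edge: float, levels: int) -> float:
    """Edge length after `levels` splits, assuming each split halves every edge."""
    return base_edge / 2 ** levels


base_edge_mm = 4.0  # hypothetical base cell size, for illustration only

for level in range(4):
    cells = refined_cell_count(level)
    edge = refined_edge_length(base_edge_mm, level)
    print(f"level {level}: {cells:4d} cells per base cell, edge {edge} mm")
```

Three levels of refinement already turn one base cell into 512 cells, which is why refining only where and when the gradients demand it matters so much for cost.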

Because individual compute cores generally solve contiguous regions of the domain (to minimize communication with other cores), new elements created by refinement will typically be added to a single core. Likewise, when elements are removed, only one core may see the computational benefit of having fewer elements to solve. AMR would therefore cause major load imbalance were it not for the built-in load-balancing algorithm. Of course, load balancing itself takes computational time, so the CONVERGE user can specify the time interval at which cells are re-balanced across all the cores.
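The trade-off between imbalance and re-balancing frequency can be illustrated with a toy model. This is only a sketch under simplifying assumptions: every core starts with 1000 cells, refinement always adds cells to core 0, and re-balancing spreads cells perfectly evenly. None of the names or numbers come from CONVERGE, and its actual algorithm is not shown here.

```python
def imbalance(cells_per_core):
    """Ratio of the busiest core's cell count to the average (1.0 = perfect)."""
    mean = sum(cells_per_core) / len(cells_per_core)
    return max(cells_per_core) / mean


def rebalance(cells_per_core):
    """Idealized re-balance: spread all cells as evenly as possible."""
    base, extra = divmod(sum(cells_per_core), len(cells_per_core))
    return [base + (1 if i < extra else 0) for i in range(len(cells_per_core))]


def simulate(n_cores, n_steps, cells_added_per_step, rebalance_every):
    """Return the imbalance after each step; rebalance_every=0 disables it."""
    cells = [1000] * n_cores
    history = []
    for step in range(1, n_steps + 1):
        cells[0] += cells_added_per_step  # refinement lands on one core
        if rebalance_every and step % rebalance_every == 0:
            cells = rebalance(cells)
        history.append(imbalance(cells))
    return history


never = simulate(n_cores=4, n_steps=10, cells_added_per_step=400, rebalance_every=0)
every5 = simulate(n_cores=4, n_steps=10, cells_added_per_step=400, rebalance_every=5)
print("no re-balancing, final imbalance:", never[-1])
print("re-balance every 5 steps, worst imbalance:", round(max(every5), 3))
```

In this toy model a shorter interval keeps the worst-case imbalance lower, but each re-balance has its own cost in a real solver, which is why the interval is left to the user.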


CONVERGE load balance can be monitored using Rescale’s Live Tailing feature

In summary:

  • AMR reduces the overall computational cost and increases the solution accuracy compared to statically meshed domains
  • AMR ensures that a region is finely meshed, and therefore computationally expensive, only while the solution requires it rather than throughout the entire simulation
  • AMR settings can be customized to match the user’s desired solution accuracy
  • Built-in load balancing ensures that cores don’t sit idle waiting for other cores to solve computationally expensive regions

CONVERGE is available on demand on Rescale and is now easier than ever to run in just 4 easy steps:
(1) Select CONVERGE from the analysis list
(2) Select the number of cores
(3) Upload zip file of inputs
(4) Click “Submit” to run the job

When it comes to solving mechanical problems, non-linear jobs are some of the most computationally expensive. A leading non-linear solver is LS-DYNA, developed by Livermore Software Technology Corporation (LSTC) and, thanks to its strong reputation, integrated into many third-party packages. Like Convergent Science, LSTC focuses its main effort on the development of its solvers.

Mechanical problems that involve plasticity and large deformations require a non-linear solver. One of the main applications for this type of simulation is the crash test. The point of impact sees the most deformation and requires the most computation to solve. The traditional Recursive Coordinate Bisection (RCB) method splits the domain along its widest axis, leaving the front of the car to be solved by a single core, while the rest of the car, which sees no deformation, is shared among all the other cores. This kind of partitioning causes severe load imbalance.

As a solution, LS-DYNA allows the user to specify several decomposition settings. For example, the user can partition the domain of a car model into thin slivers running length-wise. When modeling a head-on crash, this kind of decomposition spreads the computationally expensive regions among all cores. Side-impact crash simulations would not benefit from this decomposition. A more in-depth description is available here.
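The effect of stretching one axis before bisection can be sketched in a few lines. This is a simplified 2D illustration, not LS-DYNA's implementation: the car is a hypothetical 4.0 m by 1.6 m grid of element centroids, the front 0.5 m is assumed to cost 20 times more per element once it deforms, and the partitioner splits by element count because it cannot know that cost in advance. All names and numbers are assumptions.

```python
def rcb(points, n_parts, scale=(1.0, 1.0)):
    """Recursive coordinate bisection of (x, y) points into n_parts (power of 2).

    `scale` stretches each axis before the widest one is chosen for the cut.
    """
    if n_parts == 1:
        return [points]
    extents = [
        (max(p[a] for p in points) - min(p[a] for p in points)) * scale[a]
        for a in (0, 1)
    ]
    axis = 0 if extents[0] >= extents[1] else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # split by element count, not by cost
    return (rcb(ordered[:mid], n_parts // 2, scale)
            + rcb(ordered[mid:], n_parts // 2, scale))


def cost(point):
    """Assumed cost model: elements in the front 0.5 m are 20x more expensive."""
    return 20 if point[0] < 0.5 else 1


def imbalance(parts):
    """Busiest part's load divided by the average load (1.0 = perfect)."""
    loads = [sum(cost(p) for p in part) for part in parts]
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical car footprint: 80 x 32 element centroids, x along the length.
car = [((i + 0.5) * 0.05, (j + 0.5) * 0.05) for i in range(80) for j in range(32)]

print("plain RCB imbalance:     ", round(imbalance(rcb(car, 8)), 3))
print("y scaled by 24 imbalance:", round(imbalance(rcb(car, 8, scale=(1.0, 24.0))), 3))
```

With no scaling, the cuts mostly fall across the car's length, so the crushed front ends up in only two of the eight parts. Scaling y makes every cut run length-wise, so each sliver receives an equal share of the front. For a side impact the expensive region would lie along one side instead, and the same slivers would concentrate it again.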

On Rescale, the user may add decomposition parameters by specifying a “pfile”. The pfile may contain a definition to specify the parts with sliding contacts (part 6) and the scaling factor in certain directions (scale y by 24x):

To use a pfile on Rescale, specify it on the Analysis command line. To run a single-precision mpp-hybrid (version 7.1.1) crash simulation over 64 cores (8 dmp ranks, each utilizing 8 smp threads), the Rescale user would run:

In short:

  • LS-DYNA’s RCB decomposition algorithm allows custom specification of important model parts and scaling along 1 or more axes
  • LS-DYNA provides a hybrid mpp solver which uses MPI as well as OpenMP technology to scale very large problems efficiently
  • The custom decomposition settings can be easily set in a pfile, so when experimenting with decomposition, the user will only have to re-upload a small file to Rescale as opposed to having to change and re-upload the large model file

Both LS-DYNA and CONVERGE are available on Rescale with short-term or hourly on-demand licensing. Please contact us if you are interested in running these software packages on Rescale.

This article was written by Mulyanto Poort.