complexity

The world is incredibly complex, and nowhere is this more evident than in the art of writing software. Software is the result of intricate interactions between social structures, business desires, and limited knowledge. The complexity of those interactions inevitably shows up in code structure. Our job as engineers is to manage this, and prepare our abstractions to handle increasing complexity over time. The most important technique in this regard is to make space for pieces of logic to take on more complexity without drowning in it – with complexity, the dose makes the poison.

This is why one of the most common pieces of software writing advice that you’ll hear is to write lots of small, focused classes with logic split up into lots of small, focused methods. Software written in this style provides space for any individual piece to grow in complexity. On the Rescale development team, we’ve found that the time to split off logic from a growing method or class is much earlier than developers typically think. We prefer to start pulling out abstractions as we write first and second versions of classes and methods.

To illustrate with an example, we were recently writing some code to parse third party xml files that described workflows – files to transfer, analyses to run, and variable definitions. At first our parsing code was relatively simple, mostly because we didn’t know everything that we would need to parse. We started by just parsing input files and variables. Each of those is represented by an xml node, and each xml node indicates its type with an attribute named type, further indicating important values with attributes that depend on that type.

Some initial parsing code looked like:
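
A sketch of that first version (the element and attribute names below are illustrative, not the exact ones from the third party files):

    // Reconstruction for illustration; element/attribute names are made up,
    // but the shape matches the description: filter nodes by their type
    // attribute, then pull out a type-specific value attribute.
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;
    import java.util.stream.Stream;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class WorkflowParser {
        private final Document document;

        public WorkflowParser(Document document) {
            this.document = document;
        }

        public List<String> parseInputFiles() {
            return getNodeStream(document.getElementsByTagName("node"))
                    .filter(node -> "file".equals(
                            node.getAttributes().getNamedItem("type").getNodeValue()))
                    .map(node -> node.getAttributes().getNamedItem("filename").getNodeValue())
                    .collect(Collectors.toList());
        }

        public List<String> parseVariables() {
            return getNodeStream(document.getElementsByTagName("node"))
                    .filter(node -> "variable".equals(
                            node.getAttributes().getNamedItem("type").getNodeValue()))
                    .map(node -> node.getAttributes().getNamedItem("name").getNodeValue())
                    .collect(Collectors.toList());
        }

        // Converts a NodeList from the document into a Stream<Node>.
        private static Stream<Node> getNodeStream(NodeList nodeList) {
            return IntStream.range(0, nodeList.getLength()).mapToObj(nodeList::item);
        }
    }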

The helper method getNodeStream above converts a NodeList from the document into a stream for ease of manipulation.

There are two things to notice about this code – it uses magic strings instead of constants, and duplicates code to extract attribute values. After applying those simple refactorings, the parsing code is less cluttered with implementation details and reads more closely to its intent:
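
Roughly, continuing the same illustrative sketch:

    // Magic strings become named constants, and attribute extraction moves
    // into a small helper.
    private static final String TYPE_ATTRIBUTE = "type";
    private static final String FILE_TYPE = "file";
    private static final String FILENAME_ATTRIBUTE = "filename";
    private static final String VARIABLE_TYPE = "variable";
    private static final String NAME_ATTRIBUTE = "name";

    private static String getAttributeValue(Node node, String attributeName) {
        return node.getAttributes().getNamedItem(attributeName).getNodeValue();
    }

    public List<String> parseInputFiles() {
        return getNodeStream(document.getElementsByTagName("node"))
                .filter(node -> FILE_TYPE.equals(getAttributeValue(node, TYPE_ATTRIBUTE)))
                .map(node -> getAttributeValue(node, FILENAME_ATTRIBUTE))
                .collect(Collectors.toList());
    }

    public List<String> parseVariables() {
        return getNodeStream(document.getElementsByTagName("node"))
                .filter(node -> VARIABLE_TYPE.equals(getAttributeValue(node, TYPE_ATTRIBUTE)))
                .map(node -> getAttributeValue(node, NAME_ATTRIBUTE))
                .collect(Collectors.toList());
    }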

This is our first example of how best practices can help prepare code for increasing complexity. On the surface, it doesn’t seem like we’ve done much, but by writing code that’s closer to what we mean, rather than what the computer does, we’ve made it easier to add more complexity to this code because we’ll have less context to keep in our heads as we write new additions.

If parsing out these collections of strings was all we ever did with this xml, it would be fine to leave this code as is. But, this being software, things got more complex. After writing out the third or fourth type/attribute parsing pair, we decided to extract some enums:
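
Sketched with the same illustrative names:

    import org.w3c.dom.Node;

    // One enum for the node types we care about, one for the attributes we
    // read off each node.
    enum NodeType {
        FILE("file"),
        VARIABLE("variable");

        private static final String TYPE_ATTRIBUTE = "type";
        private final String typeName;

        NodeType(String typeName) {
            this.typeName = typeName;
        }

        public boolean matches(Node node) {
            return typeName.equals(
                    node.getAttributes().getNamedItem(TYPE_ATTRIBUTE).getNodeValue());
        }
    }

    enum NodeAttribute {
        FILENAME("filename"),
        NAME("name");

        private final String attributeName;

        NodeAttribute(String attributeName) {
            this.attributeName = attributeName;
        }

        public String getFrom(Node node) {
            return node.getAttributes().getNamedItem(attributeName).getNodeValue();
        }
    }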

Now the parsing code looks like:
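
Something along these lines, reusing the sketch's getNodeStream helper:

    public List<String> parseInputFiles() {
        return getNodeStream(document.getElementsByTagName("node"))
                .filter(NodeType.FILE::matches)
                .map(NodeAttribute.FILENAME::getFrom)
                .collect(Collectors.toList());
    }

    public List<String> parseVariables() {
        return getNodeStream(document.getElementsByTagName("node"))
                .filter(NodeType.VARIABLE::matches)
                .map(NodeAttribute.NAME::getFrom)
                .collect(Collectors.toList());
    }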

It seems like it might be overkill to extract logic here, but we were motivated to do so because the enums provide a central location to define these constants for reuse. Another benefit we soon learned of was that they provided space for the different pieces to become more complex. Now you might be thinking: it’s just pulling out an attribute from an xml node, so how could that become more complex? We certainly didn’t think it would or could.

But it turns out that in this third party xml, some nodes referenced files with an attribute named filename and others with an attribute named fileName. This is the kind of thing that makes programmers curse, but luckily we were prepared to handle it with ease:
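
Continuing the illustrative sketch, the change stayed inside the attribute enum:

    import java.util.Arrays;
    import java.util.List;
    import java.util.Objects;
    import org.w3c.dom.Node;

    // FILENAME now accepts either capitalization; none of the calling code changes.
    enum NodeAttribute {
        FILENAME("filename", "fileName"),
        NAME("name");

        private final List<String> candidateNames;

        NodeAttribute(String... candidateNames) {
            this.candidateNames = Arrays.asList(candidateNames);
        }

        public String getFrom(Node node) {
            return candidateNames.stream()
                    .map(name -> node.getAttributes().getNamedItem(name))
                    .filter(Objects::nonNull)
                    .map(Node::getNodeValue)
                    .findFirst()
                    .orElseThrow(() -> new IllegalArgumentException(
                            "Expected one of " + candidateNames
                                    + " on node " + node.getNodeName()));
        }
    }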

None of our other parsing code had to change. If we had kept on using string constants, we would have had to make many updates across the parsing code to check for filename or fileName, or else write special methods for nodes referencing files. The code would have gotten more cluttered with if/else logic. Since we abstracted early, though, we had a place to put this logic. We want to reiterate that we didn’t expect this difference in attribute casing, but that’s exactly the point – you should expect code to become more complex in ways you don’t expect.

Why do some nodes use filename and others use fileName? We can guess that two different people worked on serializing the different nodes, and they didn’t know the capitalization scheme the other had chosen. Perhaps they communicated verbally and decided on “filename” as an attribute, but one used camel casing. Or perhaps they worked on one node after the other, and forgot what capitalization scheme had been used.

Whatever the case may be, the complexity of the social structure of the third party’s development team is manifesting in this difference of attribute names, and showing through to our codebase. It’s our job to be prepared for that kind of complexity, to be ready to handle it with appropriate structures.

This article was written by Alex Kudlick.


This article is written and published by Steve Conway, Vice President of IDC Research, for IDC Link.

In IDC’s worldwide studies of high performance computing (HPC) end-user sites, the proportion of sites employing cloud computing—public or private—has steadily grown from 13.8% in 2011, to 23.5% in 2013, to 34.1% in 2015. Also represented in this mix is the growing contingent of hybrid clouds that blur the public-private distinction by combining on-premise and external resources.

Our research shows that there are persistent concerns about data security in public clouds, even though data security and confidentiality have generally improved. For example, Amazon Web Services, the most popular public cloud among HPC users, is now compliant with HIPAA, the federal Health Insurance Portability and Accountability Act designed to safeguard the privacy and security of health information. Another important brake on sending work off premise has been the widespread perception that this means using public clouds that are suitable only for embarrassingly parallel workloads—ones that can readily be subdivided into smaller jobs, each of which can be run independently.

San Francisco-based Rescale is one of the newer companies (2011) that is altering the landscape of what can be done effectively beyond an organization’s firewall. Nearly all of the privately held company’s principals have strong backgrounds in fields where structural analysis and fluid-structure interactions are important. Founder and CEO Joris Poort began his career as a structural and software engineer working on the Boeing 787 “Dreamliner” airplane, as did co-founding CTO Adam McKenzie. Sales VP Tony Spagnuolo headed aerospace sales at MSC.Software following stints at Rockwell International and General Motors. Marketing VP Shing Pan led solver product marketing at Altair Engineering. The company has deep bench strength in low-latency, IO-intensive HPC.

The result is a business model and products (ScaleX Enterprise, ScaleX Developer, and ScaleX Pro) that exploit contracts not only with public cloud service providers, but also with private clouds and large HPC centers. Rescale matches customer requirements to the appropriate resource or resources, leveraging ISV partnerships with Ansys, CD-adapco, Dassault Systemes, ESI, MSC.Software, and Siemens PLM, along with popular ISVs in fields including energy, life sciences and weather forecasting, plus open source software such as OpenFoam. Customers decide where their data is stored. Aerospace and defense organizations are among the company’s initial customers. As expected, pricing options are risk-based, with advance bookings priced lower than last-minute requests. Rescale says it addresses public cloud data security through encryption, including ITAR standards if needed.

Rescale is one of only a few companies accommodating requirements for low latency interconnects and IO-intensive computing outside of customers’ firewalls. This small cadre of vendors will need to perform some market education to help dispel archaic notions of what is possible off premise. Meanwhile, Rescale reports that its year/year revenues grew three-fold in 2014. IDC believes that the management team has created a business model that positions Rescale well to benefit from the growth IDC expects in HPC cloud computing and related outsourced services.

Download article here.

This article was written by Steve Conway.

compress

One of the key challenges with cloud HPC is minimizing the amount of data that needs to be transferred between on-premise machines and machines in the cloud. Unlike traditional on-premise systems, this transfer occurs over a much slower and less reliable Wide Area Network. As we’ve touched on previously, the best thing to do is perform post-processing remotely and avoid transferring data unnecessarily.

That said, a common scenario for many users is to run a simulation and then transfer all of the output files from the job back to their workstation.

After a job has completed, each file in the working directory is encrypted and uploaded to cloud storage. This provides flexibility for users who only need to download a small subset of the output files to their machine. However, the tradeoff is that each file introduces additional overhead in the transfer. When transferring data over a network, the more data that can be packed into a single file, the better. Further, many engineering codes emit files that are highly compressible. Although compressing a file takes extra time, this can still be a net win if the time spent compressing plus transferring the smaller file is less than the time spent uploading the larger, uncompressed archive. Even if the compression and ensuing transfer take longer overall, the real bottleneck in the overall transfer process is the last hop between cloud storage and the user’s workstation. Having a smaller compressed file to transfer here can make an enormous difference depending on the user’s Internet connection speed.

If you know beforehand that you will need to download all of the output files for a job, then in general it is best to generate a single compressed archive file first instead of transferring each file individually. The Linux tar command provides an easy way to create a compressed archive; however, it does not utilize the extra computing power available on the MPI cluster to generate the archive.

Jeff Gilchrist has developed an easy-to-use bz2 compressor that runs on MPI clusters (http://compression.ca/mpibzip2/). We compiled a Linux binary with a static bzip2 library reference and have made it available here for download to make it easier to incorporate into your own jobs. The binary was built with the OpenMPI 1.6.4 mpic++ wrapper compiler. Please note that it may need to be recompiled depending on the MPI flavor that you are using.

To use it, upload the mpibzip2 executable as an additional input file on your job. Then, the following commands should be appended to the end of the analysis command on the job settings page.

tar cf files.tar --exclude=mpibzip2 *

mpirun -np 16 mpibzip2 -v files.tar

find ! -name 'files.tar.bz2' -type f -exec rm -f {} +

First, a tar file is created called files.tar that contains everything except the parallel bzip utility. Then, we launch the mpibzip2 executable and generate a compressed archive called files.tar.bz2. Finally, all files except files.tar.bz2 are deleted. This prevents both individual files AND the compressed archive from being uploaded to cloud storage.

Note that the -np argument on the mpirun call should reflect the number of cores in the cluster. Here, the commands are being run on a 16 Nickel core cluster.

One additional thing to be aware of is that Windows does not support bz2 or tar files by default. 7-Zip can be installed to add support for this format along with many others.

As a quick test we built compressed archives from an OpenFOAM job that contained 2.1 GB worth of output data spread over 369 files and uploaded the resulting file to cloud storage.


As a baseline, we built an uncompressed tar file. We also tried creating a gzip compressed tar file using the -z flag with the tar command. Finally, we tried building a bz2 compressed archive with 8, 16, and 32 Nickel cores.

Not surprisingly, in the baseline case, building the archive takes a negligible amount of time and the majority of the overall time is spent uploading the larger file. When compressing the file, the overall time breakdown is flipped: The majority of the time is spent compressing the file instead. Also unsurprisingly, leveraging multiple cores provides a nice speedup over using the single-core gzip support that comes with the tar command. At around 16 cores, the overall time is roughly the same as the baseline case.

The real payoff for the compression step, however, becomes evident when a user downloads the output to his or her local workstation: the compressed bz2 file is almost 5 times smaller than the uncompressed tar (439 MB vs. 2.1 GB).

To reiterate, we believe that pushing as much of your post-processing and visualization as possible into the cloud is the best way to minimize data transfer. However, for those cases where a large number of output files are needed, you can dramatically reduce your transfer times by spending a little bit of time preparing a compressed archive in advance. We plan on automating many of the manual steps described in this post and making this a more seamless process in the future. Stay tuned!

This article was written by Ryan Kaneshiro.


New platform solution empowers CIOs and IT professionals to transform legacy on-premise systems into an agile HPC environment that accelerates time to market and drives product innovation

San Francisco, May 18, 2015 – Rescale, an industry leader in computer aided engineering (CAE) simulation and high performance computing (HPC), announced availability today of the innovative cloud HPC and simulation solution ScaleX™ Enterprise.

ScaleX Enterprise is Rescale’s secure and flexible solution for enterprise companies, combining its award-winning engineering simulation platform with a powerful administrative portal and comprehensive developer toolkit. ScaleX Enterprise is designed for CIOs and IT professionals to instantly and securely deploy elastic hardware and software resources, supplementing their existing on-premise computing infrastructure while maintaining full control across the entire IT stack.

“Today, a responsive IT environment is critical to support the dramatically increasing and highly variable user demand for simulation,” said Joris Poort, CEO of Rescale. “ScaleX Enterprise makes it possible for CIOs and IT professionals to effectively transform their legacy on-premise IT infrastructure into a dynamic environment that is high performing, scalable, and secure, further improving the business bottom line with better products and accelerating the time to market.”

“The powerful combination of Rescale’s ScaleX Enterprise platform and NVIDIA GPUs will dramatically improve simulation performance for engineers and scientists,” says Jeff Herbst, Vice President of Business Development at NVIDIA.

ScaleX Enterprise’s administrative portal provides a comprehensive set of features to manage user account settings, configure role-based permissions, control budgets and billing, and implement security policies. Enterprises can customize and integrate the portal with existing on-premise licenses and hardware resources, providing a unified, hybrid environment for all HPC and simulation needs.

“ScaleX Enterprise is an ideal solution for our organization, empowering our engineers to develop the most innovative products and perform groundbreaking research and development for our clients,” says Wayne Tanner, President, Leading Edge Engineering.

With an infrastructure network of over 30 advanced global data centers, ScaleX Enterprise features the latest computing technology from Intel and NVIDIA, as well as over 120 simulation applications, including those developed by leading vendors such as Siemens PLM, CD-adapco, Dassault Systemes, ANSYS, and MSC Software – all optimized, certified, and cloud-enabled. Rescale’s ScaleX Enterprise simulation platform delivers the largest worldwide HPC network and most advanced simulation capabilities directly to enterprise organizations.

Learn more information about ScaleX Enterprise: www.rescale.com/products/enterprise/

About Rescale

Rescale is the world’s leading cloud platform provider of simulation software and high performance computing (HPC) solutions. Rescale’s platform solutions are deployed securely and seamlessly to enterprises via a web-based application environment powered by preeminent simulation software providers and backed by the largest commercially available HPC infrastructure. Headquartered in San Francisco, CA, Rescale’s customers include global Fortune 500 companies in the aerospace, automotive, life sciences, marine, consumer products, and energy sectors. For more information on Rescale products and services, visit www.rescale.com.

This article was written by Rescale.