The Rescale platform provides end-to-end file management backed by storage offerings from the major public cloud vendors. This includes optimized client-side transfer tools as well as in-transit and at-rest encryption. In this model, Rescale controls the object store layout and encryption key management, so users must go through Rescale tooling to retrieve decrypted file content. While this is convenient if you are starting from scratch and want a fully managed, secure solution, a common question is how to use the platform with input data that has already been uploaded to the cloud. Another use case we see is integrating with an existing data pipeline that operates directly on simulation output files sitting in a customer-controlled storage location. For cost and performance reasons, it is important to keep your compute as close to your storage as possible. One of the benefits of Rescale’s platform is that we support a number of different cloud providers and can bring the compute to whichever cloud storage accounts you are already using.

In this post, we will show how customers can transfer input and output files to and from a user-specified location instead of using the default Rescale-managed storage. For this example, we’ll focus on Amazon S3; however, a similar approach can be used with any provider. We will walk through the setup of a design of experiments (DOE) job whose input and output files reside in a customer-controlled bucket. Let’s assume that the bucket is called “my-simulation-data”, the input files are all prefixed “input”, and all output files generated by the parameter sweep should be uploaded to a path prefixed by “output”.

This DOE will run, in parallel, over the HSDI and Pintle Injector examples for CONVERGE CFD found on our support page (https://support.rescale.com/customer/en/portal/articles/2579932-converge-examples). Normally, the DOE framework is used to change specific numerical values within an input file, but here we will use it to select a completely different input zip for each run.

First, upload the CONVERGE input zips (hsdi_fb_2mm.zip and pintle_injector.zip) to the s3://my-simulation-data/input/ directory in S3.
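If you have the AWS CLI configured locally, this can be done with a couple of copy commands (a sketch; any S3 upload tool works):

```
aws s3 cp hsdi_fb_2mm.zip s3://my-simulation-data/input/
aws s3 cp pintle_injector.zip s3://my-simulation-data/input/
```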

Next, create a file locally called inputs.csv that looks like the following:
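A minimal sketch of this file, based on the template variable names and S3 paths used later in this post:

```
s3_input,s3_output
s3://my-simulation-data/input/hsdi_fb_2mm.zip,s3://my-simulation-data/output/hsdi/
s3://my-simulation-data/input/pintle_injector.zip,s3://my-simulation-data/output/pintle/
```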

In order to give Rescale compute nodes access to the bucket, an IAM policy needs to be created that provides read access to the input directory and full access to the output directory:
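One possible policy along these lines (adjust the actions and resources to your own security requirements):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-simulation-data",
      "Condition": {
        "StringLike": { "s3:prefix": ["input/*", "output/*"] }
      }
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-simulation-data/input/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-simulation-data/output/*"
    }
  ]
}
```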

Note that another way to accomplish this is to set up cross-account access (http://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html). This is the preferable way to configure access if all compute nodes will run in AWS; however, the approach above works regardless of where the compute nodes are executing.

Now, attach this policy to an IAM user and generate an access key and secret key. These should then be placed into an AWS config file saved locally:
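A sketch of that file, in the standard AWS CLI config format (the region value is an assumption; use whichever region holds your bucket):

```
[default]
aws_access_key_id = <your-access-key-id>
aws_secret_access_key = <your-secret-access-key>
region = us-east-1
```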

Save the above to a file called config.

The last file that needs to be created locally is the run script template. We will reference the s3_input and s3_output variables from the inputs.csv created above in a shell script template file that will be executed for each run. Create a file called run.sh.template that looks like:
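A sketch of the template under a few assumptions: the AWS CLI is available on the compute nodes, the uploaded config file is picked up by pointing AWS_CONFIG_FILE at it, template variables use the ${...} form, and converge_cmd is a placeholder for the actual CONVERGE command line for your case:

```
#!/bin/bash
# Make the uploaded AWS config file visible to the AWS CLI.
export AWS_CONFIG_FILE=$PWD/config

# Download this run's input archive from the customer-controlled bucket and
# unpack it ourselves, since Rescale's automatic unarchiving is bypassed here.
aws s3 cp ${s3_input} input.zip
unzip -o input.zip

# Run the solver (placeholder; substitute the actual CONVERGE invocation).
converge_cmd

# Upload everything produced by the run to the user-specified output prefix,
# then delete it locally so it is not also uploaded to Rescale storage.
aws s3 cp . ${s3_output} --recursive
rm -rf *
```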

There are a couple of things to point out in the above script. Normally, the Rescale platform will automatically unarchive zip files; however, in this case we need to handle that ourselves, since we are bypassing Rescale storage for our inputs. The rm -rf * at the end of the script deletes all of the output files after uploading them to the user-specified S3 location. If we omit this step, the output files will also be uploaded to Rescale storage after the script exits.

Now that the necessary files have been created locally, we can configure a new DOE job on the platform that references them. From the New Job page (https://platform.rescale.com/jobs/new-job/setup/input-files/), change the Job Type to DOE and configure the job as follows:

  1. Input Files: Upload config
  2. Parallel Settings: Select “Use a run definition file” and upload inputs.csv
  3. Templates: Upload run.sh.template. Use run.sh as the template name
  4. Software: Select CONVERGE 2.3.X and set the command to run.sh
  5. Hardware: Onyx, 8 cores per slot, 2 task slots

Submit the job. When the job completes, all of the output files can be found in the s3://my-simulation-data/output/hsdi/ and s3://my-simulation-data/output/pintle/ directories.
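You can verify this from any machine with access to the bucket, for example with the AWS CLI:

```
aws s3 ls s3://my-simulation-data/output/ --recursive
```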

In this DOE setup, the ancillary setup data (e.g., the AWS config file, the CSV file, and the run script template) is encrypted and stored in Rescale-managed storage. The meat of the job, the input and output files, is stored in the user-specified bucket.

We do recognize that the above setup requires a little manual work to get configured. One of the things on our roadmap is to provide better integration with customer-provided storage accounts. Stay tuned for details!

This article was written by Ryan Kaneshiro.

The web is the preferred delivery mechanism for most applications these days, but there are scenarios where you might want to build a CLI or desktop application for your customers to use. However, once you leave the cozy confines of the browser, there is a whole slew of proxy configurations that your poor application will have to deal with if it needs to run within a typical corporate network.

For the purposes of this post, “typical corporate network” means your users are running some flavor of Windows and are sitting behind an authenticating HTTP proxy. While this does seem like a pretty common setup, a surprising number of applications will simply not work in this environment.

Thankfully, when writing a .NET application, the default settings get you most of the way there for free. The default web proxy will automatically use whatever proxy settings the user has configured in IE. If possible, this is what you should rely on. It is tempting to expose proxy hostname and port configuration values that the user can pass to the application; however, in some cases a corporate user may not have a single well-known proxy to use. WPAD and PAC files allow proxies to be configured dynamically. See this post for more gory details.

Unfortunately, the default settings do not handle authentication for you out of the box. Web requests will typically fail with a 407 ProxyAuthenticationRequired error. The next step is to examine the Proxy-Authenticate response header returned to see what type of authentication the proxy accepts. Typically this will be some combination of Basic, Digest, NTLM, or Negotiate. If the proxy supports either NTLM or Negotiate, then it is possible to automatically authenticate the signed-in user running your application by simply adding the useDefaultCredentials=true attribute to your app.config as described here:
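In app.config terms, this corresponds to something like the following minimal sketch of the system.net/defaultProxy configuration element:

```
<configuration>
  <system.net>
    <defaultProxy useDefaultCredentials="true" />
  </system.net>
</configuration>
```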

This is particularly nice because we don’t have to modify any of our application code or deal with the headaches of credential management. Alas, this won’t work if the proxy is configured to use Basic or Digest authentication. While this is an unusual setup, it is something you will come across in the wild every so often. If this is the case, then you will need a way to read in a username and password and then store them in the IWebProxy.Credentials property. As pointed out here, this setup is not typically used because it puts the burden on every application to manage proxy credentials.

In C#, the default proxy settings configured in the app.config are reflected in the WebRequest.DefaultWebProxy static variable. Rather than directly modifying its Credentials, it is cleaner to create a decorator for the proxy that passes through the read requests but manages its own set of credentials without touching the underlying proxy:
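A minimal sketch of such a decorator; the class name CredentialOverridingProxy is illustrative, not from the original post:

```
using System;
using System.Net;

// Illustrative decorator: forwards proxy resolution to an existing IWebProxy
// while supplying its own credentials, leaving the wrapped proxy untouched.
public class CredentialOverridingProxy : IWebProxy
{
    private readonly IWebProxy _inner;

    public CredentialOverridingProxy(IWebProxy inner, ICredentials credentials)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        _inner = inner;
        Credentials = credentials;
    }

    // Credentials managed by this wrapper, independent of the inner proxy.
    public ICredentials Credentials { get; set; }

    // Read operations simply pass through to the underlying proxy.
    public Uri GetProxy(Uri destination)
    {
        return _inner.GetProxy(destination);
    }

    public bool IsBypassed(Uri host)
    {
        return _inner.IsBypassed(host);
    }
}
```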

Then, you can do something like the following to use the default proxy settings with custom credentials:
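For example (hypothetical usage; how the username and password are collected is up to the application):

```
using System.Net;

// Keep a reference to the proxy configured via app.config / IE settings.
var systemProxy = WebRequest.DefaultWebProxy;

// Use the same proxy resolution logic, but with credentials supplied by the user.
var userCredentials = new NetworkCredential("username", "password");
WebRequest.DefaultWebProxy = new CredentialOverridingProxy(systemProxy, userCredentials);

// To switch back to the original configuration, restore the saved reference:
// WebRequest.DefaultWebProxy = systemProxy;
```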

This lets you easily switch back to the original credentials that were configured in the app.config or use a different set as needed.

Note that while all of this is pretty straightforward for people using .NET, it might not be as easy to support authenticating proxies (particularly ones that only accept NTLM and Negotiate) in HTTP libraries used in other languages. In these scenarios, some people have had success using cntlm as a local, non-authenticating proxy that handles authentication against the upstream proxy on the application’s behalf.
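A rough idea of what that looks like, assuming a cntlm.conf along these lines (check the cntlm documentation for the exact directives your version supports):

```
# Illustrative cntlm.conf; values are placeholders for your own
# domain account and corporate proxy.
Username    jdoe
Domain      CORP
# cntlm -H can generate password hashes to use here instead of plaintext
Password    secret
Proxy       proxy.corp.example.com:8080
# Local port your application should use as its (non-authenticating) proxy
Listen      3128
```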

TL;DR: For people writing applications in .NET, you should simply set useDefaultCredentials=true in your app.config file and that should “just work” most of the time.

This article was written by Ryan Kaneshiro.