Containerizing the CESM
by Beau Cronin Sunday, October 30, 2016

The key technical hypothesis behind Earthers is that real climate and earth system models can be used to power a simulation game. So some of the first questions that had to be answered were:

  • Are these model codes available?
  • Can they be packaged to run on commodity cloud hardware? 
  • How much do these model runs cost?

Because these seemed like the scariest questions for the whole project - i.e., the biggest risks - I tackled them first.

Being new to the climate modeling world, I began looking around at the various models that were used to make predictions for the IPCC reports. It turns out that there are a few different classes of climate model, from simple models that contain just a few differential equations and can be run instantaneously, all the way up to so-called "coupled earth system models" that include the spatiotemporal dynamics of the atmosphere, ocean, land, and biosphere. It is these more complex models that can track and predict how the climate will play out at each location (and altitude) on the globe. And because the game I had in my mind used the Earth as its basic "game board", these coupled models are the ones I was drawn to - in spite of their general hairiness and computational requirements.
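To make the "simple" end of that spectrum concrete, here is a minimal sketch of a zero-dimensional energy balance model: a single differential equation for the global mean temperature, integrated with forward Euler. It is a textbook toy, not anything taken from the CESM, and the albedo, emissivity, and heat capacity values are illustrative assumptions.

```python
# Zero-dimensional energy balance model: one ODE for global mean temperature,
#   C dT/dt = S0 * (1 - albedo) / 4 - emissivity * sigma * T^4
# A textbook toy, not part of the CESM; parameter values are assumptions.

SIGMA = 5.670374e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
S0 = 1361.0             # solar constant, W m^-2
ALBEDO = 0.3            # planetary albedo (assumed)
EMISSIVITY = 1.0        # effective emissivity (assumed; 1.0 means no greenhouse effect)
HEAT_CAPACITY = 4.0e8   # J m^-2 K^-1, roughly a 100 m ocean mixed layer (assumed)


def absorbed_flux():
    """Globally averaged absorbed sunlight in W m^-2."""
    return S0 * (1.0 - ALBEDO) / 4.0


def equilibrium_temperature():
    """Temperature (K) at which emitted radiation balances absorbed sunlight."""
    return (absorbed_flux() / (EMISSIVITY * SIGMA)) ** 0.25


def integrate(t0_kelvin, years, dt_seconds=86400.0):
    """Step the ODE forward with forward Euler and return the final temperature (K)."""
    temperature = t0_kelvin
    steps = int(years * 365 * 86400 / dt_seconds)
    absorbed = absorbed_flux()
    for _ in range(steps):
        emitted = EMISSIVITY * SIGMA * temperature ** 4
        temperature += dt_seconds * (absorbed - emitted) / HEAT_CAPACITY
    return temperature


if __name__ == "__main__":
    print("equilibrium (K):", round(equilibrium_temperature(), 1))
    print("after 50 years from 288 K:", round(integrate(288.0, 50), 1))
```

With these numbers the model settles at roughly 255 K in a fraction of a second, and that is the whole model: one number for the entire planet. A coupled earth system model instead solves for the state at every grid cell and level, which is where the computational cost comes from.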

There are a handful of such models, each developed and maintained by different laboratories and research consortia around the world (such as the ESM2M from GFDL). But I eventually settled on the Community Earth System Model (CESM) because:

  1. It is one of the leading earth system models, and is actively used for cutting-edge climate science
  2. It is a so-called "community model", meaning that users outside of the research group that develops the model are invited to use it for their own research, and are supported in doing so
  3. It is under active development and improvement, and has been for decades
  4. It seems to follow the best software engineering practices: the code is available under a reasonably permissive license, uses source control, follows a regular if infrequent public release cycle, etc.

Containerization

That said, the CESM is a beast, intended to be run on specialized HPC hardware. The core numerical routines are written in Fortran, config files are mostly XML, and the glue scripts are mainly Perl. The input data is mostly stored as NetCDF files, and the data repository is many TB in size. Could something this big - traditionally run on specialized, professionally-administered supercomputers - be made to run easily and reliably on standard cloud instances?

Given these uncertainties, the first big Earthers project was to package the CESM into a single Docker image that would accept parameter settings and output results to S3. That effort is contained in this GitHub repo, which contains scripts and configuration files to:

  1. Build a Docker image that contains the CESM
  2. Start a container of this image
  3. Launch a suitable EC2 instance that installs the necessary dependencies and starts a CESM run using one of these containers
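The three steps above can be sketched in Python as follows. This is not the code from the repo: the image tag, the environment variable names, and the helper functions are all hypothetical, and the sketch only builds the commands rather than running them. It assumes parameters are passed to the container as environment variables, which is one common approach but not necessarily the one used here.

```python
import shlex


def docker_build_cmd(tag, context_dir="."):
    """Step 1: argv for building an image from a Dockerfile in context_dir."""
    return ["docker", "build", "-t", tag, context_dir]


def docker_run_cmd(tag, params):
    """Step 2: argv for starting a container, with run parameters passed
    as environment variables (an assumption about the interface)."""
    cmd = ["docker", "run", "--rm"]
    for key in sorted(params):
        cmd += ["-e", f"{key}={params[key]}"]
    cmd.append(tag)
    return cmd


def user_data_script(tag, params):
    """Step 3: a boot script that could be passed to an EC2 instance as
    user data. Installing Docker and other dependencies is omitted."""
    run_line = " ".join(shlex.quote(part) for part in docker_run_cmd(tag, params))
    return f"#!/bin/bash\ndocker pull {shlex.quote(tag)}\n{run_line}\n"


if __name__ == "__main__":
    # Hypothetical image tag and parameter names, for illustration only.
    tag = "cesm-demo:latest"
    params = {"RUN_ID": "demo-001", "S3_BUCKET": "example-bucket"}
    print(" ".join(docker_build_cmd(tag)))
    print(user_data_script(tag, params))
```

Keeping the commands as plain lists makes them easy to inspect or log before handing them to something like `subprocess.run`, and the same run command can be reused both locally and in the instance boot script.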

This effort was a success, which was the first sign that Earthers had a fighting chance. CESM runs can be initiated with the launch script, with results being written to S3 as they are checkpointed. Metadata about each run is written to DynamoDB tables, so that run status and parameters can be tracked.
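As a rough illustration of that bookkeeping, the sketch below builds an S3 object key for a checkpoint file and a metadata record for a run. The key layout, the field names, and the status values are assumptions made for this example, not the schema the repo actually uses.

```python
from datetime import datetime, timezone


def checkpoint_key(run_id, checkpoint_index, filename):
    """S3 object key for one checkpoint file (hypothetical layout)."""
    return f"runs/{run_id}/checkpoint-{checkpoint_index:04d}/{filename}"


def run_metadata_item(run_id, params, status, now=None):
    """Metadata record for one run, shaped as a plain dict that could be
    stored as a DynamoDB item (field names are assumptions)."""
    now = now or datetime.now(timezone.utc)
    return {
        "run_id": run_id,              # would serve as the table's key
        "status": status,              # e.g. "running", "checkpointed", "done"
        "params": dict(params),        # parameter settings for this run
        "updated_at": now.isoformat(),
    }


if __name__ == "__main__":
    print(checkpoint_key("demo-001", 3, "restart.nc"))
    print(run_metadata_item("demo-001", {"years": 5}, "running"))
```

With boto3, the key would typically be used in an S3 client call such as `upload_file(Filename, Bucket, Key)`, and the record passed to a DynamoDB table resource via `put_item(Item=...)`. Zero-padding the checkpoint index keeps keys in order when listed lexicographically, which makes finding the latest checkpoint straightforward.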

This is really just a proof of concept, and there's still much to be done before this packaging will be truly useful as part of a larger software system: more robust job control and management, a complete system for setting and tracking model run parameters, and especially the ability to restart runs from previous stopping points (which is essential for the turn-based play of Earthers).

But it's an important start!