HPC Systems Engineering

Example dependency graph computed by HPCPorts and used to install and upgrade packages in the correct order.

Software Package Management

Consistent management of software stacks is an often-neglected area on HPC systems. A typical GNU/Linux distribution ships with a set of packages that are either compiled consistently on a build server and then distributed, or compiled locally with a single compiler version. However, these packages are usually built with only a moderate level of optimization, and not every piece of software is available as a package provided by the OS. Additionally, HPC systems often run older (but more stable) versions of a given OS.

There is a clear need for a mechanism to manage optimized and consistently-compiled software in the HPC world. This was my reason for creating HPCPorts.
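Installing packages "in the correct order" from a dependency graph, as the figure caption above describes, amounts to a topological sort. The following is a minimal sketch of that general idea using Python's standard library. It is not HPCPorts code, and the package names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency graph: each package maps to the set of
# packages that must be installed before it.
deps = {
    "zlib": set(),
    "mpi": set(),
    "hdf5": {"zlib", "mpi"},
    "fftw": {"mpi"},
    "netcdf": {"hdf5"},
}


def install_order(graph):
    """Return package names so that dependencies precede dependents."""
    return list(TopologicalSorter(graph).static_order())


order = install_order(deps)
print(order)
```

Any order produced this way lists "zlib" and "mpi" before "hdf5", and "hdf5" before "netcdf". A cycle in the graph raises graphlib.CycleError instead of returning an order.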

Data Management

The HPSS tape storage system is an essential tool for data management at NERSC and other facilities. The interfaces to this system are fairly crude, and consist of ftp-like (hsi) and tar-like (htar) clients. I have frequently missed having the features of a software version control system for moving data around. Specifically, I wanted to keep a "master" copy of my data on HPSS and then do "checkouts" of the data onto the NERSC global filesystem and onto the local scratch disks on individual machines. I also wanted to conveniently "commit" data back to the master copy and find out the current status of an active checkout. This led to writing a small Perl program called hsync.

$> hsync
<**************************************************************>

  Usage:  hsync [<options>] <command> [<option 1>] [<option 2>] ...

          General Options are:
            -v (print hsync version)
            -d (force deletion of missing files on update/commit)
            -A <authmethod> (passed to hsi)
            -h <host/port> (passed to hsi)
            -k <keytab path> (passed to hsi)
            -l <login name> (passed to hsi)
            -S (disable staging during update)
            -b (for debugging, status prints all local and hpss commands)

          Available commands and options are:

          checkout <path to HPSS directory> [<optional local directory>]
            Get a local copy of the specified HPSS directory.

          commit [<optional path>]
            Recursively synchronize files from current
            working directory (or optional path) to HPSS.
            The local directory must be the result of a
            previous checkout command.

          update [<optional path>]
            Recursively update local copy of files in current
            working directory (or optional path) from HPSS.
            Locally modified files (with newer timestamps)
            are not overwritten.

          status [<optional path>]
            Display the changes that will happen during the
            the next update or commit on the current working
            directory or the specified path.
            U|A  : next update adds this local file
            U|M  : next update overwrites this existing local file
            U|D  : next update deletes this local file
            C|A  : next commit adds this file to HPSS
            C|M  : next commit overwrites HPSS file with local one
            C|D  : next commit deletes HPSS file

          export [<optional path>]
            Recursively remove the .synchsi files in current working
            directory or specified path. hsync will no longer
            consider this directory to be a checkout.

<****************************************************************>

Given the feature set of hsi (and my lack of desire to write code that directly interfaces with the HPSS libraries), there are limits to how "smart" hsync can be. Basically, hsync just compares the modification times of the local and HPSS copies of a file to see which is newer. It also checks the type of each object, so it can deal with name collisions between files and directories. If the local and HPSS files have the same timestamp but different sizes, it assumes that the larger file is the "correct" one. Unlike CVS or Subversion, hsync copies only whole files to and from HPSS.

To find the version of hsync you are currently using, just do:

$> hsync -v
   /usr/common/homes/k/kisner/software/bin/hsync is version 1.0
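The comparison rules described earlier (check the object type, then the modification time, then use size as a tie-breaker) can be sketched in Python as follows. This is only an illustration under stated assumptions, not hsync's actual Perl code: the Entry class, the newer_side function, and the "type-mismatch" result are hypothetical names, and the text does not say how hsync resolves a file/directory collision or how it treats directories themselves.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical record for one object, local or on HPSS."""
    is_dir: bool
    mtime: int  # modification time, seconds since the epoch
    size: int   # size in bytes


def newer_side(local: Entry, hpss: Entry) -> str:
    """Return 'local', 'hpss', 'same', or 'type-mismatch'."""
    # A file on one side and a directory on the other is a name collision.
    if local.is_dir != hpss.is_dir:
        return "type-mismatch"
    # Assumption: two directories with the same name need no data copied.
    if local.is_dir:
        return "same"
    # The newer modification time wins.
    if local.mtime != hpss.mtime:
        return "local" if local.mtime > hpss.mtime else "hpss"
    # Same timestamp but different sizes: the larger file is taken as correct.
    if local.size != hpss.size:
        return "local" if local.size > hpss.size else "hpss"
    return "same"


print(newer_side(Entry(False, 200, 10), Entry(False, 100, 50)))
print(newer_side(Entry(False, 100, 10), Entry(False, 100, 50)))
```

The first call reports the local file as newer despite its smaller size, because size is only consulted when the timestamps are equal. The second call has equal timestamps, so the larger HPSS copy wins.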

hsync assumes that you are starting with a directory on HPSS that is the "true" version of the data. You can "check out" working copies of the data on multiple local disks. BEWARE: committing changes back to HPSS is up to the user. If you modify a local copy of the same file in multiple checkouts, there is no "merging" of those changes. The first file you commit will overwrite the HPSS copy, and the second file you commit will overwrite the first!

When doing a checkout, you can use either relative or absolute paths for the HPSS location of a directory. If the local path is specified, hsync will try to create a directory with the specified name. If the local path is not specified, it defaults to a directory in the current working directory with the same basename as the HPSS location.
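The rule for choosing the local checkout directory can be illustrated with a short Python sketch. The function name default_checkout_dir and its cwd parameter are hypothetical and exist only for this example; hsync itself is a Perl program and may implement this differently.

```python
import os
import posixpath


def default_checkout_dir(hpss_path, local_path=None, cwd=None):
    """Pick the local directory for a checkout of hpss_path."""
    # An explicitly specified local path is used as given.
    if local_path is not None:
        return local_path
    # Otherwise use the basename of the HPSS location, placed in the
    # current working directory. HPSS paths use "/" separators.
    if cwd is None:
        cwd = os.getcwd()
    name = posixpath.basename(hpss_path.rstrip("/"))
    return os.path.join(cwd, name)


explicit = default_checkout_dir("sync", "/scratch/scratchdirs/user/sync")
implicit = default_checkout_dir("/home/u/user/sync", cwd="/tmp/work")
print(explicit)
print(implicit)
```

With an explicit local path the function returns it unchanged. Without one, a checkout of "/home/u/user/sync" from a working directory of "/tmp/work" lands in a directory named "sync" inside it.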

Checkout

Check out a directory called "sync" in your home directory on HPSS. This directory must already exist, but it can be empty. The data is written locally to /scratch/scratchdirs/user/sync.

$> hsync checkout sync /scratch/scratchdirs/user/sync

Display the Current State

Now go into the previous checkout directory (initially empty in this example) and create some files:

$> cd /scratch/scratchdirs/user/sync
$> touch file1
$> touch file2
$> touch this\ file\ has\ spaces
$> mkdir -p subdir/subsubdir
$> touch subdir/subsubdir/nested_file

Before doing a commit or update, it is always good to check on the current state of the directory to see what changes will occur:

$> hsync status
   hsync:  Looking up root path of HPSS files...      DONE
   hsync:  Retrieving list of HPSS files...           DONE
   hsync:  (0 HPSS files to consider)
   hsync:  Computing differences with local copy...   DONE
   hsync:  (4 local files to consider)
   -+-----------------------------------------------------
      No Changes on Update
   -+-----------------------------------------------------
   C|A /home/u/user/sync/file1
   C|A /home/u/user/sync/file2
   C|A /home/u/user/sync/this file has spaces
   C|A /home/u/user/sync/subdir
   C|A /home/u/user/sync/subdir/subsubdir
   C|A /home/u/user/sync/subdir/subsubdir/nested_file
   -+-----------------------------------------------------

From this listing, we see that running "hsync update" will produce no changes, while running "hsync commit" will add the specified files to HPSS.

Commit

Now that we are satisfied that we know what changes will happen, we can commit the files:

$> hsync commit
   hsync:  Looking up root path of HPSS files...      DONE
   hsync:  Retrieving list of HPSS files...           DONE
   hsync:  (0 HPSS files to consider)
   hsync:  Computing differences with local copy...   DONE
   hsync:  (4 local files to consider)
   put -p ./file1 : /home/u/user/sync/file1
   put  '/scratch/scratchdirs/user/sync/file1' : \
     '/home/u/user/sync/file1' ( 0 bytes, 0.0 KBS (cos=5))
   put -p ./file2 : /home/u/user/sync/file2
   put  '/scratch/scratchdirs/user/sync/file2' : \
     '/home/u/user/sync/file2' ( 0 bytes, 0.0 KBS (cos=5))
   put -p ./this\ file\ has\ spaces : \
     /home/u/user/sync/this\ file\ has\ spaces
   put  '/scratch/scratchdirs/user/sync/this file has spaces' \
     : '/home/u/user/sync/this file has spaces' \
     ( 0 bytes, 0.0 KBS (cos=5))
   mkdir /home/u/user/sync/subdir
   mkdir: /home/u/user/sync/subdir
   mkdir /home/u/user/sync/subdir/subsubdir
   mkdir: /home/u/user/sync/subdir/subsubdir
   put -p ./subdir/subsubdir/nested_file : \
     /home/u/user/sync/subdir/subsubdir/nested_file
   put  '/scratch/scratchdirs/user/sync/subdir/subsubdir/nested_file' \
     : '/home/u/user/sync/subdir/subsubdir/nested_file' \
     ( 0 bytes, 0.0 KBS (cos=5))
   hsync:  HPSS changes complete.

As you can see, the HPSS information is echoed to the terminal, so we can monitor the progress.
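The "put -p" and "mkdir" lines echoed above suggest how a commit could be turned into a list of commands. The Python sketch below reproduces only the command strings visible in that transcript. The function names escape and commit_commands are hypothetical, and the sorting step is an assumption made here so that each directory is created before anything inside it; the transcript shows hsync using a different order.

```python
import posixpath


def escape(path):
    """Backslash-escape spaces, as in the echoed commands above."""
    return path.replace(" ", "\\ ")


def commit_commands(entries, hpss_root):
    """Build command strings for (relative_path, is_dir) entries."""
    cmds = []
    # Sorting puts "subdir" before "subdir/subsubdir" and its contents,
    # because a path always sorts before any longer path it prefixes.
    for rel, is_dir in sorted(entries):
        remote = posixpath.join(hpss_root, rel)
        if is_dir:
            cmds.append("mkdir " + escape(remote))
        else:
            cmds.append("put -p ./%s : %s" % (escape(rel), escape(remote)))
    return cmds


entries = [
    ("file1", False),
    ("file2", False),
    ("this file has spaces", False),
    ("subdir", True),
    ("subdir/subsubdir", True),
    ("subdir/subsubdir/nested_file", False),
]
cmds = commit_commands(entries, "/home/u/user/sync")
for cmd in cmds:
    print(cmd)
```

Running this prints one command per entry, with both "mkdir" lines appearing before the "put -p" for "nested_file".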

Update

Now let us assume that some of these example files have been modified elsewhere (either directly on HPSS or from a commit on another machine). Also, we have deleted the file named "file2" and modified "this file has spaces" in our local checkout. As always, we do a status to see what will happen on an update or commit:

$> hsync status
   hsync:  Looking up root path of HPSS files...      DONE
   hsync:  Retrieving list of HPSS files...           DONE
   hsync:  (5 HPSS files to consider)
   hsync:  Computing differences with local copy...   DONE
   hsync:  (3 local files to consider)
   -+-----------------------------------------------------
   U|A ./file2
   U|A ./file3
   U|M ./subdir/subsubdir/nested_file
   -+-----------------------------------------------------
   C|M /home/u/user/sync/this file has spaces
   -+-----------------------------------------------------

We can see that an "update" will restore the locally deleted "file2", add a new local file called "file3", and overwrite the "nested_file" with a newer version from HPSS. If we do a commit, our updated version of "this file has spaces" will be written to HPSS.

Deleting Files

In the previous steps, the commit and update commands never delete files, either locally or on HPSS. Obviously there are times when we actually do want to delete some files. In those cases, you can specify the "-d" command line option. For example, if we add this option to the previous call to "hsync status", we see what would happen during a commit or update with the -d option:

$> hsync -d status
   hsync:  Looking up root path of HPSS files...      DONE
   hsync:  Retrieving list of HPSS files...           DONE
   hsync:  (5 HPSS files to consider)
   hsync:  Computing differences with local copy...   DONE
   hsync:  (3 local files to consider)
   -+-----------------------------------------------------
   U|A ./file2
   U|A ./file3
   U|M ./subdir/subsubdir/nested_file
   -+-----------------------------------------------------
   C|M /home/u/user/sync/this file has spaces
   C|D /home/u/user/sync/file2
   C|D /home/u/user/sync/file3
   -+-----------------------------------------------------

And we see that if we do an "hsync -d commit", the locally missing files "file2" and "file3" will be deleted on HPSS.
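The status codes and the effect of "-d" can be summarized with a small Python sketch that reproduces the two status listings above from a pair of file tables. This is an illustration inferred from the usage text and the example output, not hsync's own logic. The function name status is hypothetical, the timestamps are made up, and only modification times are compared (the size tie-breaker is omitted for brevity).

```python
def status(local, hpss, delete=False):
    """Return (update_changes, commit_changes) as lists of (code, path).

    local and hpss map a relative path to its modification time.
    """
    update, commit = [], []
    for path in sorted(set(local) | set(hpss)):
        if path not in local:
            # Present only on HPSS: update fetches it, or commit -d removes it.
            update.append(("U|A", path))
            if delete:
                commit.append(("C|D", path))
        elif path not in hpss:
            # Present only locally: commit sends it, or update -d removes it.
            commit.append(("C|A", path))
            if delete:
                update.append(("U|D", path))
        elif hpss[path] > local[path]:
            update.append(("U|M", path))
        elif local[path] > hpss[path]:
            commit.append(("C|M", path))
    return update, commit


# Hypothetical timestamps matching the walkthrough: 3 local, 5 HPSS files.
local = {"file1": 100, "this file has spaces": 300,
         "subdir/subsubdir/nested_file": 100}
hpss = {"file1": 100, "file2": 100, "file3": 200,
        "this file has spaces": 100, "subdir/subsubdir/nested_file": 250}

plain = status(local, hpss)
with_d = status(local, hpss, delete=True)
print(plain)
print(with_d)
```

Without the delete flag, the only commit change is the modified "this file has spaces". With it, "file2" and "file3" are also marked C|D, while the update list is unchanged.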

Subdirectories

Doing an "hsync checkout" recursively creates a file called ".synchsi" in each directory. This file contains the corresponding location on HPSS, which allows hsync to operate on a single subdirectory inside the top-level checkout. This is useful when you are working on only a small subset of the files in a large checkout. If you rename a subdirectory inside the checkout (either locally or on HPSS), hsync treats that as the creation of a new directory and the deletion of the old one. The (now out-of-date) .synchsi file will be updated upon the next commit. To remove the ".synchsi" files from a local checkout completely, simply do:

$> hsync export

but remember: once you do an export, you can no longer use hsync on the directory, and must do a fresh checkout from HPSS.
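The per-directory marker scheme described in this section can be sketched in Python. The on-disk format of ".synchsi" is not documented here, so this example assumes a single line holding the HPSS path. The function names write_markers, hpss_location, and export are hypothetical stand-ins for what checkout, subdirectory lookup, and "hsync export" do.

```python
import os
import posixpath
import tempfile

MARKER = ".synchsi"


def write_markers(local_root, hpss_root):
    """Write a marker into every directory, recording its HPSS location."""
    for dirpath, _dirs, _files in os.walk(local_root):
        rel = os.path.relpath(dirpath, local_root)
        parts = [] if rel == "." else rel.split(os.sep)
        with open(os.path.join(dirpath, MARKER), "w") as fh:
            # Assumed format: one line containing the HPSS path.
            fh.write(posixpath.join(hpss_root, *parts) + "\n")


def hpss_location(local_dir):
    """Return the HPSS path for local_dir, or None if it is not a checkout."""
    marker = os.path.join(local_dir, MARKER)
    if not os.path.isfile(marker):
        return None
    with open(marker) as fh:
        return fh.read().strip()


def export(local_root):
    """Recursively remove the markers, ending the checkout."""
    for dirpath, _dirs, files in os.walk(local_root):
        if MARKER in files:
            os.remove(os.path.join(dirpath, MARKER))


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "subdir", "subsubdir"))
    write_markers(root, "/home/u/user/sync")
    sub = hpss_location(os.path.join(root, "subdir", "subsubdir"))
    export(root)
    after = hpss_location(root)
print(sub, after)
```

Because every directory carries its own marker, a command run inside "subdir/subsubdir" can find its HPSS counterpart without walking up to the top of the checkout. After the export step no marker remains, so the directory is no longer recognized as a checkout.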

Rules of Thumb

There are several points you should always keep in mind:

  • hsync should not be used on directories with many small files.
  • No merging is done on commit/update. Newer files completely replace older ones.
  • If working on the same file in multiple checkouts, commit changes in each local checkout before moving to the next. Do an update before starting work in the new location.
  • If you delete a file and commit the change with the "-d" option, be sure to use "hsync -d update" in any other checkouts where you would like this deletion to be registered.
  • Unlike Subversion or CVS, hsync cannot commit/update specific files; it works only on whole (sub)directories.

Download

The latest release of hsync is version 1.1, which is available for download. The current version is also always installed on the NERSC machines at ~kisner/software/bin/hsync.

Change Log

  • 1.1 : Add "-D" option to hsi ls, so that we can compare full timestamps.
  • 1.0 : Initial Release