Documentation

Using the batch cluster can be complicated, depending on the applications and data sets involved. Much of the usage documentation is interrelated and may need to be read more than once to see how the different pieces of the cluster configuration fit together. Basic familiarity with the Linux command line and some shell scripting is assumed.

  • Slurm user guides
  • SchedMD, the developer of the Slurm scheduling system, has extensive documentation on how to use the batch system. We recommend two documents in particular:

  • Slurm Partitions & Features
  • The Slurm scheduler uses partitions to allocate resources to jobs. The FRCE cluster has several partitions defined, with the major ones summarized below.

  • Visual diagram of FRCE Systems
  • The FRCE cluster is a 3000+ core Linux cluster designed for large numbers of simultaneous jobs ranging from microscopy or sequence analysis to chemical modeling.

  • Connecting to FRCE & File Transfers
  • Access is available only through ssh to batch.ncifcrf.gov, either from within the NIH network or while connected through VPN. There are several client options depending on your laptop's operating system. Some of the more popular ones are listed here.

  • Environment Modules
  • Traditionally, all software packages on a Linux (or UNIX) system would be installed in a single directory tree and every application would be automatically available to all users.

  • Available Applications
  • A significant number of third-party applications are installed on the FRCE cluster. An up-to-date list of applications and versions can be obtained by running the command module avail at the shell prompt.
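
As a minimal sketch of that lookup, the snippet below wraps module avail in a small filter so a single package can be found in a long listing. The helper name list_modules and the search term python are illustrative assumptions, not names taken from the FRCE documentation. The redirect is there because module avail often writes its listing to stderr rather than stdout.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cluster software): list available
# modules whose name contains the given string, case-insensitively.
list_modules() {
    local pattern
    pattern="${1:-}"
    # Outside the cluster the `module` command is usually undefined.
    if ! type module >/dev/null 2>&1; then
        echo "module command not found; run this on the cluster" >&2
        return 1
    fi
    # `module avail` often prints to stderr, so merge it into stdout
    # before filtering.
    module avail 2>&1 | grep -i -- "$pattern"
}

# Illustrative search term; replace with the application you need.
list_modules python || true
```

Calling list_modules with no argument prints the full listing, since an empty pattern matches every line.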

  • Storage on FRCE
  • There are several options for file storage on the FRCE cluster, each with its own advantages and disadvantages. Shares available to all users include

  • Biowulf & FRCE differences
  • Biowulf and FRCE have many features in common and most programs and scripts written on one will run on the other with only minor changes. Both run operating systems based on Red Hat Linux and both use Slurm as the resource and scheduling manager.

  • Scientific Database Support
  • Access to various Oracle and MariaDB database servers is supported from FRCE execute nodes. The HPC admin team does not manage accounts on the database servers but can put users in contact with the DBA personnel.

  • Software Licenses
  • Discussing license details may seem overly picky, but there are significant differences between license types.

  • Compilers and Interpreters
  • Several compilers and interpreters are available on the system. Some are included in the distribution and others have been added and are available through modules.

  • FRCE Users
    • National CryoEM Facility
    • Production support for sequencing labs at the ATRF and Fort Detrick
    • ABCS complex compute initiative
  • Services
  • A range of services is generally available on the FRCE cluster to best support the science performed at the NCI and the FNLCR.

  • Globus
  • Globus is a service that makes it easy to transfer large amounts of data. Globus will manage file transfers in the background using GridFTP and report the status of your data transfer. 

  • vscode IDE
  • Visual Studio Code is a freeware source-code editor made by Microsoft for Windows, Linux and macOS. 

    Features:

  • Jupyter
  • Introduction

    It is recommended to run Jupyter Notebook on a compute node. The first part describes the principles of running Jupyter and tunneling Jupyter traffic over ssh. The second part shows an integrated process using a Slurm script.
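
The tunneling idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure from the full Jupyter page: the helper name tunnel_command, the node name cn001, the port 8888, and the user name jdoe are all placeholders, and it assumes the compute node is reachable from the login host on the chosen port. Only the login host batch.ncifcrf.gov comes from this documentation.

```shell
#!/bin/bash
# Hypothetical helper: print the ssh command that forwards a local port
# to a Jupyter server listening on a compute node, via the login host.
tunnel_command() {
    local node
    local port
    local user
    local login_host
    node="$1"    # compute node running Jupyter (placeholder value below)
    port="$2"    # port Jupyter listens on (placeholder value below)
    user="$3"    # cluster user name (placeholder value below)
    login_host="batch.ncifcrf.gov"   # login host named in this documentation
    # -N: do not run a remote command; -L: forward local port to node:port
    echo "ssh -N -L ${port}:${node}:${port} ${user}@${login_host}"
}

# On the compute node (inside a Slurm job), Jupyter could be started with:
#   jupyter notebook --no-browser --ip="$(hostname)" --port=8888
# On the local machine, run the command printed below, then browse to
# http://localhost:8888
tunnel_command cn001 8888 jdoe
```

Printing the command instead of running it keeps the sketch safe to try anywhere; the real node name would come from the running job, for example from the output of squeue.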