Purdue Anvil User Guide
Last update: June 18, 2021

Introduction

Anvil, funded by the National Science Foundation (NSF) through award OAC-2005632, is a supercomputer built and operated by Purdue through a partnership with Dell and AMD. Anvil integrates a large capacity HPC computing cluster with a comprehensive set of software services for interactive computing to aid users transitioning from familiar desktop to unfamiliar HPC environments.

Anvil features 3rd Generation AMD EPYC processors and is designed for high-throughput, moderate-scale jobs. Anvil will provide fast turnaround for large volumes of work and will complement leadership-class XSEDE systems such as Frontera.

Anvil compute nodes provide high core counts (128 cores/node), as well as improved memory bandwidth and I/O, and will support both traditional HPC and data analytics applications and enable integrative long-tail science workflows. A set of 16 GPU nodes and 32 large memory nodes will enable modern machine learning applications. In addition, a composable sub-system provides the ability to deploy science gateway components, long running processing pipelines, and data analytics applications via a developer-friendly Rancher Kubernetes cluster manager.

Account Administration

As an XSEDE computing resource, Anvil is accessible to XSEDE users who are given time on the system. To obtain an account, users may submit a proposal through the XSEDE Allocation Request System. Interested parties may contact the XSEDE Help Desk for help with an Anvil proposal.

System Architecture

Model: 3rd Gen AMD EPYC™ CPUs (AMD EPYC 7763)
Sockets per node: 2
Cores per socket: 64
Cores per node: 128
Hardware threads per core: 1
Hardware threads per node: 128
Clock rate: 2.45GHz (3.5GHz max boost)
RAM: Regular compute node: 256 GB DDR4-3200
Large memory node: 1 TB DDR4-3200 (32 nodes)
Cache: L1d cache: 32K/core
L1i cache: 32K/core
L2 cache: 512K/core
L3 cache: 32768K/CCD
Local storage: 240GB local disk
Number of Nodes | Processors per Node | Cores per Node | Memory per Node
8 | 3rd Gen AMD EPYC™ 7532 CPU | 128 | 512 GB

Sub-Cluster | Number of Nodes | Processors per Node | Cores per Node | Memory per Node
B | 32 | Two 3rd Gen AMD EPYC™ 7763 CPUs | 128 | 1 TB
C | 16 | Two 3rd Gen AMD EPYC™ 7763 CPUs + Four NVIDIA A100 GPUs | 128 | 512 GB

Network

All nodes, as well as the proposed scratch storage system, will be interconnected by an HDR InfiniBand fabric implemented as an oversubscribed (3:1) two-stage fat tree. The nominal per-node bandwidth is 100 Gbps, with message latency as low as 0.90 microseconds. Nodes will be directly connected to Mellanox QM8790 switches, with 60 HDR100 links down to nodes and 10 links up to spine switches.

Anvil will peer with Purdue's wide-area network routers at 400 Gbps and can utilize Purdue's 200 Gbps connection to the Indiana GigaPOP. From the Indiana GigaPOP, Purdue systems can access several key academic and research networks including Internet2, StarLight, ESNet, and the Big 10 OmniPOP via 10 Gbps and 200 Gbps peering arrangements. Anvil's network monitoring will include a perfSONAR node meshed with others to ensure reliable network transfers.

Accessing the System

As an XSEDE computing resource, Anvil is accessible to XSEDE users who are given an allocation on the system. To obtain an allocation, users may submit a proposal through the XSEDE Allocation Request System. Interested parties may contact the XSEDE Help Desk for help with an Anvil proposal.

Anvil will be accessible via the XSEDE Single Sign-On (SSO) hub; once logged into the hub, users can connect to Anvil with the "gsissh" command. When reporting a login problem to the help desk, please execute "gsissh" with the "-vvv" option and include the verbose output in your problem description.
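
For example, from the XSEDE SSO hub a connection with verbose output might look like the following (the resource alias "anvil" is an assumption; check the SSO hub documentation for the exact name):

[myusername@ssohub ~]$ gsissh -vvv anvil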

Anvil also provides easy access to the system through the Open OnDemand portal. Open OnDemand, developed by the Ohio Supercomputer Center, allows users to interact with HPC resources through a web browser: they can view resource usage, manage files, submit jobs, and interact with graphical applications directly in the browser, with no software to install on their own machines.

Running Jobs

Users familiar with the Linux command line may use standard job submission utilities to manage and run jobs on the Anvil compute nodes.

Accessing the Compute Nodes

Anvil uses the Slurm Workload Manager for job scheduling and management. With Slurm, a user requests resources and submits a job to a queue. The system takes jobs from queues, allocates the necessary compute nodes, and executes them. Users typically SSH to an Anvil login node to access the Slurm job scheduler; however, computationally intensive work should always be submitted to Slurm as a job rather than run directly on a login node. All users share the login nodes, and running anything but the smallest test job on them will negatively impact everyone's ability to use Anvil.

Anvil is designed to serve the moderate-scale computation and data needs of the majority of XSEDE users. Users with allocations can submit to a variety of queues with varying job size and walltime limits. Separate sets of queues are utilized for the CPU, GPU, and large memory nodes. Typically, queues with shorter walltime and smaller job size limits will feature faster turnarounds. Some additional points to be aware of regarding the Anvil queues are:

  • Anvil provides a debug queue for testing and debugging codes.
  • Anvil supports shared-node jobs (more than one job on a single node). Many applications are serial or can only scale to a few cores. Allowing shared nodes improves job throughput, provides higher overall system utilization, and allows more users to run on Anvil.
  • Anvil supports long-running jobs - run times can be extended to four days for jobs using up to 16 full nodes.
  • The maximum allowable job size on Anvil is 7,168 cores. To run larger jobs, submit a consulting ticket to discuss with Anvil support.
  • Shared-node queues will be utilized for managing jobs on the GPU and large memory nodes.

Job Accounting

The charge unit for Anvil is the Service Unit (SU). One SU corresponds to the use of one compute core, with up to approximately 2 GB of memory, for one hour, or the use of one GPU for one hour. Keep in mind that your charges are based on the resources that are tied up by your job and do not necessarily reflect how the resources are used. Charges for jobs submitted to the shared queues are based on the number of cores or the fraction of the node's memory requested, whichever is larger. Jobs submitted as node-exclusive will be charged for all 128 cores, whether the resources are used or not. Jobs submitted to the large memory nodes will be charged 4 SU per compute core (4x the standard node charge). The minimum charge for any job is 1 SU. Filesystem storage is not charged.
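
For example, based on these rates: a shared-queue job that uses 16 cores for 2 hours is charged 16 x 2 = 32 SUs; a shared-queue job that requests only 8 cores but 128 GB of memory (half of a standard node's memory) for 1 hour is charged as 64 cores, i.e. 64 SUs, since the memory fraction is larger; a node-exclusive job running for 3 hours is charged 128 x 3 = 384 SUs; and a job using all 128 cores of a large memory node for 1 hour is charged 128 x 4 = 512 SUs.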

Queues

Anvil provides different queues with varying job size and walltime. There are also limits on the number of jobs queued and running on a per allocation and queue basis. Queues and limits are subject to change based on the evaluation from the Early User Program.

Queue Name | Node Type | Max Nodes per Job | Max Cores (GPUs) per Job | Max Duration | Max Running Jobs in Queue | Charging Factor
debug | regular | 2 nodes | 256 cores | 2 hrs | 1 | 1
gpu-debug | gpu | 1 node | 2 GPUs | 0.5 hrs | 1 | 1
normal | regular | 16 nodes | 2,048 cores | 96 hrs | 50 | 1
wide | regular | 56 nodes | 7,168 cores | 12 hrs | 5 | 1
shared | regular | 1 node | 128 cores | 96 hrs | 4,000 | 1
highmem | large-memory | 1 node | 128 cores | 48 hrs | 2 | 4
gpu | gpu | 2 nodes | 4 GPUs | 48 hrs | 2 | 1

Batch Jobs

Job Submission Script

To submit work to a Slurm queue, you must first create a job submission file. This job submission file is essentially a simple shell script. It will set any required environment variables, load any necessary modules, create or modify files and directories, and run any applications that you need:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

# Loads Matlab and sets the application up
module load matlab

# Change to the directory from which you originally submitted this job.
cd $SLURM_SUBMIT_DIR

# Runs a Matlab script named 'myscript'
matlab -nodisplay -singleCompThread -r myscript

Once your script is prepared, you are ready to submit your job.

Slurm sets several environment variables that are available inside your job at run time:

Name | Description
SLURM_SUBMIT_DIR | Absolute path of the current working directory when you submitted this job
SLURM_JOBID | Job ID number assigned to this job by the batch system
SLURM_JOB_NAME | Job name supplied by the user
SLURM_JOB_NODELIST | Names of nodes assigned to this job
SLURM_SUBMIT_HOST | Hostname of the system where you submitted this job
SLURM_JOB_PARTITION | Name of the original queue to which you submitted this job
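
As a minimal illustration (the file name is arbitrary), the following job submission file simply prints several of these variables into the job's output file:

#!/bin/sh -l
# FILENAME:  printslurmvars

# Print some of the environment variables Slurm sets for this job.
echo "Job $SLURM_JOBID ($SLURM_JOB_NAME) was submitted from $SLURM_SUBMIT_HOST"
echo "Submission directory: $SLURM_SUBMIT_DIR"
echo "Queue (partition):    $SLURM_JOB_PARTITION"
echo "Assigned node(s):     $SLURM_JOB_NODELIST"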

Submitting a Job

Once you have a job submission file, you may submit this script to Slurm using the sbatch command. Slurm will find, or wait for, available resources matching your request and run your job there.

To submit your job to one compute node with one task:

login1$ sbatch --nodes=1 --ntasks=1 myjobsubmissionfile

By default, each job receives 30 minutes of wall time, or clock time. If you know that your job will not need more than a certain amount of time to run, request less than the maximum wall time, as this may allow your job to start sooner. To request 1 hour and 30 minutes of wall time:

login1$ sbatch -t 1:30:00 --nodes=1  --ntasks=1 myjobsubmissionfile

Each compute node in Anvil has 128 processor cores. In some cases, you may want to request multiple nodes. To utilize multiple nodes, you will need a program or code that is specifically written to use multiple nodes, such as with MPI. Simply requesting more nodes will not make your work go faster; your code must be able to utilize all of the requested cores and nodes. To request 2 compute nodes with 256 tasks:

login1$ sbatch --nodes=2 --ntasks=256 myjobsubmissionfile

If more convenient, you may also specify any command line options to sbatch from within your job submission file, using a special form of comment:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

#SBATCH -A myallocation
#SBATCH --nodes=1
#SBATCH --ntasks=1 
#SBATCH --time=1:30:00
#SBATCH --job-name myjobname

# Print the hostname of the compute node on which this job is running.
/bin/hostname

If an option is present in both your job submission file and on the command line, the option on the command line will take precedence.
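
For example, submitting the script above with

login1$ sbatch --time=3:00:00 myjobsubmissionfile

will run the job with a three-hour wall time limit, overriding the #SBATCH --time=1:30:00 directive inside the file.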

After you submit your job with sbatch, it may wait in the queue for minutes, hours, or even days. How long it takes for a job to start depends on the specific queue, the resources and wall time requested, and the other jobs already waiting in that queue. It is impossible to say for sure when any given job will start. For best results, request no more resources than your job requires.

Once your job is submitted, you can monitor the job status, wait for the job to complete, and check the job output.

Checking Job Status

Once a job is submitted, there are several commands you can use to monitor its progress. To see your jobs, use the squeue -u command with your username:
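
login1$ squeue -u myusername

(Replace "myusername" with your own username.)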

To retrieve useful information about your queued or running job, use the scontrol show job command with your job's ID number.
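
For example, for a job with ID 3509:

login1$ scontrol show job 3509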

Checking Job Output

Once a job is submitted, and has started, it will write its standard output and standard error to files that you can read.

Slurm captures output written to standard output and standard error - what would be printed to your screen if you ran your program interactively. Unless you specify otherwise, Slurm will put the output in the directory where you submitted the job, in a file named slurm- followed by the job ID, with the extension out; for example, slurm-3509.out. Note that both stdout and stderr will be written into the same file, unless you specify otherwise.

If your program writes its own output files, those files will be created as defined by the program. This may be in the directory where the program was run, or may be defined in a configuration or input file. You will need to check the documentation for your program for more details.

Redirecting Job Output

It is possible to redirect job output to somewhere other than the default location with the --error and --output directives:

#!/bin/sh -l
#SBATCH --output=/path/myjob.out
#SBATCH --error=/path/myjob.out

# This job prints "Hello World" to output and exits
echo "Hello World"

Interactive Jobs

In addition to the ThinLinc and OnDemand interfaces, users can also choose to run interactive jobs on compute nodes to obtain a shell that they can interact with. This gives users the ability to type commands or use a graphical interface as if they were on a login node.

To submit an interactive job, use "sinteractive" to run a login shell on allocated resources.

The "sinteractive" command accepts most of the same resource requests as sbatch, so to request a login shell in the compute queue while allocating 2 nodes and 256 total cores, you might do:

login1$ sinteractive -N2 -n256 -A oneofyourallocations

Managing and Transferring Files

File Systems

Anvil provides users with separate home, scratch, and project areas for managing files. These will be accessible via the $HOME, $SCRATCH, and $PROJECT environment variables. Each file system is available from all Anvil nodes, but has different purge policies and ideal use cases (see table below). Users in the same allocation will share access to the data in the $PROJECT space, which will be created upon request for each allocation.

$SCRATCH is a high-performance, internally resilient Lustre parallel file system with 10 PB of usable capacity, configured to deliver up to 150 GB/s bandwidth. The $PROJECT file system is provided through Purdue's existing GPFS Research Data Depot system.
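
For example, a typical workflow might run a job out of scratch and then copy the results into the project space to share them with collaborators (the directory and file names below are placeholders):

login1$ cd $SCRATCH
login1$ mkdir -p myproject && cd myproject
login1$ sbatch ~/myjobsubmissionfile
login1$ cp -r results $PROJECT/results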

File System | Quota | Best Use | Purge Policy
$HOME | 25 GB | Area for storing software and scripts, compiling, editing | Not purged
$SCRATCH | 200 TB/user, 2M files | Area for job I/O activity and temporary storage | Files not accessed within 30 days will be purged
$PROJECT | Up to 1 TB/allocation | Area for data shared within a project, common datasets, and software installations | Not purged while the allocation is active; removed 90 days after allocation expiration

Transferring your Files

Anvil supports several methods for file transfer to and from the system. Users can transfer files between Anvil and Linux-based systems or Mac using either scp or rsync. Windows SSH clients typically include scp-based file transfer capabilities.

  • SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH protocol. SCP is available as a protocol choice in some graphical file transfer programs and also as a command-line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files, but will also recursively copy directory contents if given a directory name. Multiple SCP/SFTP programs with graphical user interfaces are available for all operating systems - WinSCP, MobaXterm, FileZilla, and Cyberduck to name a few. (Example scp and rsync commands are shown after this list.)

  • Rsync (Remote Sync) is a free and efficient command-line tool that lets you transfer files and directories to local and remote destinations. It copies only the changes from the source and offers options useful for mirroring, performing backups, or migrating data between different file systems.

  • Globus is a powerful and easy to use file transfer and sharing service for transferring files virtually anywhere. It works between any XSEDE and non-XSEDE sites running Globus, and it connects any of these research systems to personal systems. You may use Globus to connect to your home, scratch, and project storage directories on Anvil. Since Globus is web-based, it works on any operating system that is connected to the internet. The Globus Personal client is available on Windows, Linux, and Mac OS X. It is primarily used as a graphical means of transfer but it can also be used over the command line. More details can be found in Globus documentation at https://docs.globus.org/how-to/
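
As an illustration of the scp and rsync usage described above, run from your local machine (the host name anvil.rcac.purdue.edu is the assumed Anvil login address; the user name and paths are placeholders):

# Copy a single file from your local machine to your Anvil home directory
localhost$ scp mydata.tar.gz myusername@anvil.rcac.purdue.edu:

# Recursively copy a directory to Anvil
localhost$ scp -r mydataset/ myusername@anvil.rcac.purdue.edu:mydataset/

# Mirror a directory to Anvil, transferring only files that have changed
localhost$ rsync -av mydataset/ myusername@anvil.rcac.purdue.edu:mydataset/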

Software

Module System

The Anvil cluster uses Lmod to manage the user environment, so users have access to the necessary software packages and versions to conduct their research activities. The associated module commands can be used to load applications and compilers, making the corresponding libraries and environment variables automatically available in the user environment.

Lmod is a hierarchical module system, meaning a module can only be loaded after loading the necessary compilers and MPI libraries that it depends on. This helps avoid conflicting libraries and dependencies being loaded at the same time. A list of all available modules on the system can be found with the "module spider" command. The "module spider" command can also be used to search for specific module names.

When users log into Anvil, a default compiler (GCC), MPI libraries (OpenMPI), and runtime environments (e.g., CUDA on GPU nodes) are automatically loaded into the user environment. It is recommended that users explicitly specify which modules and which versions are needed to run their codes in their job scripts via the "module load" command. Users are advised not to insert "module load" commands in their bash profiles, as this can cause issues during initialization of certain software (e.g. ThinLinc).
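
For example, to search for available modules and then load a compiler and MPI library (the module versions shown are illustrative; use "module spider" to see the versions actually installed on Anvil):

login1$ module spider openmpi
login1$ module load gcc/10.2.0
login1$ module load openmpi
login1$ module list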

Most modules on Anvil include extensive help messages, so users can take advantage of the "module help" command to find information about a particular application or module. Every module also contains two environment variables named "$RCAC_APPNAME_ROOT" and "$RCAC_APPNAME_VERSION" identifying its installation prefix and its version. Users are encouraged to use generic environment variables such as CC, CXX, FC, MPICC, and MPICXX, available through the compiler and MPI modules, while compiling their code.
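
For example, using NetCDF as a hypothetical application name (the exact variable names simply follow the $RCAC_APPNAME_ROOT / $RCAC_APPNAME_VERSION pattern described above):

login1$ module help netcdf
login1$ module load netcdf
login1$ echo $RCAC_NETCDF_ROOT $RCAC_NETCDF_VERSION
login1$ echo $CC $CXX $FC $MPICC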

Compiling, Performance, and Optimization

Anvil CPU nodes have GNU, Intel, and AOCC (AMD) compilers available, along with multiple MPI implementations (OpenMPI and Intel MPI). Anvil GPU nodes will also provide the PGI compiler. Users may want to note the following AMD Milan specific optimization options that can help improve the performance of their code on Anvil:

  • The majority of the applications on Anvil will be built using gcc/10.2.0 which features an AMD Milan specific optimization flag (-march=znver2).
  • AMD Milan CPUs support the Advanced Vector Extensions 2 (AVX2) vector instructions set. GNU, Intel, and AOCC compilers all have flags to support AVX2. Using AVX2, up to eight floating point operations can be executed per cycle per core, potentially doubling the performance relative to non-AVX2 processors running at the same clock speed.
  • In order to enable AVX2 support, when compiling your code, use the -march=znver2 flag (for GCC 10.2, Clang, and AOCC compilers) or -march=core-avx2 (for Intel compilers and GCC prior to 9.3). Sample compile lines are shown after this list.
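
As a sketch of how these flags might be combined (the source and output file names are placeholders):

# GCC with AVX2 optimizations for the AMD Milan nodes
login1$ gcc -O3 -march=znver2 mycode.c -o mycode

# The same build through the MPI compiler wrapper from the loaded MPI module
login1$ mpicc -O3 -march=znver2 my_mpi_code.c -o my_mpi_code

# Intel compiler equivalent
login1$ icc -O3 -march=core-avx2 mycode.c -o mycode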

Other Software Usage Notes

  • Use the same environment to run your executables that you used to compile them. When switching between compilers for different applications, make sure that you load the appropriate modules before running your executables.
  • Explicitly set the optimization level in your makefiles or compilation scripts. Most well-written codes can safely use the highest optimization level (-O3), but many compilers default to lower levels (e.g. GNU compilers default to -O0, which turns off all optimizations).
  • Turn off debugging, profiling, and bounds checking when building executables intended for production runs as these can seriously impact performance. These options are all disabled by default. The flag used for bounds checking is compiler dependent, but the debugging (-g) and profiling (-pg) flags tend to be the same for all major compilers.
  • Some compiler options are the same for all available compilers on Anvil (e.g. "-o"), while others are different. Many options are available in one compiler suite but not the other. For example, Intel, PGI, and GNU compilers use the -qopenmp, -mp, and -fopenmp flags, respectively, for building OpenMP applications.
  • MPI compiler wrappers (e.g. mpicc, mpif90) all call the appropriate compilers and load the correct MPI libraries depending on the loaded modules. While the same names may be used for different compilers, keep in mind that these are completely independent scripts.

For Python users, Anvil provides two Python distributions:

  1. a natively compiled Python module with a small subset of essential numerical libraries which are optimized for the AMD Milan architecture and
  2. binaries distributed through Anaconda. Users are recommended to use virtual environments for installing and using additional Python packages (see the example below).
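
For example, a private environment on top of the natively compiled Python module might be set up as follows (the module name "python", the environment path, and the package name are placeholders; Anaconda users can create conda environments instead):

login1$ module load python
login1$ python -m venv ~/my-venv
login1$ source ~/my-venv/bin/activate
(my-venv) login1$ pip install mypackage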

A broad range of application modules from various science and engineering domains will be installed on Anvil, including mathematics and statistical modeling tools, visualization software, computational fluid dynamics codes, molecular modeling packages, and debugging tools. Some of these packages are listed in the table below:

Domain | Application Packages and Libraries
Biology/bioinformatics | Bioconductor, Samtools, FastQC, Bowtie, BLAST, Trimmomatic, Blat, Tophat, HMMER, Abyss
Chemistry | NWChem, GPAW, NAMD, LAMMPS, GROMACS, Amber
Computational Fluid Dynamics | OpenFOAM
Material Modeling | VASP, Quantum Espresso
Environmental Sciences | GDAL, GMT
Math and Statistics | MATLAB, Octave, R
Machine Learning | TensorFlow, PyTorch, OpenCV, Caffe, Keras, Scikit-learn
Visualization | Paraview
I/O libraries | HDF5, NetCDF
Math libraries | Intel MKL, OpenBLAS, FFTW, PETSc, AOCL
Profiling and Debugging | Intel vTune, AMD uProf, Nvidia NVprof, HPCToolkit, Totalview

In addition, Singularity will be supported on Anvil, and NVIDIA GPU Cloud (NGC) containers will be available on Anvil GPU nodes.
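
As a rough sketch of what using an NGC container could look like on a GPU node (the container URI and tag are illustrative only; consult the NGC catalog and Anvil documentation for supported images and the recommended workflow):

login1$ singularity pull tensorflow.sif docker://nvcr.io/nvidia/tensorflow:21.06-tf2-py3
login1$ singularity exec --nv tensorflow.sif python my_training_script.py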

Helpful Tips

We will strive to ensure that Anvil serves as a valuable resource to the national research community. We hope that you, the user, will assist us by making note of the following:

  • You share Anvil with thousands of other users, and what you do on the system affects others. Exercise good citizenship to ensure that your activity does not adversely impact the system and the research community with whom you share it. For instance: do not run jobs on the login nodes and do not stress the file system.

  • Help us serve you better by filing informative help desk tickets. Before submitting a help desk ticket, check what the user guide and other documentation say. Search the internet for key phrases in your error logs; that's probably what the consultants answering your ticket are going to do. What have you changed since the last time your job succeeded?

  • Describe your issue as precisely and completely as you can: what you did, what happened, verbatim error messages, other meaningful output. When appropriate, include the information a consultant would need to find your artifacts and understand your workflow: e.g. the directory containing your build and/or job script; the modules you were using; relevant job numbers; and recent changes in your workflow that could affect or explain the behavior you're observing.

  • Have realistic expectations. Consultants can address system issues and answer questions about Anvil. But they can't teach parallel programming in a ticket, and may know nothing about the package you downloaded. They may offer general advice that will help you build, debug, optimize, or modify your code, but you shouldn't expect them to do these things for you.

  • Be patient. It may take a business day for a consultant to get back to you, especially if your issue is complex. It might take an exchange or two before you and the consultant are on the same page. If the admins disable your account, it's not punitive: when the file system is in danger of crashing or a login node hangs, they don't have time to notify you before taking action.