SDSC Data Oasis

System Overview

SDSC Data Oasis is an online, high-performance, Lustre-based storage resource with a 4 PB capacity that is available to all users of SDSC Gordon and Trestles. It is designed to meet the needs of data-intensive research by providing easy-to-use, high-capacity, short- to medium-term storage with usable bandwidth on the order of 100 GB/s and latencies far lower than those of near-line and tape-based storage systems. However, it is not an archival system: stored data is single-copy and not backed up.

Data Oasis is divided into several file systems, including local scratch spaces for Trestles and Gordon, as well as a shared, persistent 2 PB Project space available to users with an allocation on either machine. All projects on Trestles or Gordon receive a default allocation of 500 GB.

System Configuration

Data Oasis consists of 64 Object Storage Servers (OSSs), each contributing 72 TB of capacity and connected to two Arista 7508 switches via 2×10GigE links.

On Gordon, Data Oasis Project Storage is mounted via 64 I/O nodes that also act as routers. The Gordon compute nodes are connected to the I/O nodes over QDR InfiniBand, and each I/O node is connected to Data Oasis's Arista 7508 switches via 2×10GigE links.

On Trestles, the compute nodes reach the filesystem via a Mellanox BX5020 bridge that connects the InfiniBand fabric to Data Oasis's Arista switches through twelve 10GigE links.

System Access

Allocations

The default allocation is 500 GB of Project Storage, shared among all users of a project. Projects that require more than 500 GB must request additional space by sending an email to help@xsede.org. This email should come from the project PI and provide a justification of 500 words or less that includes:

  1. Amount of storage being requested
  2. How the storage will be used
  3. How this augments other non-SDSC storage resources available to the project

Project storage requests are reviewed by SDSC staff, and a decision is made within five business days.

Methods of Access

The Data Oasis Project Storage space is mounted on both Gordon and Trestles and can be accessed as a filesystem on all login and compute nodes. Each user's personal space can be found in

/oasis/projects/nsf/allocationname/username

where allocationname is the project's six-character allocation name (found by running the show_accounts command) and username is the user's local login name.
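
For example, assuming a hypothetical allocation name abc123 and login name jdoe:

    $ show_accounts                        # lists your project allocation names
    $ ls /oasis/projects/nsf/abc123/jdoe   # browse your personal Project Storage space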

Since Data Oasis is mounted as a standard filesystem, UNIX file transfer utilities such as scp, sftp, and rsync can be used for transfers of modest size or scale (an example appears below). To enable efficient transfer of larger amounts of data, Data Oasis is also mounted on the SDSC data mover servers:

  • trestles-dm.sdsc.edu (for users with accounts on Trestles)
  • oasis-dm.sdsc.edu (for users with accounts on Gordon)

These data movers can be used in conjunction with Globus Online and GridFTP, which are discussed in the Data Transfer Methods section below and the XSEDE Data Transfers & Management page.
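
As a sketch of a modest transfer, the following copies a single file from a local workstation into Project Storage through a login node (the hostname, allocation name, and username here are placeholders; rsync resumes cleanly if interrupted):

    $ rsync -avP somefile.bin jdoe@gordon.sdsc.edu:/oasis/projects/nsf/abc123/jdoe/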

Transferring Data to/from SDSC Data Oasis

Data Transfer Methods

While the standard UNIX file transfer tools (scp, sftp, rsync) are acceptable for simple, small transfers (< 1 GB) to and from Data Oasis, they cannot realize the maximum performance of the Data Oasis storage resource because of their limited internal buffers and inability to stripe transfers across multiple data mover servers. The preferred method for transferring big data (both large files and large numbers of files) is GridFTP, part of the Globus Toolkit. Keep in mind that transferring large numbers of small files will result in poor performance; whenever possible, create archives of directories with large file counts before initiating transfers (see the example below).
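
For instance, a directory containing thousands of small files can be bundled into a single archive first (the directory and file names here are placeholders):

    $ tar -czf results.tar.gz results/   # one large file moves far more efficiently than many small ones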

The XSEDE Data Transfers & Management page provides a detailed explanation of how to use GridFTP and its associated GUI- and terminal-based tools in XSEDE. To facilitate GridFTP transfers with SDSC Data Oasis, the following data movers have Data Oasis mounted under /oasis/projects/nsf:

  • Trestles: gsiftp://trestles-dm.sdsc.edu:2811/ (XUP File Manager/globus-url-copy) or xsede#trestles (Globus Online)
  • Gordon: gsiftp://oasis-dm.sdsc.edu:2811/ (XUP File Manager/globus-url-copy) or xsede#gordon (Globus Online)

These data movers are load-balanced in a round-robin fashion, but advanced users may wish to access the individual data movers explicitly via trestles-dm1, trestles-dm2, oasis-dm1, oasis-dm2, oasis-dm3, and oasis-dm4.
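
For instance, to push a local file through a specific Gordon data mover rather than the round-robin alias (assuming the individual movers accept the same GridFTP URL form; the paths and names are placeholders):

    $ globus-url-copy -vb file:///tmp/somefile.bin \
        gsiftp://oasis-dm1.sdsc.edu:2811///oasis/projects/nsf/abc123/jdoe/somefile.bin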

Examples

globus-url-copy provides the greatest flexibility for optimizing transfers between XSEDE resources. To transfer a file from another XSEDE resource (e.g., TACC Stampede) to SDSC Gordon, first load the Globus tools and obtain a credential:

    $ module load globus
    $ myproxy-login -l xsedeusername

This loads the GridFTP commands and generates the GSI credential needed to access xsedeusername's accounts across XSEDE resources. Then run the transfer:

    $ globus-url-copy -vb -stripe -tcp-bs 8m -p 4 \
        gsiftp://data1.stampede.tacc.utexas.edu:2811///home1/02255/username/somefile.bin \
        gsiftp://oasis-dm.sdsc.edu:2811///oasis/projects/nsf/allocation/username/somefile.bin

where

  • "-vb" enables verbosity (report transfer rate, among other things)
  • "-stripe" enables striped transfers
  • "-tcp-bs 8m" specifies a 8 megabyte TCP buffer. The optimal value for this will vary; Globus provides a way to estimate the optimal tcp-bs value in its documentation
  • "-p 4" indicates that four parallel data connections should be used

By comparison, the equivalent transfer using scp would be:

    $ scp login1.stampede.tacc.utexas.edu:/home1/02255/username/somefile.bin \
        /oasis/projects/nsf/allocation/username/

In a test transfer of a 341 MB file, GridFTP averaged 171 MB/s while scp achieved only 34.1 MB/s. When transferring terabytes of data, GridFTP is clearly preferable.

Caveats to Users

This resource is based on a Lustre filesystem, which has some limitations. A comprehensive list of Lustre best practices is beyond the scope of this guide, but it is important to minimize unnecessary access of file metadata. For example:

  • avoid performing many small file operations: opens/closes, random reads/writes
  • avoid putting too many (e.g., more than several hundred) files in one directory
  • avoid using "ls -l" unnecessarily, and consider using "ls --color=no -U" when navigating Data Oasis
  • limit unnecessary use of wildcards on the command line
  • avoid using the "find" and "du" commands. Use "lfs find" and "lfs du" instead

The "lfs" command is available by default on Gordon and can be loaded using "module load lustre" on Trestles.

Troubleshooting / Common errors

  • Problem: Attempts to access files on Data Oasis hang, OR access is extremely sluggish/unresponsive
    Solution: This can occur on both login nodes and compute nodes and typically results from Data Oasis being overloaded. These conditions usually clear within a few minutes (see the quick check below); if they persist longer, contact help@xsede.org and name the system (or specific compute nodes) on which this is occurring.
  • Problem: /oasis/projects/nsf exists but is empty
    Solution: This problem is infrequent and should be reported to the XSEDE helpdesk with the system (or specific compute nodes) on which this is occurring.
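
One quick way to tell a hung mount from a merely slow one (timeout is part of GNU coreutils; the 10-second limit is an arbitrary choice):

    $ timeout 10 ls /oasis/projects/nsf >/dev/null && echo responsive || echo "hung or very slow"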

Reference

Policies

SDSC Data Oasis Projects Storage is provided on a per-project basis and is available for the duration of the associated compute allocation period. Data will be retained for three months beyond the end of the project, by which time the data must be migrated elsewhere.

Data Oasis Projects Storage is not subject to automatic purges, but be aware that the data stored there is single-copy and not backed up! Users are responsible for ensuring that critical data are duplicated elsewhere. Data accidentally deleted from Data Oasis cannot be recovered.

Last update: February 27, 2013