HPC system: Data Transfer & Access

Transfer

The login nodes (i.e., teton.arcc.uwyo.edu) support SCP, SFTP, and rsync. If you frequently make small transfers, consider SSH multiplexing, which lets new sessions reuse an existing SSH connection. For large files (>5 GB), use the Globus GridFTP software instead, which offers many advantages for large transfers.
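As a minimal sketch of the small-transfer case, the helpers below wrap scp and rsync with the standard OpenSSH connection-sharing options (ControlMaster, ControlPath, ControlPersist). The function names, the `username` placeholder, the 10-minute persistence window, and the `DRY_RUN` switch are illustrative assumptions, not part of the ARCC setup.

```shell
# Hypothetical placeholder: set REMOTE_USER to your own cluster user name.
REMOTE_USER="${REMOTE_USER:-username}"
REMOTE_HOST="teton.arcc.uwyo.edu"

# Standard OpenSSH client options: the first connection becomes a master,
# and later ssh/scp/rsync calls reuse it for up to 10 minutes (assumed value).
SSH_MUX_OPTS="-o ControlMaster=auto -o ControlPath=$HOME/.ssh/cm-%r@%h:%p -o ControlPersist=10m"

# teton_scp LOCAL_FILE REMOTE_PATH: copy one file over the shared connection.
# With DRY_RUN=1 the command is printed instead of executed.
teton_scp() {
    ${DRY_RUN:+echo} scp $SSH_MUX_OPTS "$1" "$REMOTE_USER@$REMOTE_HOST:$2"
}

# teton_rsync LOCAL_PATH REMOTE_PATH: archive-mode sync that keeps partially
# transferred files so an interrupted transfer can resume.
teton_rsync() {
    ${DRY_RUN:+echo} rsync -av --partial -e "ssh $SSH_MUX_OPTS" "$1" "$REMOTE_USER@$REMOTE_HOST:$2"
}
```

For example, `DRY_RUN=1; teton_rsync ./results /gscratch/<user id>/` prints the rsync command that would be run; the destination path here is only an example.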

Teton Cluster Filesystem

The Teton filesystem is available on all cluster nodes via a native protocol that uses the high-speed InfiniBand network. To reach the data from outside the cluster but still on the UW campus, ARCC also supports access via SMB/CIFS from Windows, macOS, and Linux. Please see the documentation on accessing the filesystem.

SMB / CIFS Access

You can connect to the Teton storage over SMB with your UW credentials, provided the connecting machine is physically on campus.

Operating System    Path
Windows             \\teton-cifs.arcc.uwyo.edu\
macOS               smb://teton-cifs.arcc.uwyo.edu/
Linux               smb://teton-cifs.arcc.uwyo.edu/

There are shares available for /home, /project, and /gscratch.

On Red Hat Enterprise Linux 7 and Ubuntu 16.04 or later, the Samba configuration must be modified: edit the [global] section of /etc/samba/smb.conf and add:

 client max protocol = SMB3 
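On Linux, the same protocol cap can also be set per command with smbclient's `-m` (max protocol) option. The sketch below lists the shares the server offers and opens one of them. The function names, the `username` placeholder, and the `DRY_RUN` switch are illustrative; the share name passed to `teton_open_share` is assumed to match one of the shares listed by the server.

```shell
# Server name as listed above; SMB_USER is a placeholder for your UW user name.
TETON_SMB_HOST="teton-cifs.arcc.uwyo.edu"
SMB_USER="${SMB_USER:-username}"

# List the shares the server offers, capping the protocol at SMB3.
# With DRY_RUN=1 the command is printed instead of executed.
teton_list_shares() {
    ${DRY_RUN:+echo} smbclient -L "$TETON_SMB_HOST" -U "$SMB_USER" -m SMB3
}

# teton_open_share SHARE: open an interactive smbclient session on one share.
# SHARE is assumed to be a name reported by teton_list_shares.
teton_open_share() {
    ${DRY_RUN:+echo} smbclient "//$TETON_SMB_HOST/$1" -U "$SMB_USER" -m SMB3
}
```

Running `teton_list_shares` first avoids guessing share names; smbclient prompts for the password interactively.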

NFS Access

NFS access to the Teton cluster filesystem is available only to ARCC-administered systems that reside within the data center. NFS is not provided to user computers or to workstations that IT administers.

ARCC Bighorn (Mt Moran) Filesystem

The ARCC Bighorn filesystem, which has provided the storage for Mount Moran for close to five years, is available on the Teton login nodes. It is mounted read-only to prevent data from being pushed from the Teton cluster filesystems to Bighorn. Once the Mount Moran hardware is migrated to Teton, the Bighorn storage system will be repurposed and all data on it will be removed. Any data that must be kept for extended periods should be migrated to the ARCC petaLibrary, and any data needed for future computing should be migrated to the Teton filesystems.

  • Bighorn home - /bighorn/home
  • Bighorn project - /bighorn/project
  • Bighorn scratch - /bighorn/scratch

You can therefore use the standard "cp" command to copy a file from Bighorn to Teton, e.g.

cp /bighorn/home/<user id>/<some file> /home/<user id>

To copy a directory and all of its sub-directories, use:

cp -R /bighorn/home/<user id>/<some directory> /home/<user id>

As mentioned above, these filesystems are available only on the Teton login nodes and are read-only from Teton. They are not expected to be available beyond October 2018.
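Because the Bighorn data will eventually be removed, it may be worth confirming that a copy is complete before relying on it. The helper below is a minimal sketch: `copy_and_verify` is a hypothetical name, and it simply combines cp with a recursive diff of the source against the copy.

```shell
# copy_and_verify SRC DEST_DIR
# Copy SRC (a file or directory, given without a trailing slash) into the
# existing directory DEST_DIR, preserving timestamps and modes, then compare
# the copy with the source. Returns non-zero if the copy or comparison fails.
copy_and_verify() {
    src=$1
    dest_dir=$2
    cp -Rp "$src" "$dest_dir"/ || return 1
    diff -r "$src" "$dest_dir/$(basename "$src")" >/dev/null
}

# Example (placeholders as in the cp commands above):
#   copy_and_verify "/bighorn/project/<some directory>" "/project/<some directory>" \
#       && echo "copy verified"
```

The diff reads every file on both sides a second time, so for very large trees this check can take about as long as the copy itself.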

ARCC petaLibrary Filesystem

The ARCC petaLibrary is available from the Teton login nodes. It is a capacity-oriented system, so the preference is not to compute directly against it. Instead, move data to the Teton cluster filesystem, compute there, perform any data reductions and compression, and then transfer the final results to the appropriate petaLibrary share. PetaLibrary allocations are by contract only; a Teton storage allocation does not include a petaLibrary allocation in either the Homes space or the Commons space. Please see petaLibrary for more information.
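The stage-in, compute, stage-out pattern described above could be sketched as two small helpers, run on a Teton login node. The function names and every path in the usage comments are hypothetical placeholders; only the general workflow comes from the text.

```shell
# stage_in SRC DEST_DIR
# Copy input data from capacity storage into a working directory on the
# cluster filesystem, creating the working directory if needed.
stage_in() {
    mkdir -p "$2" && cp -R "$1" "$2"/
}

# stage_out RESULTS_DIR DEST_DIR
# Compress a results directory into a single .tar.gz archive written to
# DEST_DIR, so only the reduced, compressed output lands on the share.
stage_out() {
    name=$(basename "$1")
    tar -czf "$2/$name.tar.gz" -C "$(dirname "$1")" "$name"
}

# Example with placeholder paths:
#   stage_in  "/petalibrary/Commons/<share>/input" "/gscratch/<user id>/work"
#   ... run the computation against /gscratch/<user id>/work ...
#   stage_out "/gscratch/<user id>/work/results" "/petalibrary/Commons/<share>"
```

Writing one archive instead of many small files also keeps the number of objects on the petaLibrary share low.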

Do not create symbolic links from the Teton storage to the petaLibrary or vice versa, as they will be invalid anywhere except on the Teton login nodes. The petaLibrary has two access points:

  • Homes - /petalibrary/homes
  • Commons - /petalibrary/Commons
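To check for existing symbolic links that point into these locations, something like the following could be used. It is a sketch only: `find_petalibrary_links` is a hypothetical name, it relies on the `-lname` test available in GNU and BSD find, and it only matches links whose stored target is an absolute path under /petalibrary (it does not cover links in the other direction).

```shell
# find_petalibrary_links DIR
# Print every symbolic link under DIR whose target path starts with
# /petalibrary/. Dangling links are reported too, since -lname matches
# the stored target text rather than following the link.
find_petalibrary_links() {
    find "$1" -type l -lname '/petalibrary/*'
}

# Example with a placeholder path:
#   find_petalibrary_links "/home/<user id>"
```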

The Teton cluster uses NFSv4 ACLs to support data movement to and from the ARCC petaLibrary. Note that the chown and chmod commands must not be run on these locations, to avoid losing the Windows ACLs that grant access to others.