HPC system: Teton

The Teton HPC cluster is the follow-on to Mount Moran. Teton contains several new compute nodes, and all Mount Moran nodes will be reprovisioned within the Teton HPC cluster. The system is available by SSH using the hostname teton.arcc.uwyo.edu or teton.uwyo.edu. We ask that everybody who uses ARCC resources cite them accordingly; see Citing Teton. Newcomers to research computing should also consider reading the Research Computing Quick Reference.

Overview

Teton is an Intel x86_64 cluster interconnected with Mellanox InfiniBand and has a 1.3 PB IBM Spectrum Scale global parallel filesystem available across all nodes. The system requires UWYO two-factor authentication for login via SSH. The default shell is bash, and the Lmod modules system provides dynamic user environments so that software stacks can be switched rapidly and easily. The Slurm workload manager is employed to schedule jobs, provide submission limits, implement fairshare, and provide Quality of Service (QoS) levels for research groups who have invested in the cluster. Teton has a Digital Object Identifier (DOI) (https://doi.org/10.15786/M2FY47), and we request that all use of Teton appropriately acknowledges the system. Please see Citing Teton for more information.

Available Nodes

Type Series Arch Count Sockets Cores Threads / Core Clock (GHz) RAM (GB) GPU Type GPU Count Local Disk Type Local Disk Capacity (GB) IB Network Operating System
Teton Regular Intel Broadwell x86_64 168 2 32 1 2.1 128 N/A N/A SSD 240 EDR RHEL 7.4
Teton BigMem GPU Intel Broadwell x86_64 8 2 32 1 2.1 512 NVIDIA P100 16G 2 SSD 240 EDR RHEL 7.4
Teton HugeMem Intel Broadwell x86_64 8 2 32 1 2.1 1024 N/A N/A SSD 240 EDR RHEL 7.4
Teton KNL Intel Knights Landing x86_64 12 1 72 4 1.5 384 + 16 N/A N/A SSD 240 EDR RHEL 7.4
Teton DGX Intel Broadwell x86_64 1 2 40 2 2.2 512 NVIDIA V100 32G 8 SSD 7168 EDR Ubuntu 16.04 LTS

See Partitions for information regarding Slurm Partitions on Teton.

Global Filesystems

The Teton cluster filesystem is configured with an ~160 TB SSD tier for active data and a 1.2 PB HDD capacity tier. The system policy engine moves data automatically between pools, migrating data to HDD when the SSD tier reaches 70% used capacity. Teton provides several file spaces for users, described below.

  • home - /home/username ($HOME)
- Space for configuration files and software installations. This file space is intended to be small and always resides on SSDs. The /home file space is snapshotted to recover from accidental deletions.
  • project - /project/project_name/[username]
- Space to collaborate among project members. Data here is persistent and is exempt from purge policy.
  • gscratch - /gscratch/username ($SCRATCH)
- Space to perform computing for individual users. Data here is subject to the purge policy defined below; warning emails will be sent before any deletions occur. No snapshots.
Global Filesystems
Filesystem Quota (GB) Snapshots Backups Purge Policy Additional Info
home 25 Yes No No Always on SSD
project 1024 No No No Aging data will move to HDD
gscratch 5120 No No Yes Aging data will move to HDD

Purge Policy - File spaces within the Teton cluster filesystem may be subject to a purge policy. The policy has not yet been defined; however, ARCC reserves the right to purge data in this area after 30 to 90 days of no access or from creation time. Before an actual purge event is performed, the owner of the file(s) will be notified by email several times about files that are subject to being purged.

Special Filesystems

Certain nodes of the cluster have additional filesystems to meet specialized requirements. The table below summarizes these specialty filesystems.

Specialty Filesystems
Filesystem Mount Location Notes
petaLibrary /petaLibrary/homes Only on login nodes
/petaLibrary/Commons Only on login nodes
Bighorn /bighorn/home Only on login nodes, read-only
/bighorn/project Only on login nodes, read-only
/bighorn/gscratch Only on login nodes, read-only
node local scratch /lscratch Only on compute nodes; Moran is 1 TB HDD; Teton is 240 GB SSD
memory filesystem /dev/shm RAM-based tmpfs for very rapid I/O operations; capacity is carved out of node RAM and is small

The node local scratch or lscratch filesystem is purged at the end of each job.

The memory filesystem can significantly enhance the performance of small I/O operations. If you have localized single-node I/O jobs with very intensive random access patterns, this filesystem may improve the performance of your compute job.
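
For example, a single-node job could stage its working files in /dev/shm for the duration of the run. This is only a sketch; the directory layout, input and output names, and the program are placeholders, and anything left in /dev/shm occupies RAM and does not persist past the job:

mkdir -p /dev/shm/$SLURM_JOB_ID                         # per-job working directory in RAM (placeholder layout)
cp $SLURM_SUBMIT_DIR/input.dat /dev/shm/$SLURM_JOB_ID/  # stage the input into memory
./my_program /dev/shm/$SLURM_JOB_ID/input.dat           # placeholder program doing random I/O
cp /dev/shm/$SLURM_JOB_ID/output.dat $SLURM_SUBMIT_DIR/ # copy results back to persistent storage
rm -rf /dev/shm/$SLURM_JOB_ID                           # free the RAM when finished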

The petaLibrary filesystems are only available from the login nodes, not on the compute nodes. A storage space on the Teton global filesystems does not imply storage space on the ARCC petaLibrary or vice versa. For more information, please see the petaLibrary page.

System Access

SSH Access

Teton has two login nodes for users to access the Teton cluster. The login nodes are available publicly using the hostname teton.arcc.uwyo.edu or teton.uwyo.edu. SSH can be used natively on MacOS or Linux operating systems from the terminal with the ssh command. Although X11 forwarding is supported, we recommend using FastX if at all possible when you need graphical support. Additionally, you may want to configure your OpenSSH client to support connection multiplexing if you require multiple terminal sessions. For those instances where you have unreliable network connectivity, you may want to use either tmux or screen once you log in to keep sessions alive during disconnects; this will allow you to reconnect to these sessions later.

ssh USERNAME@teton.arcc.uwyo.edu
ssh -l USERNAME teton.arcc.uwyo.edu
ssh -Y -l USERNAME teton.arcc.uwyo.edu                          # For secure forwarding of X11 displays
ssh -X -l USERNAME teton.arcc.uwyo.edu                          # For forwarding of X11 displays
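
If your connection is unreliable, a typical tmux workflow on a login node looks like the following sketch (the session name is arbitrary):

tmux new -s work                                        # start a named session on the login node
# ... run commands, then detach with Ctrl-b d or simply lose the connection ...
tmux attach -t work                                     # reattach to the same session after logging back in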

OpenSSH Configuration File (BSD,Linux,MacOS)

By default, the OpenSSH user configuration file is $HOME/.ssh/config, which can be edited to enhance your workflow. Since Teton uses round-robin DNS to provide access to two login nodes and requires two-factor authentication, it can be advantageous to add SSH multiplexing to your local environment to make sure subsequent connections are made to the same login node. This also provides a way to shorten the hostname and simplifies access for SCP/SFTP/Rsync. An example entry looks like the following, where USERNAME would be replaced by your actual UWYO username:

Host teton
  Hostname teton.arcc.uwyo.edu
  User USERNAME
  ControlMaster auto
  ControlPath ~/.ssh/ssh-%r@%h:%p
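
With this entry in place, subsequent connections reuse the first authenticated session, for example (the project name below is a placeholder):

ssh teton                                               # first connection opens the control master (2FA prompt)
scp results.tar.gz teton:/project/PROJECT_NAME/         # reuses the master connection for the transfer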

WARNING: While ARCC allows SSH multiplexing, other research computing sites may not. Do not assume this will always work on systems not administered by ARCC.

Access from Microsoft Windows

ARCC currently recommends that users install MobaXterm to access the Teton cluster. It provides appropriate access to the system with SSH and SFTP capability, allowing X11 if required. The home version of MobaXterm should be sufficient. There is also PuTTY if a more minimal application is desired.

Additional options include a Cygwin installation with SSH installed, or, on very recent versions of Windows, the Windows Subsystem for Linux with an OpenSSH client installed or enabling the built-in OpenSSH client. Finally, a great alternative is to use our FastX capability.

FastX Access

If you are currently on the UW campus, you can also leverage FastX for a more robust remote graphics capability, either through a web browser or via an installable client for Windows, MacOS, or Linux. Navigate to https://fastx.arcc.uwyo.edu and log in with your 2FA credentials. The native FastX clients for Windows, MacOS, and Linux can be downloaded here. For more information, see the documentation on using FastX.

Available Shells

Teton has several shells available for use. The default is bash. To change your default shell, please submit a request through the standard ARCC request methods.

Shell Path Version Notes
bash /bin/bash 4.2.46 Recommended
zsh /bin/zsh 5.0.2
csh /bin/csh 6.18.01 Implemented by TCSH
tcsh /bin/tcsh 6.18.01

Data Transfer & Access

The login nodes (i.e., teton.arcc.uwyo.edu) support the use of SCP, SFTP, and Rsync. SSH multiplexing may be of interest if you frequently make small transfers alongside existing SSH connections. For large files (>5 GB), the Globus GridFTP software should be used to move data; it offers many advantages for transferring large datasets.
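
As a quick sketch (usernames, project names, and filenames are placeholders), typical command-line transfers from a local machine look like:

scp results.tar.gz USERNAME@teton.arcc.uwyo.edu:/project/PROJECT_NAME/USERNAME/   # single file copy
rsync -av data/ USERNAME@teton.arcc.uwyo.edu:/gscratch/USERNAME/data/             # incremental directory sync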

Teton Cluster Filesystem

The Teton filesystem is available on all cluster nodes via a native protocol that leverages the high-speed InfiniBand network. Alternatively, ARCC supports accessing the filesystem via SMB/CIFS from Windows, MacOS, and Linux when you need to access the data from outside the cluster but still on the UW campus. Please see the documentation on accessing the filesystem.

SMB / CIFS Access

One can use SMB to connect to the Teton storage using UW credentials, provided that the connection originates on campus.

Operating System Path
Windows \\teton-cifs.arcc.uwyo.edu\
MacOS smb://teton-cifs.arcc.uwyo.edu/
Linux smb://teton-cifs.arcc.uwyo.edu/

There are shares available for /home, /project, and /gscratch.

On Red Hat Enterprise Linux 7 and Ubuntu 16.04 or later, the Samba configuration file /etc/samba/smb.conf must be modified by adding the following to the [global] section:

 client max protocol = SMB3 
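
On Linux, the share can also be mounted from the command line. The following is only a sketch; the share name, mount point, and domain are assumptions that may need to be adjusted for your environment:

sudo mount -t cifs //teton-cifs.arcc.uwyo.edu/gscratch /mnt/teton-gscratch \
     -o username=USERNAME,domain=UWYO,vers=3.0          # prompts for your UW password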

NFS Access

NFS access to the Teton cluster filesystem is available only to ARCC administered systems that reside within the data center. NFS is not provided to user computers or workstations which IT administers.

ARCC Bighorn (Mt Moran) Filesystem

The ARCC Bighorn filesystem, which has delivered the storage for Mount Moran for close to 5 years, is available on the Teton login nodes. The filesystem is mounted read-only to avoid any pushing of data from the Teton cluster filesystems to Bighorn. Once the Mount Moran hardware is migrated to Teton, the Bighorn storage system will be repurposed and all data will be removed from the system. It is recommended that any data that needs to be kept for extended periods of time or for future computing be migrated to the ARCC petaLibrary or the Teton filesystems, respectively.

  • Bighorn home - /bighorn/home
  • Bighorn project - /bighorn/project
  • Bighorn scratch - /bighorn/scratch

As mentioned above, these filesystems are only available on the Teton login nodes and are read-only from Teton. These filesystems are not expected to be available beyond October 2018. An example migration command is shown below.
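
For example, a migration run from a Teton login node might look like the following sketch (the project name and destination directory are placeholders; the destination could equally be a petaLibrary share):

rsync -av /bighorn/project/PROJECT_NAME/ /project/PROJECT_NAME/bighorn-archive/    # copy read-only Bighorn data into Teton project space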

ARCC petaLibrary Filesystem

The ARCC petaLibrary is available from the Teton login nodes. The petaLibrary is designed for capacity, and the preference is not to compute against that filesystem; instead, move data to the Teton cluster filesystem locations, compute, perform any data reductions and compressions, and transfer the final results to the appropriate petaLibrary share. petaLibrary allocations are by contract only; a Teton storage space does not provide a petaLibrary allocation in either the homes space or the Commons space. Please see petaLibrary for more information.

Do not make symbolic links on the Teton storage to the petaLibrary or vice versa, as they will be invalid anywhere except on the Teton login nodes. There are two accessible points of the petaLibrary:

  • Homes - /petalibrary/homes
  • Commons - /petalibrary/Commons

The Teton cluster leverages NFSv4 ACLs to support data movement to and from the ARCC petaLibrary. It is important to note that the commands chown and chmod must not be issued on these locations, to avoid losing the Windows ACLs that grant access to others.

Job Scheduling Slurm

The Teton cluster uses the Slurm Workload Manager to schedule jobs, control resource access, provide fairshare, implement preemption, and provide record keeping. All compute activity should be performed from within a Slurm resource allocation (i.e., a job). Teton is a condominium resource, and as such investors have priority on the resources they have invested in. This is implemented through preemption, and jobs not associated with the investment can be requeued when the investor submits jobs. However, if an investor chooses not to use preemption on their resources, ARCC can disable it and offer next-in-line access instead.

There are also default concurrent limits in place to prevent individual project accounts and users from saturating the cluster at the expense of others. The default limits are listed below. To incentivize investments into the condo system, investors will have their limits increased.

The system also leverages a fairshare mechanism that gives projects that run jobs only occasionally priority over those that continuously run jobs on the system. To incentivize investments into the condo system, investors will have their fairshare value increased as well.

Finally, individual jobs incur runtime limits based on a study performed around 2014; the maximum walltime for a compute job is 7 days. ARCC is currently evaluating whether these orthogonal limits on CPU count and walltime are the optimal operational mode, and is considering concurrent usage limits based on a relational combination of CPU count, memory, and walltime that would allow more flexibility for different areas of science. There will likely still be an upper limit on individual compute job walltime, as ARCC will not allow unlimited walltime, in part because of possible hardware faults.

Required Inputs and Default Values and Limits

There are some default limits set for Slurm jobs. By default, the following are required for submission:

  1. Walltime limit
    (--time=[days-hours:mins:secs])
  2. Project account
    (--account=account)

Default Values

Additionally, the default submission has the following characteristics:

  • nodes - one node (-N 1, --nodes=1)
  • task count - one task (-n 1, --ntasks=1)
  • memory amount - 1000 MB RAM per CPU (--mem-per-cpu=1000)

These can be changed by requesting different allocation schemes with the appropriate flags; please reference our Slurm documentation. An example batch script satisfying the required inputs is shown below.
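
The following is a minimal sketch only; the account name and the program being run are placeholders, and the optional directives simply make the defaults explicit:

#!/bin/bash
#SBATCH --account=PROJECT_NAME                          # required: project account (placeholder)
#SBATCH --time=0-01:00:00                               # required: walltime limit (here 1 hour)
#SBATCH --nodes=1                                       # optional: default is one node
#SBATCH --ntasks=1                                      # optional: default is one task
#SBATCH --mem-per-cpu=1000                              # optional: default is 1000 MB per CPU

srun ./my_program                                       # placeholder executable

Submit the script with sbatch and monitor it with squeue -u $USER.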

Default Limits

On Mount Moran, the default limits were represented by the number of cores concurrently used by each project account, and investors received an increased concurrent core usage capability. To facilitate more flexible scheduling for all research groups, ARCC is looking at implementing limits based on concurrent usage of cores, memory, and walltime of jobs. This will be defined in the near future and will be subject to FAC review.

Partitions

The Slurm configuration on Teton is fairly complex in order to accommodate the layout of hardware, investors, and runtime limits. The following tables represent the partitions on Teton. Some require a QoS, which will be auto-assigned during job submission. The tables represent Slurm allocatable units rather than hardware units.

Teton General Slurm Partitions
Partition Max Walltime Node Cnt Core Cnt Thds / Core CPUS Mem (MB) / Node Req'd QoS
teton 7-00:00:00 180 5760 1 5760 128000 N/A
teton-gpu 7-00:00:00 8 256 1 256 512000 N/A
teton-hugemem 7-00:00:00 8 256 1 256 1024000 N/A
teton-knl 7-00:00:00 12 864 4 3456 384000 N/A
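
The standard Slurm commands can be used to inspect the partitions and target a specific one; for example:

sinfo --summarize                                       # list partitions with node counts and states
squeue --partition=teton-gpu                            # show jobs in the teton-gpu partition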

Investor Partitions

Investor partitions are likely to be quite heterogeneous and may contain a mix of hardware, as indicated below where appropriate. They require a special QoS for access.

Teton Investor Slurm Partitions
Partition Max Walltime Node Cnt Core Cnt Thds / Core Mem (MB) / Node Req'd QoS Preemption Owner
t-inv-microbiome 7-00:00:00 88 2816 1 128000 TODO Disabled EPSCoR

Special Partitions

Special partitions require access to be given directly to user accounts or project accounts and likely require additional approval for access.

Partition Max Walltime Node Cnt Core Cnt Thds / Core Mem (MB) / Node Owner Notes
dgx 7-00:00:00 1 40 2 512000 EvolvingAI Lab NVIDIA V100 with NVLink, Ubuntu 16.04

Quick Links

Here are some quick links (more coming soon) to additional documentation on using the system:

Base Operations

Workflow Software

  • SSH Connection Multiplexing
  • Software Multiplexers - Keep your sessions alive

Quick How-Tos

  • Compiling Applications and Libraries with Intel compilers
  • Compiling Applications and Libraries with GCC compilers
  • Compiling Applications and Libraries with PGI compilers
  • Using Octave
  • Using Python
  • Using R

Extra Help

  • Requesting software builds
  • Requesting project accounts
  • Requesting user accounts
  • Requesting class accounts
  • Requesting increased storage allocation