Data transfer: Globus


Globus manages file transfers between two machines (two servers, a server and a personal machine, or two personal machines). It is well suited to large files and is available on many institutional clusters and networks. Files can be transferred between Mount Moran/Bighorn, the ARCC petaLibrary, ORNL, TACC, NCAR, and other institutions around the world.

When the amount of data to transfer exceeds roughly 100 GB, other methods such as scp, sftp, and rsync may be too slow. Globus is typically faster for collections of files because it transfers them in parallel.
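To illustrate why parallelism helps with collections of files, here is a minimal sketch that copies every file in a directory tree with a pool of worker threads. It is not how Globus itself is implemented; the function name `copy_tree_parallel` and the `max_workers` default are assumptions made for this example.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src_dir, dst_dir, max_workers=4):
    """Copy every file under src_dir into dst_dir using several threads.

    Illustrative only: Globus uses its own transfer service, not this code.
    Returns the sorted list of relative paths that were copied.
    """
    src_root = Path(src_dir)
    dst_root = Path(dst_dir)
    # Collect the files first so the work list is fixed before copying starts.
    files = [p for p in src_root.rglob("*") if p.is_file()]

    def copy_one(src_path):
        rel = src_path.relative_to(src_root)
        target = dst_root / rel
        # Create the destination directory if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_path, target)
        return rel.as_posix()

    # Each worker thread copies one file at a time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        copied = list(pool.map(copy_one, files))
    return sorted(copied)
```

With many files, several copies are in flight at once instead of one after another, which is the same general idea behind the parallel transfers described above.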

Globus Advantages

  • Secure, handles errors, verifies integrity of transferred files
  • Automatically resumes after interruption
  • Emails the user when a transfer completes or when an error occurs
  • Accelerated transfer rates: transfers run in parallel where available
  • Web and command line interfaces for transfer
  • Links to major HPC sites (ORNL, TACC, NCAR, etc.)
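The integrity check mentioned in the list above can be pictured as comparing checksums of the source file and its copy. The sketch below shows that idea with Python's standard `hashlib`; the choice of SHA-256 and the helper names `sha256_of` and `same_content` are assumptions for illustration, since this page does not state which algorithm Globus uses.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source, copy):
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(source) == sha256_of(copy)
```

A manual check after a non-Globus copy might call `same_content("data.tar", "/backup/data.tar")` (hypothetical paths) and repeat the transfer if it returns `False`.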

Terminology

  • A Globus endpoint is a location that data can be transferred to/from.
    • A server endpoint is set up by an administrator to provide access to a system via Globus.
    • A personal endpoint is set up by an individual (using the Globus Connect Personal software from the link under "References" below) to transfer between their personal machine and other endpoints.

Endpoints

The following public Globus endpoints are available for ARCC resources and other commonly used systems:

System Name                                   | Globus Endpoint Name
ARCC Teton                                    | ARCC Teton
ARCC Bighorn (storage system for Mount Moran) | ARCC Bighorn
ARCC petaLibrary                              | ARCC petaLibrary
NCAR GLADE (XSEDE) (Yellowstone and Cheyenne) | XSEDE NCAR GLADE
University of Utah                            | (search for "uofuchpc" to get the list)
CI-Water project                              | University of Utah - CIWater Dedicated Endpoint
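For scripted use, the endpoint names listed above could be kept in a small lookup table and passed to an endpoint search. The sketch below assumes a client object that behaves like `globus_sdk.TransferClient` from the Globus Python SDK, specifically its `endpoint_search(filter_fulltext=...)` method returning records with `display_name` and `id` fields. The dictionary keys and the `find_endpoints` helper are hypothetical names for this example, and authentication is not shown.

```python
# Endpoint names copied from the list above; the dictionary keys are
# shorthand chosen for this sketch, not official identifiers.
ENDPOINT_NAMES = {
    "teton": "ARCC Teton",
    "bighorn": "ARCC Bighorn",
    "petalibrary": "ARCC petaLibrary",
    "glade": "XSEDE NCAR GLADE",
    "ciwater": "University of Utah - CIWater Dedicated Endpoint",
}


def find_endpoints(client, system):
    """Search Globus for the endpoint name that belongs to a system key.

    `client` is assumed to behave like globus_sdk.TransferClient, i.e. to
    offer endpoint_search(filter_fulltext=...) yielding dict-like records.
    Returns a list of (display_name, id) tuples.
    """
    name = ENDPOINT_NAMES[system.lower()]
    results = client.endpoint_search(filter_fulltext=name)
    return [(ep["display_name"], ep["id"]) for ep in results]
```

Passing the client in as a parameter keeps the lookup logic separate from login handling, so the same function works with any already-authenticated client.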

Using Globus at UW

References