GPU-Blast


GPU-Blast: GPU accelerated version of NCBI-BLAST (Basic Local Alignment Search Tool)
The Basic Local Alignment Search Tool (BLAST) is one of the most widely used bioinformatics tools. Its impact is reflected in the more than 110,000 citations the software has received over the past three decades, and in the use of the word "blast" as a verb for biological sequence comparison. Any improvement in the execution speed of BLAST would matter greatly in the practice of bioinformatics and would help in coping with the ever-increasing sizes of biomolecular databases.
GPU-BLAST is an accelerated version of the popular NCBI-BLAST (www.ncbi.nlm.nih.gov) that uses a general-purpose graphics processing unit (GPU). Its developers report that, compared with the sequential NCBI-BLAST, GPU-BLAST is nearly four times faster while producing identical results.

Module: Example

[]$ module spider gpu-blast
---------------------------------------------------------------------------------------------------------------------------------
  gpu-blast:
---------------------------------------------------------------------------------------------------------------------------------
     Versions:
        gpu-blast/1.1
[]$ module load gpu-blast/1.1
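
As a quick sanity check after loading the module, the sketch below confirms that an executable is on the PATH. The helper name require_cmd is hypothetical; it is not part of GPU-BLAST or the module system.

```shell
# Hypothetical helper (not part of GPU-BLAST): succeed only if the named
# command can be found on the current PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1 (is the module loaded?)" >&2
        return 1
    fi
}

# Example use after 'module load gpu-blast/1.1':
#   require_cmd blastp && require_cmd makeblastdb
```

If either command is reported missing, the module was most likely not loaded in the current shell.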

Using:

Basic Command Line:

blastp -help

Test Example: Based on section III, "How to use GPU-BLAST", of the README document.

mkdir test
cd test/
mkdir database
cd database/
wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/env_nr.gz
gunzip env_nr.gz
cd ..

mkdir queries
cd queries/
wget http://thales.cheme.cmu.edu/gpublast/queries.tar.gz
tar -xzf queries.tar.gz

rm queries.tar.gz
cd ..
makeblastdb -in database/env_nr -out database/sorted_env_nr -dbtype prot -sort_volumes -max_file_sz 500MB
blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t -method 2 -gpu_blocks 256 -gpu_threads 32
blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t
time blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t > gpu_output.txt
time blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu f > cpu_output.txt
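
Since the GPU and CPU runs are expected to give identical results, the two output files written above can be compared directly. The following is a minimal sketch; compare_reports is a hypothetical helper name, and it assumes a byte-for-byte comparison of the two reports is what you want.

```shell
# Hypothetical helper: report whether two BLAST output files are byte-identical.
compare_reports() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "missing"      # at least one of the files does not exist
    elif cmp -s "$1" "$2"; then
        echo "identical"    # cmp -s is silent and returns 0 when files match
    else
        echo "different"
    fi
}

# Compare the outputs of the timed GPU and CPU runs.
compare_reports gpu_output.txt cpu_output.txt
```

If the result is "different", running diff on the two files shows where the reports diverge.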

Note:

  • Note the -gpu option, which selects whether blastp uses a GPU (t) or not (f). To run this software on a GPU, you must request an appropriate GPU node in your job allocation request. If you run with the -gpu option set to t on a node without a GPU, you will see the following output:
WARNING: There is no available device supporting CUDA. Continuing with the CPU only...
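
To avoid silently falling back to the CPU, a script can check for a GPU before choosing the -gpu value. This is only a sketch: it assumes the NVIDIA nvidia-smi utility is available on GPU nodes, and has_gpu and gpu_flag are hypothetical names.

```shell
# Assumption: GPU nodes provide the NVIDIA 'nvidia-smi' utility.
# Succeeds only if nvidia-smi exists and lists at least one device.
has_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 &&
        nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# Choose the value for blastp's -gpu option: t on a GPU node, f otherwise.
if has_gpu; then
    gpu_flag=t
else
    gpu_flag=f
fi
echo "Using -gpu $gpu_flag"
```

The value can then be passed on the command line, for example: blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu "$gpu_flag"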

Example Batch:

#!/bin/bash
#SBATCH -J blastp
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<insert-your-email-address>
#SBATCH --account=<insert-your-project-name>
#SBATCH --gres=gpu:1
#SBATCH --partition=moran-gpu

echo "Start:"

module load gpu-blast/1.1
echo "Loaded gpu-blast:"

srun blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t -method 1 -gpu_blocks 128 -gpu_threads 32
echo "Finished Successfully:"
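
The script above prints "Finished Successfully:" whether or not blastp succeeded. A minimal sketch of reporting the outcome from the exit status instead is shown here; run_step is a hypothetical helper name, and the sketch assumes srun passes the exit status of blastp back to the script.

```shell
# Hypothetical helper: run the given command and report success or failure
# based on its exit status.
run_step() {
    if "$@"; then
        echo "Finished Successfully:"
    else
        status=$?   # exit status of the command that just failed
        echo "Failed with exit status $status" >&2
        return "$status"
    fi
}

# Example use inside the batch script, replacing the last two lines:
#   run_step srun blastp -query queries/SequenceLength_00000100.txt \
#       -db database/sorted_env_nr -gpu t -method 1 -gpu_blocks 128 -gpu_threads 32
```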

