Difference between revisions of "Kraken"

From arccwiki

Revision as of 19:55, 10 October 2019

KRAKEN2: Version 2.0.8: Kraken taxonomic sequence classification system
Kraken 2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm.
Manual

Module: Example

[]$ module spider kraken
-------------------------
  kraken:
-------------------------
     Versions:
        kraken/1.0-py27
        kraken/2.0
[]$ module load kraken/2.0

Using:

Note: The Dependencies section under System Requirements (https://ccb.jhu.edu/software/kraken2/index.shtml?t=manual#system-requirements) in the Kraken 2 manual states: "Multithreading is handled using OpenMP. ... Unlike Kraken 1, Kraken 2 does not use an external k-mer counter. However, by default, Kraken 2 will attempt to use the dustmasker or segmasker programs provided as part of NCBI's BLAST suite to mask low-complexity regions (see Masking of Low-complexity Sequences)."

Example 1:

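This example is based on the Standard Kraken 2 Database section of the manual (https://ccb.jhu.edu/software/kraken2/index.shtml?t=manual#standard-kraken-2-database). Because of the masking dependency described in the note above, it also loads the gpu-blast/1.1 module so that dustmasker is available.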

[]$ salloc -A <enter-your-project> --time=6:00:00 -N 1 --cpus-per-task=32 --mem=0
[]$ module load kraken/2.0
[]$ module load gpu-blast/1.1
[]$ srun kraken2-build --standard --threads 32 --db KDB

Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 341 projects (530 sequences, 872.20 Mbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 17072 projects (36839 sequences, 68.61 Gbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library..
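
The commands above write the value 32 twice, once for --cpus-per-task and once for --threads. A minimal sketch of one way to keep the two in step, assuming Slurm exports SLURM_CPUS_PER_TASK inside the allocation (it is only set when --cpus-per-task is requested). The function names kraken_threads and kraken_build_cmd are hypothetical helpers for illustration and are not part of Kraken 2 or Slurm. The sketch only prints the build command so it can be checked before running it.

```shell
#!/bin/bash
# Hypothetical helper: report the CPU count Slurm granted to each task.
# Assumption: SLURM_CPUS_PER_TASK is set only when --cpus-per-task was
# requested, so fall back to a single thread when it is missing.
kraken_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Hypothetical helper: print the build command with a matching --threads value.
# The database name defaults to KDB, as in the example above.
kraken_build_cmd() {
    local db="${1:-KDB}"
    echo "srun kraken2-build --standard --threads $(kraken_threads) --db ${db}"
}

# Dry run: show the command instead of executing it.
kraken_build_cmd KDB
```

Inside the 32-CPU allocation from the example, the printed line should carry --threads 32. Outside an allocation it falls back to --threads 1.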

Notes:

  • The --threads value passed to kraken2-build must match the --cpus-per-task value requested from salloc.
  • If you do not load the gpu-blast/1.1 module, you will see the error below:
Downloading taxonomy tree data... done.
Untarring taxonomy tree data... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 341 projects (530 sequences, 872.20 Mbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library...which: no dustmasker in (/pfs/tsfs1/apps/el7-x86_64/u/gcc/7.3.0/kraken/2.0/kraken2/bin:/pfs/tsfs1/apps/el7-x86_64/u/gcc/7.3.0/kraken/2.0/kraken2/bin:/apps/u/gcc/7.3.0/kraken/2.0/kraken2/bin:/apps/u/gcc/4.8.5/gcc/7.3.0-xegsmw4/bin:/apps/s/arcc/0.1/bin:/apps/s/slurm/18.08/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/apps/u/opt/singularity/2.5.2/bin:/home/salexan5/.local/bin:/home/salexan5/bin)
Unable to find dustmasker in path, can't mask low-complexity sequences


  • This software depends on the following modules, which the module load kraken/2.0 line loads automatically:
    • swset/2018.05
    • gcc/7.3.0
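
The dustmasker error in the notes above only appears after the library download has finished. A minimal sketch of a check that could be run before starting the build, assuming a POSIX-style shell. The function name missing_tools is a hypothetical helper for illustration and is not part of Kraken 2 or BLAST.

```shell
#!/bin/bash
# Hypothetical helper: print the name of each given program that is not in PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v prints nothing and returns non-zero when the tool is absent.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check for the two masking programs named in the Kraken 2 manual.
missing="$(missing_tools dustmasker segmasker)"
if [ -n "$missing" ]; then
    echo "Not found in PATH:" $missing
    echo "Try: module load gpu-blast/1.1"
else
    echo "dustmasker and segmasker found; masking can run."
fi
```

Running this after the module load lines and before kraken2-build shows whether the masking step is likely to fail, without waiting for the download.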

