Homepage: MUMmer
Version: 4.0.0 beta2
A system for rapidly aligning large DNA sequences to one another
MUMmer is very fast and easy to run. The current version, release 4.x, can find all 20-bp maximal exact matches between two bacterial genomes in just a few seconds on a typical desktop or laptop computer. MUMmer handles the hundreds or thousands of contigs from a draft genome with ease and aligns them to another set of contigs using the nucmer utility included with the system. The promer utility takes this a step further by generating alignments based on the six-frame translations of both input sequences.
$ module spider mummer
------------------------
  mummer: mummer/4.0
------------------------
    This module can be loaded directly: module load mummer/4.0
module load mummer/4.0
Testing is based on simple checks as defined in the Manual.
$ module load mummer/4.0
$ nucmer --help
Usage: nucmer [options] ref:path qry:path+

nucmer generates nucleotide alignments between two mutli-FASTA input
files. The out.delta output file lists the distance between insertions
and deletions that produce maximal scoring alignments between each
sequence. The show-* utilities know how to read this format.

By default, nucmer uses anchor matches that are unique in in the
reference but not necessarily unique in the query. See --mum and
--maxmatch for different bevahiors.

Options (default value in (), *required):
     --mum                  Use anchor matches that are unique in both the reference and query (false)
     --maxmatch             Use all anchor matches regardless of their uniqueness (false)
 -b, --breaklen=uint32      Set the distance an alignment extension will attempt to extend poor scoring regions before giving up (200)
 -c, --mincluster=uint32    Sets the minimum length of a cluster of matches (65)
 -D, --diagdiff=uint32      Set the maximum diagonal difference between two adjacent anchors in a cluster (5)
 -d, --diagfactor=double    Set the maximum diagonal difference between two adjacent anchors in a cluster as a differential fraction of the gap length (0.12)
     --noextend             Do not perform cluster extension step (false)
 -f, --forward              Use only the forward strand of the Query sequences (false)
 -g, --maxgap=uint32        Set the maximum gap between two adjacent matches in a cluster (90)
 -l, --minmatch=uint32      Set the minimum length of a single exact match (20)
 -L, --minalign=uint32      Minimum length of an alignment, after clustering and extension (0)
     --nooptimize           No alignment score optimization, i.e. if an alignment extension reaches the end of a sequence, it will not backtrack to optimize the alignment score and instead terminate the alignment at the end of the sequence (false)
 -r, --reverse              Use only the reverse complement of the Query sequences (false)
     --nosimplify           Don't simplify alignments by removing shadowed clusters. Use this option when aligning a sequence to itself to look for repeats (false)
 -p, --prefix=PREFIX        Write output to PREFIX.delta (out)
     --delta=PATH           Output delta file to PATH (instead of PREFIX.delta)
     --sam-short=PATH       Output SAM file to PATH, short format
     --sam-long=PATH        Output SAM file to PATH, long format
     --save=PREFIX          Save suffix array to files starting with PREFIX
     --load=PREFIX          Load suffix array from file starting with PREFIX
     --batch=BASES          Proceed by batch of chunks of BASES from the reference
 -t, --threads=NUM          Use NUM threads (# of cores)
 -U, --usage                Usage
 -h, --help                 This message
     --full-help            Detailed help
 -V, --version              Version

$ promer --help

  USAGE: promer [options] <Reference> <Query>

  DESCRIPTION:
    promer generates amino acid alignments between two mutli-FASTA DNA input
    files. The out.delta output file lists the distance between insertions
    and deletions that produce maximal scoring alignments between each
    sequence. The show-* utilities know how to read this format. The DNA
    input is translated into all 6 reading frames in order to generate the
    output, but the output coordinates reference the original DNA input.

  MANDATORY:
    Reference       Set the input reference multi-FASTA DNA file
    Query           Set the input query multi-FASTA DNA file

  OPTIONS:
    --mum           Use anchor matches that are unique in both the reference and query
    --mumcand       Same as --mumreference
    --mumreference  Use anchor matches that are unique in in the reference but not necessarily unique in the query (default behavior)
    --maxmatch      Use all anchor matches regardless of their uniqueness
    -b|breaklen     Set the distance an alignment extension will attempt to extend poor scoring regions before giving up, measured in amino acids (default 60)
    -c|mincluster   Sets the minimum length of a cluster of matches, measured in amino acids (default 20)
    --[no]delta     Toggle the creation of the delta file (default --delta)
    --depend        Print the dependency information and exit
    -d|diagfactor   Set the clustering diagonal difference separation factor (default .11)
    --[no]extend    Toggle the cluster extension step (default --extend)
    -g|maxgap       Set the maximum gap between two adjacent matches in a cluster, measured in amino acids (default 30)
    -h
    --help          Display help information and exit.
    -l|minmatch     Set the minimum length of a single match, measured in amino acids (default 6)
    -m|masklen      Set the maximum bookend masking lenth, measured in amino acids (default 8)
    -o
    --coords        Automatically generate the original PROmer1.1 ".coords" output file using the "show-coords" program
    --[no]optimize  Toggle alignment score optimization, i.e. if an alignment extension reaches the end of a sequence, it will backtrack to optimize the alignment score instead of terminating the alignment at the end of the sequence (default --optimize)
    -p|prefix       Set the prefix of the output files (default "out")
    -V
    --version       Display the version information and exit
    -x|matrix       Set the alignment matrix number to 1 [BLOSUM 45], 2 [BLOSUM 62] or 3 [BLOSUM 80] (default 2)

$ annotate --help
Usage: annotate <gapfile> <datafile>

$ combineMUMs --help
combineMUMs: invalid option -- '-'
Unrecognized option --
USAGE: combineMUMs <RefSequence> <MatchSequences> <GapsFile>

Combines MUMs in <GapsFile> by extending matches off
ends and between MUMs. <RefSequence> is a fasta file
of the reference sequence. <MatchSequences> is a
multi-fasta file of the sequences matched against the
reference

Options:
 -D       Only output to stdout the difference positions and characters
 -n       Allow matches only between nucleotides, i.e., ACGTs
 -N num   Break matches at <num> or more consecutive non-ACGTs
 -q tag   Used to label query match
 -r tag   Used to label reference match
 -S       Output all differences in strings
 -t       Label query matches with query fasta header
 -v num   Set verbose level for extra output
 -W file  Reset the default output filename witherrors.gaps
 -x       Don't output .cover files
 -e       Set error-rate cutoff to e (e.g. 0.02 is two percent)
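The help output above only confirms that the tools start. As a rough sketch of an actual alignment run, the script below writes two tiny FASTA files and, if nucmer is on the PATH, aligns them and prints the result with show-coords. The directory name mummer_demo, the sequence names, the sequences themselves, and the demo prefix are all made up for illustration; only the --prefix option and the PREFIX.delta naming come from the help text above. It assumes module load mummer/4.0 has already been run in the current shell.

```shell
#!/usr/bin/env bash
# Sketch of a minimal nucmer run. All file names, sequence names, and
# sequences below are hypothetical demo values, not part of the installation.
set -euo pipefail

demo_dir="mummer_demo"
mkdir -p "$demo_dir"

# Write a small single-record reference FASTA (120 bases on two lines).
printf '%s\n' \
    '>ref_demo' \
    'ACGTTGCAAGCTAGGCTTACGATCGGATCCTAGCATGCAAGTCGATCGTTAGCCTAGGAT' \
    'CCGATTACGGATCAGTCCATGGCTAACGTTCGAATGCCGTAGCTTAGGCATCGATTGCAC' \
    > "$demo_dir/ref.fasta"

# Derive the query from the reference: rename the record and change one base.
sed -e 's/^>ref_demo/>qry_demo/' \
    -e 's/^CCGATTACGG/CCGATAACGG/' \
    "$demo_dir/ref.fasta" > "$demo_dir/qry.fasta"

if command -v nucmer >/dev/null 2>&1; then
    # --prefix makes nucmer write its alignments to mummer_demo/demo.delta.
    nucmer --prefix="$demo_dir/demo" "$demo_dir/ref.fasta" "$demo_dir/qry.fasta"
    # show-coords is one of the show-* utilities that read the delta format.
    show-coords "$demo_dir/demo.delta"
else
    echo "nucmer not found; run 'module load mummer/4.0' first" >&2
fi
```

For real genomes, replace the two demo FASTA files with your own reference and query. The -t/--threads option listed in the help can be added for multi-core runs.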
- This software is dependent on the following modules:
The module load mummer/4.0 line will automatically load these modules for you.