
WARNING: this page is incomplete



gem-map-2-map -- A program to reprocess records in GEM alignment format


gem-map-2-map [OPTIONS] [-i input_file | -m input_file_1...] [-o output_prefix]


The following options are available:

Options relevant to input (only one of -i and -m can be specified)

-i input_sequences (string, default=stdin)
The file in GEM MAP format that you want to reprocess. This option is incompatible with option -m.

Options relevant to alignment refinement

-I|--index index_file (string, mandatory with options -r or -v)
The name of the GEM archive which should be used for realignment/correctness checks.
Check correctness of the alignments contained in the input.
Realign simplified alignments in the output, annotating them with additional information (like substitutions) which might have been stripped out during previous mapping/processing stages.

Options relevant to output

-o|--output output_file (string, default=stdout)
The name of the generated GEM MAP file.

Options relevant to multi-threading

--lines-per-block lines (non-negative number, default=50000)
During processing, the input file is split into chunks; this option specifies their size in lines. It is relevant only when more than one thread is used.
-T|--threads threads (non-negative number, default=1)
The number of threads to be used when reprocessing.
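The chunking controlled by --lines-per-block can be pictured with the short Python sketch below. It is only an illustration and not the actual gem-map-2-map implementation: the helper name read_blocks is hypothetical, and rejecting a block size of 0 is an assumption made here so that the loop terminates.

```python
from itertools import islice


def read_blocks(lines, lines_per_block=50000):
    """Yield successive lists holding at most lines_per_block lines each.

    Hypothetical helper: it mimics how an input stream could be split
    into blocks before the blocks are handed to worker threads.
    """
    if lines_per_block <= 0:
        # Assumption of this sketch: a block must hold at least one line.
        raise ValueError("lines_per_block must be positive")
    iterator = iter(lines)
    while True:
        block = list(islice(iterator, lines_per_block))
        if not block:
            return  # input exhausted
        yield block


if __name__ == "__main__":
    records = ["record_%d" % n for n in range(5)]
    for block in read_blocks(records, lines_per_block=2):
        print(len(block), block)
```

Each block can then be processed independently, which is why the block size matters only when several threads are in use.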

Commands (are executed in the specified order, and can be repeated)

-s|--score rule_1,... (with rule := sign what; sign := + | -; what := U | u | S | s | e | a | h | b | i | t)
Annotate the alignments found in the input with the specified score (up to 16 bits). Each element in the list specifies a bitfield, whose width and content vary depending on the chosen rule. The score is built by concatenating, from left to right, the bitfields appearing in the list. When the score is computed for a specific alignment, each bitfield is filled with the appropriate value, depending on the rule chosen and the alignment. If the rule is preceded by a + sign, the bitfield is left unchanged; if the rule is preceded by a - sign, the bitfield is bitwise negated before concatenation.
The widths and meaning of the possible rules are as follows:
    • U (1 bit): Is the alignment unique within the number of specified mismatch strata?
    • u (1 bit): Is the alignment unique within the best stratum?
    • S (2 bits): The number of strata spanned by the alignments for this query
    • s (2 bits): The number of strata between the one the considered alignment belongs to, and the best one
    • e (1 bit): Does the alignment being considered belong to an exhaustive stratum?
    • a (2 bits): The average quality of nucleotide substitutions in the alignment
    • h (2 bits): The highest quality of nucleotide substitutions in the alignment
    • b (3 bits): If one stratifies the alignments for this query by number of aligned bases, how many strata are there from this alignment to the best one?
    • i (3 bits): If one stratifies the alignments for this query by cumulative intron length, how many strata are there from this alignment to the best one?
    • t (1 bit): Are there other alignments for this query with the same score?
All the quantities are suitably rescaled or truncated to the specified number of bits (for instance, for the s rule, distances >= 3 are mapped to 3).
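The way the bitfields are concatenated can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program's actual code: the function names parse_rules and compose_score are hypothetical, the rule list '+U,-s,+b' is an arbitrary example, every quantity is simply saturated at the largest value that fits its field (the text above only confirms this for s), and the 16-bit limit is enforced by raising an error.

```python
# Bit width of each rule, taken from the list above.
WIDTHS = {"U": 1, "u": 1, "S": 2, "s": 2, "e": 1,
          "a": 2, "h": 2, "b": 3, "i": 3, "t": 1}


def parse_rules(spec):
    """Parse a rule list such as '+U,-s,+b' into (sign, what) pairs."""
    rules = []
    for token in spec.split(","):
        token = token.strip()
        if len(token) != 2 or token[0] not in "+-" or token[1] not in WIDTHS:
            raise ValueError("invalid rule: %r" % token)
        rules.append((token[0], token[1]))
    if sum(WIDTHS[what] for _, what in rules) > 16:
        # Assumption: a score wider than 16 bits is rejected.
        raise ValueError("score exceeds 16 bits")
    return rules


def compose_score(rules, values):
    """Concatenate the bitfields from left to right into one integer."""
    score = 0
    for sign, what in rules:
        width = WIDTHS[what]
        mask = (1 << width) - 1
        # Assumption: values too large for the field saturate at its maximum.
        field = min(values[what], mask)
        if sign == "-":
            field = ~field & mask  # bitwise negation within the field width
        score = (score << width) | field
    return score


if __name__ == "__main__":
    rules = parse_rules("+U,-s,+b")
    # U = 1 -> 1; s = 5 saturates to 3, negated to 0; b = 2 -> 010
    print(format(compose_score(rules, {"U": 1, "s": 5, "b": 2}), "06b"))
```

Because fields listed earlier occupy the more significant bits, earlier rules dominate later ones when alignments are compared by score.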
For instance, the score
(re)sorts the existing alignments by decreasing number of aligned bases, increasing highest quality of nucleotide substitutions, increasing average quality of nucleotide substitutions, and decreasing strata distance with respect to the best match.
On the other hand, the score
maximizes uniqueness.
Print help information and exit without doing anything else.


Paolo Ribeca


gem-mapper, gem-rna-mapper, and The GEM website.
