From Algorithm Development Wiki



gem-2-sam -- A converter from GEM mapping format to SAM alignment format


gem-2-sam [OPTIONS] [-i input_file] [-o output_prefix]


The following options are available:

-I|--index index_file (string)
The name of the GEM archive used to derive the SAM header. It should be the same archive that was used at mapping time. If this option is not specified, only a minimal header is emitted, and some downstream tools will likely complain.
-i input_sequences (string, default=stdin)
The file in GEM MAP format that you want to convert to SAM. Please note that due to the many limitations of the SAM format the conversion entails a loss of information.
--expect-single-end-reads
--expect-paired-end-reads
Use these options to override automatic single-end/paired-end detection. To get automatic detection of paired-end reads, you should interlace them in your original FASTA/FASTQ file (putting the sequences of both ends in a single file, where first and second end alternate), and the read names should follow this convention:
end_1: common_name[/1 | |1 | ...]
end_2: common_name[/2 | |2 | ...]
where any non-numeric character before 1 or 2 is accepted. For instance, two consecutive sequences named
My_paired_end_read_#123456/1
My_paired_end_read_#123456/2
would be recognized as the two ends of a paired-end read named My_paired_end_read_#123456. The same holds for the two sequences
My_paired_end_read_#123456*1
My_paired_end_read_#123456*2
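The naming convention above can be sketched in Python as follows. This is only an illustration of the rule as described here, not the converter's actual code: the function names `split_end` and `pair_name` are hypothetical, and the sketch assumes that exactly one non-numeric separator character precedes the final 1 or 2 and that the two ends need not use the same separator.

```python
import re

# Assumed reading of the convention: a name is common_name, then one
# non-numeric separator character, then a final "1" or "2".
_END_SUFFIX = re.compile(r"(?P<common>.+)\D(?P<end>[12])")


def split_end(name):
    """Return (common_name, end_number), or None if the name does not fit."""
    match = _END_SUFFIX.fullmatch(name)
    if match is None:
        return None
    return match.group("common"), int(match.group("end"))


def pair_name(first, second):
    """Return the shared name if two consecutive reads form a pair, else None."""
    a, b = split_end(first), split_end(second)
    if a is None or b is None:
        return None
    # The first read must be end 1 and the second end 2, with equal common names.
    if a[0] == b[0] and a[1] == 1 and b[1] == 2:
        return a[0]
    return None


print(pair_name("My_paired_end_read_#123456/1", "My_paired_end_read_#123456/2"))
```

Names for which this check fails are the case where --expect-paired-end-reads would be needed to override the automatic detection.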
-o|--output output_file (string, default=stdout)
The name of the generated SAM file.
--read-group field_1, ... with field := name = value; name := ID | CN | DS | DT | FO | KS | LB | PG | PI | PL | PU | SM (default: ID=0, PG=GEM, PL=ILLUMINA, SM=0)
Specifies the RG (read group) the reads being converted belong to. Fields ID and SM must not be empty. Please refer to the SAM format specification for more information. No check is performed on the values supplied, so the generated SAM file might be invalid if you are not careful.
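Since the converter does not check the supplied values, it can help to validate a read group before passing it on. The Python sketch below is one way to do that under stated assumptions: `build_rg_line` is a hypothetical helper, it assumes that user-supplied fields are merged over the defaults listed above (the page does not say whether they merge or replace), and it renders the result as a tab-separated @RG header line in the TAG:VALUE layout defined by the SAM specification.

```python
# Field names accepted by --read-group, as listed in the option description.
ALLOWED = ("ID", "CN", "DS", "DT", "FO", "KS", "LB", "PG", "PI", "PL", "PU", "SM")

# Defaults documented for --read-group.
DEFAULTS = {"ID": "0", "PG": "GEM", "PL": "ILLUMINA", "SM": "0"}


def build_rg_line(fields=None):
    """Validate read-group fields and return the matching @RG header line."""
    merged = dict(DEFAULTS)
    merged.update(fields or {})  # assumption: user values override defaults

    unknown = sorted(name for name in merged if name not in ALLOWED)
    if unknown:
        raise ValueError("unknown read-group field(s): " + ", ".join(unknown))

    # ID and SM must not be empty.
    for required in ("ID", "SM"):
        if not merged.get(required):
            raise ValueError(required + " must not be empty")

    # Emit the fields in the order of ALLOWED, so ID comes first.
    parts = [name + ":" + merged[name] for name in ALLOWED if name in merged]
    return "@RG\t" + "\t".join(parts)


print(build_rg_line({"SM": "sampleA", "LB": "lib1"}))
```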
--comment comment (string)
Add the specified comment to the SAM header.
In paired-end mode, the converter usually emits two lines of SAM output (one line in single-end mode) for each alignment found. If the -c option is specified, all the alignments are output as a single line. In that case, the alignment with the highest score, if any, or the first one otherwise, is the only one fully displayed as a SAM record; the additional alignments are listed, in their original order, in the XA optional field. Although this option mimics the behavior of other popular aligners, it is discouraged, as it entails a loss of information for every alignment except the one reported in full.
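The selection rule used with -c can be sketched in Python as shown below. The record layout (a dict with an optional "score" key) and the function name `select_primary` are hypothetical, chosen only for illustration. The sketch also assumes that "highest score, if any" means "if any alignment carries a score", and that ties go to the earliest alignment. It does not attempt to format the XA field, whose exact layout is not described here.

```python
def select_primary(alignments):
    """Split alignments into (primary, others) following the -c rule.

    The primary is the highest-scoring alignment when scores are present,
    and the first alignment otherwise. The others keep their original order.
    """
    if not alignments:
        return None, []

    # Indices of the alignments that actually carry a score.
    scored = [i for i, aln in enumerate(alignments) if aln.get("score") is not None]

    if scored:
        # max() returns the first maximal element, so ties go to the earliest.
        best = max(scored, key=lambda i: alignments[i]["score"])
    else:
        best = 0

    primary = alignments[best]
    others = [aln for i, aln in enumerate(alignments) if i != best]
    return primary, others


hits = [{"pos": 10, "score": 5}, {"pos": 20, "score": 9}, {"pos": 30, "score": 7}]
primary, others = select_primary(hits)
print(primary["pos"], [aln["pos"] for aln in others])
```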
Emit correct flags for unpaired reads when any of the fragments has no primary alignment. This will break the compliance of the generated file with the PICARD checker.
--lines-per-block lines (non-negative number, default=50000)
During processing, the input file is split into chunks; this option specifies their size. Relevant only when multi-threaded conversion is performed.
-T|--threads threads (non-negative number, default=1)
The number of threads to be used when converting.
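The interplay between --lines-per-block and --threads can be pictured with the Python sketch below. It is not the converter's implementation: `blocks`, `convert_blocks` and the `convert_block` callback are hypothetical names, and the sketch assumes that a block size below 1 is rejected, which this page does not specify. It only shows the general idea of cutting the input into fixed-size chunks, converting the chunks in a thread pool, and emitting the results in input order.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def blocks(lines, lines_per_block=50000):
    """Yield consecutive lists of at most lines_per_block lines."""
    if lines_per_block < 1:
        raise ValueError("lines_per_block must be at least 1 in this sketch")
    iterator = iter(lines)
    while True:
        block = list(islice(iterator, lines_per_block))
        if not block:
            return
        yield block


def convert_blocks(lines, convert_block, lines_per_block=50000, threads=1):
    """Convert blocks in a thread pool and yield output lines in input order.

    convert_block is a caller-supplied function that maps a list of input
    lines to a list of output lines.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the order the blocks were submitted.
        for converted in pool.map(convert_block, blocks(lines, lines_per_block)):
            yield from converted


def shout(block):
    """Stand-in for a real per-block conversion."""
    return [line.upper() for line in block]


print(list(convert_blocks(["a", "b", "c", "d", "e"], shout, lines_per_block=2, threads=2)))
```

With a single thread the blocks are simply processed one after another, which matches the note that the block size matters only for multi-threaded conversion.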
Print help information and exit without doing anything else.


Paolo Ribeca


gem-mapper, gem-rna-mapper, and the GEM website.
