High-throughput "next generation" sequencing technologies produce
many short reads. The final sequence is derived as a consensus from
the way these reads overlap.
The more reads cover, and thereby confirm, a part of a nucleotide
sequence, the higher the confidence in that part. An unusually high
number of reads can indicate, for example, repeats in the genome.
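As a rough illustration of what "depth" means here, the sketch below
counts how many reads cover each position of a reference. The function
name per_base_depth, the half-open 0-based (start, end) read intervals
and the toy data are assumptions made for this example; it is not a
description of how mosdepth itself is implemented.

```python
from itertools import accumulate


def per_base_depth(reads, length):
    """Return the read depth at each position of a reference.

    reads:  iterable of (start, end) intervals, 0-based and half-open
            (hypothetical input format chosen for this sketch).
    length: length of the reference sequence.
    """
    # Mark +1 where a read starts and -1 just past where it ends,
    # then a running sum yields the depth at every position.
    delta = [0] * (length + 1)
    for start, end in reads:
        start = max(start, 0)
        end = min(end, length)
        if start < end:
            delta[start] += 1
            delta[end] -= 1
    return list(accumulate(delta[:-1]))


# Three overlapping toy reads on a reference of length 8.
reads = [(0, 4), (2, 6), (3, 8)]
print(per_base_depth(reads, 8))  # [1, 1, 2, 3, 2, 2, 1, 1]
```

Position 3 is covered by all three reads and therefore has the highest
depth in this toy example.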
.
mosdepth can output:
* per-base depth about 2x as fast as samtools depth--about 25 minutes
of CPU time for a 30X genome.
* mean per-window depth given a window size--as would be used for
CNV calling.
* the mean per-region depth given a BED file of regions.
* a distribution of proportion of bases covered at or above a given
threshold for each chromosome and genome-wide.
* quantized output that merges adjacent bases as long as they fall
in the same coverage bin, e.g. (10-20).
* threshold output to indicate how many bases in each region are
covered at the given thresholds.
When appropriate, the output files are bgzipped and indexed for ease
of use.
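To make the window, threshold and quantized summaries above more
concrete, here is a minimal sketch that derives them from a list of
per-base depths. The function names, the bin-edge convention and the
toy depth values are assumptions made for illustration only; they do
not reflect mosdepth's command-line options or output file formats.

```python
from bisect import bisect_right
from itertools import groupby


def window_means(depth, size):
    """Mean depth per fixed-size window (the last window may be shorter)."""
    windows = (depth[i:i + size] for i in range(0, len(depth), size))
    return [sum(w) / len(w) for w in windows]


def bases_at_or_above(depth, thresholds):
    """Number of bases whose depth is at or above each threshold."""
    return {t: sum(1 for d in depth if d >= t) for t in thresholds}


def quantize(depth, edges):
    """Merge adjacent positions whose depth falls in the same bin.

    edges: sorted bin boundaries (assumed convention for this sketch),
           e.g. [0, 2, 3] gives bins [0,2), [2,3) and [3,inf).
    Returns half-open (start, end, bin_index) tuples. Depths below
    edges[0] would get bin index -1.
    """
    out = []
    pos = 0
    for bin_idx, run in groupby(
        depth, key=lambda d: bisect_right(edges, d) - 1
    ):
        n = sum(1 for _ in run)
        out.append((pos, pos + n, bin_idx))
        pos += n
    return out


depth = [1, 1, 2, 3, 2, 2, 1, 1]  # toy per-base depths
print(window_means(depth, 4))              # [1.75, 1.5]
print(bases_at_or_above(depth, [1, 2, 3]))  # {1: 8, 2: 4, 3: 1}
print(quantize(depth, [0, 2, 3]))
# [(0, 2, 0), (2, 3, 1), (3, 4, 2), (4, 6, 1), (6, 8, 0)]
```

Merging runs of equal bins keeps such output compact, since long
stretches of similar coverage collapse into a single interval.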
.
This package contains a test data set and sample scripts that run a
test suite, which Debian also provides as an autopkgtest.
Installed Size: 1.1 MB
Architectures: all