- libc6 (>= 2.34)
- libfmt9 (>= 9.1.0+ds1)
- libgcc-s1 (>= 3.0)
- libgomp1 (>= 4.9)
- libopenmpi3 (>= 4.1.4)
- libprotobuf32 (>= 3.21.12)
- libstdc++6 (>= 11)
- libuuid1 (>= 2.16)
- libgenomicsdb0 (= 1.4.4-3)
GenomicsDB is built on top of an htslib fork and an internal array storage
system for importing, querying and transforming variant data. Variant data is
sparse by nature (sparse relative to the whole genome), so a sparse array
data store is a natural fit for it.
.
GenomicsDB stores variant data in a 2D array where:
- Each column corresponds to a genomic position (chromosome + position);
- Each row corresponds to a sample in a VCF (a CallSet in GA4GH
terminology);
- Each cell contains data for a given sample/CallSet at a given position;
data is stored in the form of cell attributes;
- Cells are stored in column-major order, which makes accessing cells with
the same column index (i.e. data for a given genomic position over all
samples) fast;
- Variant interval/gVCF interval data is stored in a cell at the start of the
interval. The END is stored as a cell attribute. For variant intervals
(such as deletions and gVCF REF blocks), an additional cell is stored at
the END value of the variant interval. When queried for a given genomic
position, the query library performs an efficient sweep to determine all
intervals that intersect with the queried position.
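.
The layout described above can be illustrated with a small Python sketch. It
is a toy model, not GenomicsDB's API or implementation: the class
ToyVariantArray and its methods are hypothetical names, columns are plain
integers instead of chromosome + position, and the query is one plausible
reading of how the extra END cell lets a forward scan find intervals that
started before the queried position.

```python
from bisect import bisect_left


class ToyVariantArray:
    """Toy sparse 2D array: rows are samples, columns are genomic positions."""

    def __init__(self):
        # Only non-empty cells are stored: (column, row) -> cell attributes.
        self._cells = {}

    def add_interval(self, row, start, end, **attrs):
        """Store a cell at the interval start, plus one at END if it differs."""
        cell = {"START": start, "END": end, **attrs}
        self._cells[(start, row)] = cell
        if end > start:
            # Additional cell at the END value of the interval.
            self._cells[(end, row)] = cell

    def column_major_keys(self):
        """Cell coordinates in column-major order (column first, then row)."""
        return sorted(self._cells)

    def query_position(self, pos):
        """Return {row: cell} for every interval that covers pos.

        Scans forward from column pos. An interval that began earlier is
        still found through its END cell, which lies at a column >= pos.
        """
        keys = self.column_major_keys()
        # Rows are assumed non-negative, so (pos, -1) sorts before column pos.
        first = bisect_left(keys, (pos, -1))
        hits = {}
        for col, row in keys[first:]:
            cell = self._cells[(col, row)]
            if cell["START"] <= pos <= cell["END"]:
                hits.setdefault(row, cell)
        return hits


arr = ToyVariantArray()
arr.add_interval(0, 100, 100, REF="A", ALT="T")          # single-position variant
arr.add_interval(1, 90, 110, REF="N", ALT="<NON_REF>")   # gVCF REF block
arr.add_interval(2, 200, 205, REF="ACGTAC", ALT="A")     # deletion
covering_100 = arr.query_position(100)
```

In this sketch the scan runs to the last stored column for simplicity; a real
implementation would need a stopping rule and an on-disk layout, neither of
which is modelled here.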
.
This package contains tools that are run as executables.
Installed Size: 12.2 MB
Architectures: amd64