sourmash - 4.8.14-3
main
Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.
.
MinHash sketches provide a lightweight way to store “signatures” of large DNA
or RNA sequence collections, and then to compare or search them by estimating
the Jaccard index between them. MinHash sketches can be used to identify
samples, find similar samples, identify data sets with shared sequences, and
build phylogenetic trees (Ondov et al. 2015).
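.
To make the idea concrete, here is a minimal pure-Python sketch of a bottom-k
MinHash and the Jaccard estimate derived from it. It is an illustration only,
not sourmash's implementation or API: the function names, the SHA-1 based hash,
and the defaults k=5 and num=20 are assumptions chosen for readability, and
reverse-complement handling is left out.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash, an assumption)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(seq, k=5, num=20):
    """Bottom-k MinHash sketch: the num smallest distinct k-mer hashes."""
    hashes = sorted({hash_kmer(km) for km in kmers(seq, k)})
    return hashes[:num]


def jaccard_estimate(a, b, num=20):
    """Estimate the Jaccard index from two sketches built with the same num.

    Take the num smallest hashes of the merged sketches and count how many
    of them occur in both inputs.
    """
    set_a, set_b = set(a), set(b)
    merged = sorted(set_a | set_b)[:num]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


if __name__ == "__main__":
    s1 = sketch("ACGTACGTGA")
    s2 = sketch("ACGTACGTTT")
    print(jaccard_estimate(s1, s2))
```

When both sequences have fewer than num distinct k-mers, as in this toy input,
the sketch holds every hash and the estimate equals the exact Jaccard index.
For real collections the sketch is far smaller than the k-mer set, which is
what keeps signatures lightweight.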
.
sourmash provides a command line script, a Python library, and a CPython
module for MinHash sketches.
.
These tools provide functionality previously handled by the 'khmer' package.