A Bloom filter is a compact representation of a set of n items whose
main purpose is to answer membership queries, i.e., whether an item is
a member of the set.
.
A Bloom filter has two parameters: m, a maximum size (typically a
reasonably large multiple of the cardinality of the set to represent)
and k, the number of hashing functions applied to elements of the set.
(The actual hashing functions are important, too, but they are not a
parameter for this implementation.) A Bloom filter is backed by a BitSet
(https://github.com/willf/bitset); a key is represented in the filter
by setting the bits at each value of the hashing functions (modulo
m). Membership is tested by checking whether the bits at each value of
the hashing functions (again, modulo m) are set. If they all are, the
item is probably in the set; if any bit is clear, the item is
definitely not in the set. If the item is actually in the set, a Bloom
filter will never
fail (the true positive rate is 1.0); but it is susceptible to false
positives. The art is to choose k and m correctly.
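.
The mechanism described above can be sketched in a few lines of Go.
This is a minimal illustration under stated assumptions, not the
packaged library's code: the names Filter, New, Add, Test and Estimate
are hypothetical, a plain []uint64 stands in for the BitSet, and the k
hash values are derived from FNV-1a by double hashing instead of
murmur3. Estimate uses the standard sizing formulas
m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2 for a target false-positive
rate p.
.
```go
package main

import (
	"hash/fnv"
	"math"
)

// Filter is a hypothetical minimal Bloom filter with m bits and k hashes.
type Filter struct {
	m, k uint64
	bits []uint64 // stand-in for the BitSet used by the real library
}

// New returns an empty filter; m and k are clamped to at least 1.
func New(m, k uint64) *Filter {
	if m == 0 {
		m = 1
	}
	if k == 0 {
		k = 1
	}
	return &Filter{m: m, k: k, bits: make([]uint64, (m+63)/64)}
}

// baseHashes returns two 64-bit FNV-1a hashes of data.
// Assumption: FNV-1a replaces murmur3 to keep the sketch stdlib-only.
func baseHashes(data []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(data)
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to obtain a second hash
	return h1, h.Sum64() | 1
}

// location is the bit index of the i-th hash value, modulo m.
func (f *Filter) location(h1, h2, i uint64) uint64 {
	return (h1 + i*h2) % f.m
}

// Add sets the k bits that represent data.
func (f *Filter) Add(data []byte) {
	h1, h2 := baseHashes(data)
	for i := uint64(0); i < f.k; i++ {
		loc := f.location(h1, h2, i)
		f.bits[loc/64] |= 1 << (loc % 64)
	}
}

// Test reports whether data may be in the set. A false result is
// definitive; a true result can be a false positive.
func (f *Filter) Test(data []byte) bool {
	h1, h2 := baseHashes(data)
	for i := uint64(0); i < f.k; i++ {
		loc := f.location(h1, h2, i)
		if f.bits[loc/64]&(1<<(loc%64)) == 0 {
			return false
		}
	}
	return true
}

// Estimate returns m and k for n items and a target false-positive
// rate p (0 < p < 1), using the standard sizing formulas.
func Estimate(n uint64, p float64) (m, k uint64) {
	mf := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	kf := math.Ceil(math.Ln2 * mf / float64(n))
	return uint64(mf), uint64(kf)
}
```
.
A caller would typically size the filter with Estimate, build it with
New, call Add for every member, and then use Test for queries, treating
a true result as "probably present" and a false result as "certainly
absent".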
.
In this implementation, the hashing function used is murmurhash
(github.com/spaolacci/murmur3), a non-cryptographic hashing function.
Installed Size: 59.4 kB
Architectures: all