11

Better bloom filter

 3 years ago
source link: https://vorbrodt.blog/2019/04/05/better-bloom-filter/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Better bloom filter

Based on this implementation it supports multiple hashes for better positive hit ratio. Can be initializes with size in bits and number of hashes to perform, like this: bloom_filter bloom(128, 5);As always, complete implementation on GitHub: bloom.hpp.

#pragma once
#include <vector>
#include <random>
#include <functional>
#include <cstddef>
#include <cassert>
namespace
template<typename Key, typename Hash = std::hash<Key>>
class mixer
public:
mixer(std::size_t size, const Key& key)
: m_size(size), m_random(Hash{}(key))
assert(size > 0);
std::size_t operator()() { return m_random() % m_size; }
private:
std::size_t m_size;
std::mt19937 m_random;
template <typename Key, typename Hash = std::hash<Key>>
class bloom_filter
public:
bloom_filter(std::size_t size, std::size_t hashes)
: m_size(size), m_hashes(hashes), m_bits(size)
assert(size > 0);
assert(hashes > 0);
void add(const Key& key)
mixer<Key, Hash> m(m_size, key);
for(std::size_t i = 0; i < m_hashes; ++i)
m_bits[m()] = true;
bool contains(const Key& key) const
mixer<Key, Hash> m(m_size, key);
for(std::size_t i = 0; i < m_hashes; ++i)
if(!m_bits[m()]) return false;
return true;
private:
std::size_t m_size;
std::size_t m_hashes;
std::vector<bool> m_bits;

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK