An Empirical Evaluation of Linear Probing Algorithms

hashtable-benchmarks

An Evaluation of Linear Probing Hashtable Algorithms

This repository contains implementations, unit and property tests, and benchmark code for 4 linear probing algorithms: standard linear probing, Last-Come First-Served (LCFS), Robin Hood (RH), and Bidirectional Linear Probing (BLP). The latter was published by Knuth in 1973 but has received little attention since. It outperforms all other linear probing variants tested, including the recently popular "Robin Hood" variant.

The current implementations only accept nonzero 32- or 64-bit integer keys, with no values (I use invertible hash functions, so there's no need to separately store hash codes). Deletions are tombstone-free, so there's no need to rehash after several deletions. Dynamic resizing is not currently supported (I have a separate project on incrementally resizing open-addressed hash tables which isn't ready to publish yet). I plan to implement IntInt , LongLong , IntLong , LongInt maps at some point. Eventually I want to implement a generic Java hash table using bidirectional linear probing and mapping hash codes to offsets in an array of object references (similar to CPython's dict implementation), but it will likely be a while before I have time for this.

Additionally, I've collected a number of 32- and 64-bit invertible hash functions which may be of independent interest (I had to calculate most of the inverses myself, using the scripts https://github.com/senderista/hashtable-benchmarks/blob/master/inverse32.py and https://github.com/senderista/hashtable-benchmarks/blob/master/inverse64.py ). The implementations can be found in the https://github.com/senderista/hashtable-benchmarks/tree/master/src/main/java/hash/int32 and https://github.com/senderista/hashtable-benchmarks/tree/master/src/main/java/hash/int64 directories. (I started work on a cryptographically strong invertible hash function based on the Speck cipher as a simulation baseline, but didn't complete it, since simulation results didn't seem as practically important as performance results.)

All implementations are in Java, and are benchmarked using the JMH benchmark framework.

Javadoc (with extensive documentation of the algorithms involved) is here: https://senderista.github.io/hashtable-benchmarks/ .

Benchmark results are here:

https://github.com/senderista/hashtable-benchmarks/wiki/32-bit-benchmarks

https://github.com/senderista/hashtable-benchmarks/wiki/64-bit-benchmarks

hashtable-benchmarks

An Evaluation of Linear Probing Hashtable Algorithms

Recommend

Java At Microsoft: From OpenJDK to Azure Spring Cloud | Java at Microsoft

Announcing Visual Studio Online Public Preview | Visual Studio Blog

Take your Node back end to the next level with NestJS - LogRocket Blog

Persistence – PowerShell Profile

受Facebook Libra启发欧盟拟发行公共数字货币

求推荐 1 万左右的耳机和前端

Detecting Stop Signs and Traffic Signals: Deep Learning at Lyft Mapping

A Gentle Introduction to Markov Chain Monte Carlo for Probability

中国新增光伏装机量同比下降 45%

为什么应该推行四天工作制

About Joyk