hashtable-benchmarks

An Evaluation of Linear Probing Hashtable Algorithms (by senderista)


hashtable-benchmarks reviews and mentions

Posts with mentions or reviews of hashtable-benchmarks. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-20.
  • Building a faster hash table for high performance SQL joins
    3 projects | news.ycombinator.com | 20 Dec 2023
    Since the blog post mentioned a PR to replace linear probing with Robin Hood, I just wanted to mention that I found bidirectional linear probing to outperform Robin Hood across the board in my Java integer set benchmarks:

    https://github.com/senderista/hashtable-benchmarks/blob/mast...

    https://github.com/senderista/hashtable-benchmarks/wiki/64-b...

  • Ask HN: Who wants to be hired? (December 2023)
    26 projects | news.ycombinator.com | 1 Dec 2023
    https://homes.cs.washington.edu/~magda/papers/wang-cidr17.pd...

    I'm most interested in developing high-performance database engines in low-level languages, but open to any challenging systems programming project. I've been working in C++ for the last 3 years, but have written nontrivial projects in Rust and Java as well (e.g., https://github.com/senderista/rotated-array-set, https://github.com/senderista/hashtable-benchmarks). I would enjoy using Rust or Zig on a new project, but I consider the project itself to be much more important than the language it's written in. I am not interested in cryptocurrency, adtech, or fintech projects.

  • Factor is faster than Zig
    11 projects | news.ycombinator.com | 10 Nov 2023
    Thanks for the details on your benchmarks. I would like sometime to extend BLP to a more generic setting; as I said I think any trick used with RH would also work with BLP. I just used an integer set because that's all I needed for my use case and it was easy to implement several different approaches for benchmarking. As you note, it favors use cases where the hash function is cheap (or invertible) and elements are cheap to move around.

    About your question on load factors: no, the benchmarks are measuring exactly what they claim to be. The hash table constructor divides max data size by load factor to get the table size (https://github.com/senderista/hashtable-benchmarks/blob/mast...), and the benchmark code instantiates each hash table for exactly the measured data set size and load factor (https://github.com/senderista/hashtable-benchmarks/blob/mast...).
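The sizing rule described above (maximum data size divided by load factor) can be sketched as follows. This is a minimal illustration, not the repository's actual constructor: the class name `TableSizing`, the method `tableSize`, and the choice to round up with `Math.ceil` are all assumptions made for the example.

```java
class TableSizing {
    // Hypothetical helper: number of slots needed so that maxEntries elements
    // fill the table to at most the given load factor.
    // Rounding up is an assumption; the original code may round differently.
    static int tableSize(int maxEntries, double loadFactor) {
        if (maxEntries < 0 || loadFactor <= 0.0 || loadFactor > 1.0) {
            throw new IllegalArgumentException("invalid size or load factor");
        }
        // toIntExact throws ArithmeticException if the result does not fit in an int.
        return Math.toIntExact((long) Math.ceil(maxEntries / loadFactor));
    }

    public static void main(String[] args) {
        // One million elements at load factor 0.5 need two million slots.
        System.out.println(tableSize(1_000_000, 0.5));
    }
}
```

Sized this way, a table built for exactly the measured data set size reaches the stated load factor once every element has been inserted, which is the property the benchmark relies on.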

    I can't explain the peaks around 1M in many of the plots; I didn't investigate them at the time and I don't have time now. It could be a JVM artifact, but I did try to use JMH "best practices", and there's no dynamic memory allocation or GC happening during the benchmark at all. It would be interesting to port these tables to Rust and repeat the measurements with Criterion. For more informative graphs I might try a log-linear approach: divide the intervals between the logarithmically spaced data sizes into a fixed number of subintervals (say 4).
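The log-linear idea mentioned above might look like the sketch below: take logarithmically spaced data sizes and split each interval between two consecutive sizes into a fixed number of equal subintervals. The class and method names, the starting size of 1000, and the base of 10 are hypothetical; the post does not say which spacing the benchmarks use.

```java
import java.util.ArrayList;
import java.util.List;

class LogLinearSizes {
    // Hypothetical helper: starting at `first`, take `steps` intervals whose
    // endpoints grow by a factor of `base`, and split each interval into
    // `subintervals` equal parts. Every endpoint appears exactly once.
    static List<Long> logLinearSizes(long first, int base, int steps, int subintervals) {
        if (first < 1 || base < 2 || steps < 0 || subintervals < 1) {
            throw new IllegalArgumentException("invalid parameters");
        }
        List<Long> sizes = new ArrayList<>();
        long lo = first;
        for (int s = 0; s < steps; s++) {
            long hi = Math.multiplyExact(lo, (long) base);
            // Linear spacing inside the interval [lo, hi); hi starts the next interval.
            for (int i = 0; i < subintervals; i++) {
                sizes.add(lo + (hi - lo) * i / subintervals);
            }
            lo = hi;
        }
        sizes.add(lo); // final endpoint
        return sizes;
    }

    public static void main(String[] args) {
        // Two decades starting at 1000, each split into 4 subintervals.
        System.out.println(logLinearSizes(1000, 10, 2, 4));
        // prints [1000, 3250, 5500, 7750, 10000, 32500, 55000, 77500, 100000]
    }
}
```

With 4 subintervals per interval, the extra points would show whether a peak near 1M is a narrow spike or a broader trend, at the cost of roughly four times as many benchmark runs.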

  • Inside boost::unordered_flat_map
    11 projects | /r/cpp | 18 Nov 2022
    I think "bidirectional linear probing" is an underrated approach (and much simpler): https://github.com/senderista/hashtable-benchmarks/blob/master/src/main/java/set/int64/BLPLongHashSet.java
  • A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
    5 projects | /r/cpp | 4 Jul 2022
    I will probably never get around to porting my bidirectional linear probing integer hash set from Java to C++, but I hope someone can try adapting BLP to general C++ hashmaps and hashsets, because it significantly outperforms Robin Hood in my benchmarks.
  • Ask HN: Who wants to be hired? (March 2022)
    14 projects | news.ycombinator.com | 1 Mar 2022
    https://homes.cs.washington.edu/~magda/papers/wang-cidr17.pd...

    I'm most interested in developing high-performance database engines in low-level languages, but open to any challenging systems programming project. I've been working in C++ for the last 2 years, but have written nontrivial projects in Rust and Java as well (e.g., https://github.com/senderista/rotated-array-set, https://github.com/senderista/hashtable-benchmarks). I would enjoy using Rust or Zig on a new project, but I consider the project itself to be much more important than the language it's written in. I am not interested in cryptocurrency, adtech, or fintech projects.


Stats

Basic hashtable-benchmarks repo stats
8
29
4.7
6 months ago
