
PTHash: Revisiting FCH Minimal Perfect Hashing
Given a set S of n distinct keys, a function f that bijectively maps the...

RecSplit: Minimal Perfect Hashing via Recursive Splitting
A minimal perfect hash function bijectively maps a key set S out of a un...

Constructing Minimal Perfect Hash Functions Using SAT Technology
Minimal perfect hash functions (MPHFs) are used to provide efficient acc...

Property-Preserving Hash Functions from Standard Assumptions
Property-preserving hash functions allow for compressing long inputs x_0...

Evaluation of a Simple, Scalable, Parallel Best-First Search Strategy
Large-scale, parallel clusters composed of commodity processors are incr...

Practical Hash-based Anonymity for MAC Addresses
Given that a MAC address can uniquely identify a person or a vehicle, co...

Efficient algorithms for collecting the statistics of large-scale IP address data
Compiling the statistics of large-scale IP address data is an essential ...

Parallel and External-Memory Construction of Minimal Perfect Hash Functions with PTHash
A minimal perfect hash function f for a set S of n keys is a bijective function of the form f : S → {0,…,n−1}. These functions are important for many practical applications in computing, such as search engines, computer networks, and databases. Several algorithms have been proposed to build minimal perfect hash functions that scale well to large sets, retain fast evaluation time, and take very little space, e.g., 2–3 bits/key. PTHash is one such algorithm, achieving very fast evaluation in compressed space, typically several times faster than other techniques. In this work, we propose a new construction algorithm for PTHash enabling: (1) multi-threading, to either build functions more quickly or more space-efficiently, and (2) external-memory processing, to scale to inputs much larger than the available internal memory. Only a few other algorithms in the literature share these features, despite their big practical impact. We conduct an extensive experimental assessment on large real-world string collections and show that, with respect to other techniques, PTHash is competitive in construction time and space consumption, but retains 2–6× better lookup time.
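To make the definition concrete, the following minimal Python sketch finds a minimal perfect hash function for a tiny key set by brute-force seed search. It is not the PTHash construction, which the abstract does not detail; the names `seeded_hash` and `build_toy_mphf`, the use of BLAKE2b, and the example keys are all assumptions chosen for illustration.

```python
import hashlib


def seeded_hash(key: str, seed: int, n: int) -> int:
    """Map a key to a slot in {0, ..., n-1} using a seed-dependent hash."""
    data = f"{seed}:{key}".encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") % n


def build_toy_mphf(keys, max_seed=1_000_000):
    """Return the first seed under which the n keys occupy n distinct slots.

    Illustrative brute force only (hypothetical helper, not PTHash):
    the chance that a random seed works is n!/n^n, so this is usable
    only for very small key sets.
    """
    n = len(keys)
    for seed in range(max_seed):
        slots = {seeded_hash(k, seed, n) for k in keys}
        if len(slots) == n:  # no collisions, so the mapping is a bijection
            return seed
    raise RuntimeError("no seed found; key set too large for brute force")


keys = ["apple", "banana", "cherry", "date", "elder"]
seed = build_toy_mphf(keys)
positions = sorted(seeded_hash(k, seed, len(keys)) for k in keys)
```

After the search, `positions` covers every slot from 0 to n−1 exactly once, which is the bijection property stated in the definition. Because the success probability n!/n^n shrinks roughly exponentially in n, practical algorithms avoid searching a single seed for the whole set.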