Ordered hashtable

1. Hashtable

A hashtable is a solution for associating keys with values. Its success comes from its speed.

To work, a hashtable computes a hashcode for each key. This hashcode is used to designate the location where the key-value pair is stored in the table. To disperse the data inside the table, the hashcode is a quasi-random number that varies enormously from one key to another. The location of a key-value pair:

corresponds to the hashcode,
does not match the keys.

By traversing the successive positions of the table, one finds all the data but in the random order of the hashcode. A hashtable can not restore the order of keys.

2. Index64

Index64 places the data in a structure equivalent to a binary tree. Instead of computing a hashcode, the Index64 algorithm determines the exact location of each new key-value pair in the binary tree. Browsing the internal structure makes it possible to restore all the data in the order of the keys!

Moreover, whatever the modifications carried out by additions or deletions, the structure used is permanently balanced. From this point of view the internal structure of Index64 is slightly different from a binary tree.

However, Index64 is as fast as a hashtable. See benchmark of Embedded Index64 → …

3. Benefits of data sorting on the fly

Ordered keys are especially useful when:

keys represent time
keys represent amounts

Some operations become possible:

Sorts are available without further treatment.
Intervals are obtained without enforcing a specific structure like a sorted set.

Many operations become easier and faster:

Intervals are obtained in O(log(N) + n). N is the table size. n is the interval size.
The set operations (union, intersection and difference) are easier.
Joins are easier, allowing more data connections always in order.

Conclusion

Index64 succeeds the feat of being as fast as a hashtable while keeping keys order. This is true for both the embedded version and the key value store version of Index64.

	Index64	Hashtable
Keys order	YES	NO
Performances	Same performances
Balanced tree	Always balanced tree	NA

1. Hashtable

2. Index64

3. Benefits of data sorting on the fly

Conclusion

BROWSE BY

CATEGORIES