Double hashing
-
Resolving collisions by linear probing leads to clusters in the table, because if two keys collide, the next position probed will be the same for both of them.
-
The idea of double hashing: make the offset to the next position probed depend on the key value, so that it can be different for different keys.
-
Need to introduce a second hash function H2(K), which is used as the offset in the probe sequence (think of linear probing as double hashing with H2(K) == 1).
-
For a hash table of size M, H2(K) should have values in the range 1 through M-1; if M is prime, one common choice is H2(K) = 1 + ( (K/M) mod (M-1) ).
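
As a rough illustration of the formula above, here is a minimal Python sketch. It assumes non-negative integer keys, reads K/M as integer division, and takes H(K) = K mod M as the primary hash; the notes do not specify H, so that choice, the table size 11, and the names h1, h2 and probe_sequence are assumptions made only for this example.

```python
M = 11  # table size; a small prime chosen only for illustration


def h1(key: int, m: int = M) -> int:
    """Primary hash H(K): the home position. K mod M is an assumed choice."""
    return key % m


def h2(key: int, m: int = M) -> int:
    """Secondary hash H2(K) = 1 + ((K / M) mod (M - 1)), with integer division."""
    return 1 + ((key // m) % (m - 1))


def probe_sequence(key: int, m: int = M) -> list[int]:
    """The first m positions probed for key: home, home + offset, ... (mod m)."""
    start, offset = h1(key, m), h2(key, m)
    return [(start + i * offset) % m for i in range(m)]


# Keys 5 and 16 collide at home position 5 but use different offsets (1 and 2).
print(probe_sequence(5)[:4])   # [5, 6, 7, 8]
print(probe_sequence(16)[:4])  # [5, 7, 9, 0]
```

Under these assumptions the two keys share a home position, yet their probe sequences separate after the first probe, which is the behavior that linear probing cannot provide.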
-
The insert algorithm for double hashing is then:
1. Set indx = H(K); offset = H2(K).
2. If table location indx already contains the key, no need to insert it. Done!
3. Else if table location indx is empty, insert key there. Done!
4. Else collision. Set indx = (indx + offset) mod M.
5. If indx == H(K), table is full! (Throw an exception, or enlarge table.) Else go to 2.
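
The five steps above can be sketched in Python roughly as follows. This is only one possible rendering: it assumes non-negative integer keys, a Python list with None marking empty locations, H(K) = K mod M as the primary hash, and a hypothetical TableFullError exception for the "table is full" case. None of these names come from the notes.

```python
class TableFullError(Exception):
    """Hypothetical exception: the probe sequence returned to its start."""


def h1(key, m):
    return key % m  # assumed primary hash H(K)


def h2(key, m):
    return 1 + ((key // m) % (m - 1))  # H2(K), with K/M as integer division


def insert(table, key):
    """Insert key into table (a list, None = empty); return the key's index."""
    m = len(table)
    home = h1(key, m)
    indx, offset = home, h2(key, m)    # step 1
    while True:
        if table[indx] == key:         # step 2: key already present
            return indx
        if table[indx] is None:        # step 3: empty location, store key
            table[indx] = key
            return indx
        indx = (indx + offset) % m     # step 4: collision, move by offset
        if indx == home:               # step 5: back at the start, give up
            raise TableFullError(f"no free location for key {key}")


table = [None] * 7                     # prime table size
for k in (3, 10, 17, 24):              # all four keys have home position 3
    insert(table, k)
print(table)
```

In this sketch the four keys all hash to location 3, but because their offsets differ (1, 2, 3 and 4) each collision is resolved with a single extra probe. A real table would more likely enlarge itself than raise when it fills.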
-
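With a prime table size, every offset in the range 1 through M-1 is relatively prime to M, so each probe sequence visits every table location before returning to its start; double hashing then works very well in practice.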
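
To see why a prime table size matters, the following small sketch counts how many distinct locations a probe sequence reaches before it returns to its starting point. The helper slots_visited and the sizes 11 and 12 are chosen purely for illustration. In general a sequence with a given offset reaches M / gcd(offset, M) locations.

```python
from math import gcd


def slots_visited(start, offset, m):
    """Count distinct locations probed before the sequence repeats."""
    seen = set()
    indx = start
    while indx not in seen:
        seen.add(indx)
        indx = (indx + offset) % m
    return len(seen)


for m in (11, 12):
    counts = [slots_visited(0, off, m) for off in range(1, m)]
    # Each count equals m // gcd(offset, m).
    assert counts == [m // gcd(off, m) for off in range(1, m)]
    print(m, min(counts))  # prints "11 11", then "12 2"
```

With M = 11 every offset reaches all 11 locations, whereas with M = 12 an offset of 6 only alternates between two locations, so step 5 of the insert algorithm could report a full table while free locations remain.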