Lecture 16

Hashing
Hash table and hash function design
Hash functions for integers and strings
Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining
Hash table cost functions

Reading: Weiss Ch. 5

Finding data fast

Hashing

Probability of collisions

Average total number of collisions

Hash table collisions and the "birthday paradox"

The birthday collision "paradox"

Making hashing work

Hash table size

Hash functions: desiderata

Hash functions for integers: H(K) = K mod M
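
As an illustration of H(K) = K mod M, here is a minimal Java sketch. The class name `ModHash` and the method `hash` are hypothetical, chosen only for this example. One assumption goes beyond the slide title: keys may be negative, in which case Java's `%` operator can return a negative value, so the sketch uses `Math.floorMod` to keep the result in the range 0 to M-1.

```java
class ModHash {
    // Hypothetical helper: maps an int key to a table index in [0, m).
    // Math.floorMod is used instead of % so negative keys still
    // produce a valid, non-negative index.
    static int hash(int key, int m) {
        return Math.floorMod(key, m);
    }

    public static void main(String[] args) {
        System.out.println(hash(701, 7)); // prints 1
        System.out.println(hash(-3, 7));  // prints 4
        System.out.println(-3 % 7);       // prints -3, not a valid index
    }
}
```

For non-negative keys, `Math.floorMod(key, m)` and `key % m` give the same result.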

Hash functions for integers: random functions

Hash functions for strings

String hash function #1

String hash function #2

String hash function #2: Java code

String hash function #3

Hash functions in other contexts

Using a hash function

Collision resolution strategies

Linear probing: inserting a key

Linear probing, an example

Linear probing: searching for a key

Double hashing

Random hashing

Open addressing vs. separate chaining

Analysis of open-addressing hashing

Average case unsuccessful find / insertion cost

Average case successful find cost

Separate chaining: basic algorithms

Separate chaining, an example

M = 7, H(K) = K mod M
Insert the keys 701, 145, 217, 19, 13, 749, in that order,
into the table, using separate chaining.
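
The example above can be worked through with a short Java sketch. The class name `SeparateChainingDemo` and the method `build` are hypothetical. Two assumptions are not stated in the outline: the table starts empty, and each new key is appended to the end of its chain (inserting at the front would reverse the order within each chain).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class SeparateChainingDemo {
    // Builds a table of m chains and appends each key to the chain
    // at index (key mod m). Assumes an initially empty table.
    static List<LinkedList<Integer>> build(int m, int[] keys) {
        List<LinkedList<Integer>> table = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            table.add(new LinkedList<>());
        }
        for (int k : keys) {
            // Assumption: new keys go at the end of the chain.
            table.get(Math.floorMod(k, m)).addLast(k);
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = {701, 145, 217, 19, 13, 749};
        List<LinkedList<Integer>> table = build(7, keys);
        for (int i = 0; i < table.size(); i++) {
            System.out.println(i + ": " + table.get(i));
        }
        // Prints:
        // 0: [217, 749]
        // 1: [701]
        // 2: []
        // 3: []
        // 4: []
        // 5: [145, 19]
        // 6: [13]
    }
}
```

Keys 217 and 749 collide at index 0, and keys 145 and 19 collide at index 5. With separate chaining, colliding keys share a chain instead of probing for another slot.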

Analysis of separate-chaining hashing

Average case costs with separate chaining

Dictionary data types

Dictionary as ADT

Implementing the Dictionary ADT

Hash tables vs. balanced search trees

Hash tables vs. balanced search trees, cont’d

Next time