Hashing Basics
Learn how hash functions convert keys into array indices for fast lookups.
A hash function takes an input (key) and returns a fixed-size integer called a hash code. This hash code is then used to determine where to store the data in an array.
key → hashCode() → bucket index
Good hash functions distribute keys evenly across buckets, minimizing collisions.
In Java, every object has a hashCode() method. For Strings, it's calculated using character values:
The multiplier 31 is chosen because it's an odd prime, which helps distribute hash codes more evenly.
Try these examples:
Hash Code to Index
The bucket index is calculated using the modulo operation:
This ensures the index is always within [0, capacity-1].
Collisions
When two different keys hash to the same index, we have a collision.
Java's HashMap handles collisions using chaining — storing multiple entries in the same bucket as a linked list.
Hash codes can be very large numbers (up to 2^31 - 1 in Java). We can't create an array that large! The modulo operation maps any hash code to a valid array index.
Different capacities will give different indices for the same key. This is why rehashing (changing capacity) requires recalculating all indices.