Hash Maps & Sets
How hash tables work — and why they turn O(n²) problems into O(n) ones.
What is a hash map?
A hash map (hash table, dict, dictionary, associative array) stores key-value pairs and provides expected O(1) insert, lookup, and delete. The trick: a hash function turns each key into an integer, which becomes an index into a backing array.
Under the hood
- Backing array of buckets, often ~75% full.
- Insert(k, v): compute hash(k), mod by array size to get a bucket, place (k, v) there.
- Two keys hash to the same bucket → COLLISION. Resolved by chaining (linked list per bucket) or open addressing (probe to next empty slot).
- When the array gets too full, allocate a bigger one and rehash all entries — amortized O(1).
When the hash function distributes keys evenly, lookups are O(1). When it doesn't (or an adversary picks pathological keys), they degrade to O(n) — every key in one bucket. Modern languages use cryptographically randomized hashes to prevent this.
The classic pattern: turn O(n²) into O(n)
# O(n^2) brute force
def two_sum_brute(nums, target):
for i in range(len(nums)):
for j in range(i + 1, len(nums)):
if nums[i] + nums[j] == target:
return [i, j]
# O(n) with a hash map
def two_sum(nums, target):
seen = {} # value -> index
for i, n in enumerate(nums):
complement = target - n
if complement in seen:
return [seen[complement], i]
seen[n] = i
return []
print(two_sum([2, 7, 11, 15], 9)) # [0, 1]The pattern: as you scan, store what you've seen; for each new element, check if its complement is already there. One pass, O(n) extra space.
Sets
A set is a hash map without the values — only keys. Use it whenever the question is "have I seen this?" or for set arithmetic.
# Detect duplicates in a list
def has_duplicate(nums):
return len(set(nums)) != len(nums)
# First non-repeating character
from collections import Counter
def first_unique(s):
counts = Counter(s)
for ch in s:
if counts[ch] == 1:
return ch
return None
print(first_unique("leetcode")) # \"l\"When NOT to use a hash map
- When you need ORDERED iteration by key — use a tree-based map (Java's TreeMap, C++'s std::map).
- When keys are small integers in a tight range — a plain array indexed by key is faster.
- When you need to iterate in INSERTION order — Python dicts and JS Maps are fine for this; standard sets are not always.