Heaps, Tries & Union-Find
The toolkit that separates intermediate from expert: priority queues, prefix trees, and disjoint-set union.
Heaps (priority queues)
A heap is a tree-shaped data structure where every parent is less than or equal to its children (min-heap) — or greater (max-heap). The smallest (or largest) element is always at the root, available in O(1). Insert and remove-min are O(log n).
Heaps are how you implement a priority queue: a queue where the highest-priority item comes out first, regardless of insertion order. They power Dijkstra's shortest-path algorithm, A*, top-K queries, scheduling, and more.
import heapq
# Top-K largest with a min-heap of size k
def top_k(nums, k):
h = []
for n in nums:
heapq.heappush(h, n)
if len(h) > k:
heapq.heappop(h)
return sorted(h, reverse=True)
print(top_k([3, 1, 4, 1, 5, 9, 2, 6], 3)) # [9, 6, 5]Why a min-heap of size k? Once it's full, the smallest of the candidates is at the root. Each new element either goes into the heap (and the new minimum is dropped) or is dropped immediately. O(n log k) — way better than O(n log n) sorting when k is small.
Dijkstra's shortest path (sketch)
import heapq
def dijkstra(graph, start):
# graph: dict[node] = list of (neighbor, weight)
dist = {start: 0}
pq = [(0, start)]
while pq:
d, u = heapq.heappop(pq)
if d > dist.get(u, float("inf")): continue
for v, w in graph[u]:
nd = d + w
if nd < dist.get(v, float("inf")):
dist[v] = nd
heapq.heappush(pq, (nd, v))
return distTries (prefix trees)
A trie is a tree where each path from the root spells out a string. Insert and lookup are O(L), where L is the length of the string — independent of the number of stored strings. Tries shine for autocomplete, spell checking, longest-prefix matching, and IP routing.
class Trie:
def __init__(self):
self.children = {}
self.end = False
def insert(self, word):
node = self
for ch in word:
node = node.children.setdefault(ch, Trie())
node.end = True
def search(self, word):
node = self
for ch in word:
if ch not in node.children: return False
node = node.children[ch]
return node.end
def starts_with(self, prefix):
node = self
for ch in prefix:
if ch not in node.children: return False
node = node.children[ch]
return True
t = Trie()
for w in ["apple", "app", "apt"]:
t.insert(w)
print(t.search("app"), t.starts_with("ap")) # True TrueUnion-Find (Disjoint Set Union, DSU)
Union-Find tracks a partition of n elements into disjoint sets, with two operations: `find(x)` (which set is x in?) and `union(a, b)` (merge a's set with b's set). With path compression and union by rank, both run in nearly O(1) amortized — formally, O(α(n)), the inverse Ackermann function, which is at most 4 for any n in the universe.
Used for: connected components in a graph, Kruskal's MST algorithm, detecting cycles in undirected graphs, network connectivity over time, and many tile/grid problems.
class DSU:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
# Path compression: flatten as we go
while self.parent[x] != x:
self.parent[x] = self.parent[self.parent[x]]
x = self.parent[x]
return x
def union(self, a, b):
ra, rb = self.find(a), self.find(b)
if ra == rb: return False
# Union by rank — attach shorter tree under taller
if self.rank[ra] < self.rank[rb]: ra, rb = rb, ra
self.parent[rb] = ra
if self.rank[ra] == self.rank[rb]:
self.rank[ra] += 1
return True
# Example: count connected components after a stream of edges
dsu = DSU(5)
for a, b in [(0, 1), (1, 2), (3, 4)]:
dsu.union(a, b)
roots = {dsu.find(i) for i in range(5)}
print(len(roots)) # 2