Data Structures & Algorithms28 min readExpert

Heaps, Tries & Union-Find

The toolkit that separates intermediate from expert: priority queues, prefix trees, and disjoint-set union.

🏥
Real-life analogy — A heap = an ER triage queue
Patients arrive in random order, but the next one called isn't the longest-waiting — it's the most critical. The triage nurse keeps the most urgent case at the 'top' of the queue, regardless of arrival order. That's a priority queue.

Heaps (priority queues)

A heap is a tree-shaped data structure where every parent is less than or equal to its children (min-heap) — or greater (max-heap). The smallest (or largest) element is always at the root, available in O(1). Insert and remove-min are O(log n).

Heaps are how you implement a priority queue: a queue where the highest-priority item comes out first, regardless of insertion order. They power Dijkstra's shortest-path algorithm, A*, top-K queries, scheduling, and more.

import heapq

# Top-K largest with a min-heap of size k
def top_k(nums, k):
    h = []
    for n in nums:
        heapq.heappush(h, n)
        if len(h) > k:
            heapq.heappop(h)
    return sorted(h, reverse=True)

print(top_k([3, 1, 4, 1, 5, 9, 2, 6], 3))   # [9, 6, 5]

Why a min-heap of size k? Once it's full, the smallest of the candidates is at the root. Each new element either goes into the heap (and the new minimum is dropped) or is dropped immediately. O(n log k) — way better than O(n log n) sorting when k is small.

Dijkstra's shortest path (sketch)

import heapq

def dijkstra(graph, start):
    # graph: dict[node] = list of (neighbor, weight)
    dist = {start: 0}
    pq = [(0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")): continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

Tries (prefix trees)

A trie is a tree where each path from the root spells out a string. Insert and lookup are O(L), where L is the length of the string — independent of the number of stored strings. Tries shine for autocomplete, spell checking, longest-prefix matching, and IP routing.

class Trie:
    def __init__(self):
        self.children = {}
        self.end = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.end = True

    def search(self, word):
        node = self
        for ch in word:
            if ch not in node.children: return False
            node = node.children[ch]
        return node.end

    def starts_with(self, prefix):
        node = self
        for ch in prefix:
            if ch not in node.children: return False
            node = node.children[ch]
        return True

t = Trie()
for w in ["apple", "app", "apt"]:
    t.insert(w)
print(t.search("app"), t.starts_with("ap"))   # True True

Union-Find (Disjoint Set Union, DSU)

Union-Find tracks a partition of n elements into disjoint sets, with two operations: `find(x)` (which set is x in?) and `union(a, b)` (merge a's set with b's set). With path compression and union by rank, both run in nearly O(1) amortized — formally, O(α(n)), the inverse Ackermann function, which is at most 4 for any n in the universe.

Used for: connected components in a graph, Kruskal's MST algorithm, detecting cycles in undirected graphs, network connectivity over time, and many tile/grid problems.

class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression: flatten as we go
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb: return False
        # Union by rank — attach shorter tree under taller
        if self.rank[ra] < self.rank[rb]: ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

# Example: count connected components after a stream of edges
dsu = DSU(5)
for a, b in [(0, 1), (1, 2), (3, 4)]:
    dsu.union(a, b)
roots = {dsu.find(i) for i in range(5)}
print(len(roots))   # 2
💡 Tip
These three structures (heap, trie, union-find) cover most \"hard\" interview problems. Once you can recognize when to use each, problems that looked impossible will look easy.