Databases•22 min read•Intermediate
Schema Design & Normalization
1NF, 2NF, 3NF — and the practical guideline that covers 95% of real-world tables.
Why normalize?
Normalization is the process of structuring tables so that each piece of information is stored in EXACTLY ONE PLACE. The benefit: updates only need to happen in one row, and inconsistencies become impossible.
An un-normalized table
json
orders
+----+--------+--------------+----------+----------+
| id | item | customer_name | email | city |
+----+--------+--------------+----------+----------+
| 1 | Apple | Ada | a@x.com | Bristol |
| 2 | Pear | Ada | a@x.com | Bristol |
| 3 | Apple | Bob | b@x.com | London |
+----+--------+--------------+----------+----------+
-- Ada's email lives in 2 rows. Update one and forget the other → inconsistent.
-- Move Ada to a new city? Now you must update every order row.Normalized into two tables
json
customers
+----+------+---------+----------+
| id | name | email | city |
+----+------+---------+----------+
| 1 | Ada | a@x.com | Bristol |
| 2 | Bob | b@x.com | London |
+----+------+---------+----------+
orders
+----+----------+--------+
| id | cust_id | item |
+----+----------+--------+
| 1 | 1 | Apple |
| 2 | 1 | Pear |
| 3 | 2 | Apple |
+----+----------+--------+The practical rule
If you find yourself updating the same value in multiple rows, you should probably split that data into a separate table linked by a foreign key. That's 90% of normalization in one sentence.
ℹ Note
Sometimes you DENORMALIZE for performance — duplicating data to avoid expensive joins. That's a deliberate, eyes-open tradeoff: faster reads, more write complexity. Don't denormalize before you have a measured performance problem.