Normalization is a process in database design that helps organize data efficiently, reduce redundancy, and ensure data integrity. It involves applying a set of rules called normal forms to structure the database schema. Let’s explore the key concepts and principles of normalization and database design.
Table of Contents
ToggleFirst Normal Form (1NF)
First Normal Form eliminates repeating groups and ensures that each attribute contains only atomic (indivisible) values. It eliminates repeating groups and ensures that each attribute contains only one value.
-- Before normalization (not in 1NF)
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
order1_product VARCHAR(50),
order1_quantity INT,
order2_product VARCHAR(50),
order2_quantity INT
);
-- After normalization (in 1NF)
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
product VARCHAR(50),
quantity INT,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
As seen in example above, after 1NF normalization code is much cleaner and does not include unnecessary repetitions.
Second Normal Form (2NF)
Second Normal Form eliminates partial dependencies by ensuring that each non-key column depends on the entire primary key.
-- Before normalization (not in 2NF)
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
customer_name VARCHAR(100),
product VARCHAR(50),
quantity INT
);
-- After normalization (in 2NF)
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
product VARCHAR(50),
quantity INT,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
After 2NF normalization, table “orders” does not include unnecessary “customer_name” column, but refers to it in the “customers” table using foreign key.
Third Normal Form (3NF)
Third Normal Form eliminates transitive dependencies by ensuring that non-key attributes depend only on the primary key or other non-key attributes.
-- Before normalization (not in 3NF)
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(100),
department_name VARCHAR(100),
department_location VARCHAR(100)
);
-- After normalization (in 3NF)
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(100),
department_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100),
department_location VARCHAR(100)
);
Normalization beyond 3NF
Beyond 3NF, additional normal forms like Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF), and Fifth Normal Form (5NF) address more complex dependency issues. These forms focus on further eliminating redundancy and ensuring data consistency.
It’s important to note that normalization is an iterative process. As the database evolves and requirements change, you may need to revisit and adjust the schema accordingly.
By following normalization principles, you can design a database that is efficient, flexible, and maintains data integrity. Normalization helps optimize data storage, reduce redundancy, and improve query performance. It’s a fundamental concept in database design and is essential for building robust and scalable database systems.