Normalization in Database Management System

Ritik Goel
6 min readFeb 21, 2021

What is Normalization?

Normalization is a database design technique that reduces data redundancy and eliminates undesirable characteristics like Insertion, Update, and Deletion Anomalies. Normalization rules divide larger tables into smaller tables and link them using relationships.
Without Normalization, we may face many issues such as

  1. Insertion anomaly: It occurs when we cannot insert data into the table without the presence of another attribute.
  2. Update anomaly: It is a data inconsistency that results from data redundancy and a partial update of data.
  3. Deletion Anomaly: It occurs when certain attributes are lost because of the deletion of other attributes.

Database Normal Forms

Normalization will enhance the distribution of data. Now let’s understand each and every Normal Form with examples.

Types of Normal Forms

1st Normal Form (1NF)

In this Normal Form, we tackle the problem of atomicity. Here atomicity means values in the table should not be further divided. In simple terms,

  • Each table cell should contain a single value.
  • Each record needs to be unique.

If a table contains a composite or multi-valued attribute, it violates the First Normal Form.

In the above table, the Phone Number column contains multiple values so it violates the rule of 1NF. Now if we apply the 1st NF to the above table we get the below table as the result.

Now the above table satisfies the rules to be in 1NF.

2nd Normal Form (2NF)

A relational table is considered in the second normal form if it is in the first normal form and that every non-key column is fully dependent upon the primary key means the table should not contain partial dependency.
To understand in a better way let's look at the below example.

This table has a composite primary key Employee Id and Department Id. The non-key attribute is Office Location. In this case, Office Location only depends on Department Id, which is only part of the primary key. Therefore, this table does not satisfy the second Normal Form.

To bring this table to Second Normal Form, we need to break the table into two parts. This will give us the below tables:

We have eliminated the partial functional dependence that we initially had, as you can see. Now, the Office Location column in the table is fully dependent on the table’s primary key, which is the Department ID.
Now the above tables satisfy the rules to be in 2NF.

3rd Normal Form (3NF)

A relational table is considered in the third normal form if all columns in the table are dependent only upon the primary key.

  • Rule 1- Be in 2NF
  • Rule 2- Has no transitive functional dependencies.

What are transitive functional dependencies?

A transitive functional dependency is when changing a non-key column, which might cause any of the other non-key columns to change.

Let’s understand this more clearly with the help of an example:

In the above table, Student ID determines Subject ID, and Subject ID determines Subject. Therefore, Student ID determines Subject via Subject ID. This implies that we have a transitive functional dependency, and this structure does not satisfy the third normal form.
Now in order to achieve the third normal form, we need to divide the table as shown below:

As you can see from the above tables all the non-key attributes are now fully functional dependent only on the primary key. In the first table, columns Student Name, the Subject ID, and Address are only dependent on Student ID. In the second table, the Subject is only dependent on the Subject ID.
Now the above tables satisfy the rules to be in 3NF.

Boyce Codd Normal Form (BCNF)

This is also known as 3.5 NF. This is a more robust version of 3NF that occurs only under specific circumstances. There must be multiple candidate keys, one of the keys must be composite, and the candidate keys must overlap. In order to normalize the relation, the developer must pick a determinant on which one column is fully functionally dependent. Then he must create a second relation so that every determinant is a candidate key.

Before proceeding to BCNF the table has to satisfy the 3rd Normal Form.

In BCNF if every functional dependency A → B, then A has to be the Super Key of that particular table.

  • One student can enroll for multiple subjects.
  • There can be multiple professors teaching one subject
  • And, For each subject, a professor is assigned to the student

In this table, all the normal forms are satisfied except BCNF. Why?

As you can see Student Id and Subject form the primary key, which means the Subject column is a prime attribute. But, there is one more dependency, Professor → Subject.

And while Subject is a prime attribute, Professor is a non-prime attribute, which is not allowed by BCNF.

Now in order to satisfy the BCNF, we will be dividing the table into two parts. One table will hold Student Id which already exists and a newly created column Professor Id.

And in the second table, we will have the columns Professor Id, Professor and Subject.

By doing this we are satisfied with the Boyce Codd Normal Form.

Fourth Normal Form (4NF)

Issues with 4NF occur when a relationship is in BCNF and all multi-valued dependencies are also functional dependencies. For example, “Suppose that employees can be assigned to multiple projects. Also, suppose that employees can have multiple job skills. If we record this information in a single table, all three attributes must be used as the key since no single attribute can uniquely identify an instance” (UTA). This would create anomalies that would fit under update, insert, and delete functions. In order to solve this problem, simply move each multi-valued dependency to its own table.

Fifth Normal Form (5NF)

NF addresses project-join dependency issues within tables that have been decomposed. During the normalization process, large tables are broken into smaller cohesive tables. Usually, a table in 4NF is considered to be in 5NF, but sometimes abstract insertion anomalies may hinder the join process of a decomposed table. If a table cannot be decomposed further, it is said to be in 5NF.

Conclusion

Database designing is critical to the successful implementation of a database management system that meets the data requirements of an enterprise system. The DBMS Normalization Process helps to generate cost-effective and better security models for database systems. Functional dependencies are a very critical component of the normalized data process. Most database systems are normalized databases up to the third normal form.

--

--