Database Concepts – Normalization and Normal Forms
In the second post on Database Concepts, we will quickly go over the concept of Normalization and normal forms.
Normalization is the process of organizing data to remove redundancy and to create well-structured and optimized relations or tables. It is usually done by breaking down large tables into smaller ones and defining relationships between them.
E.F.Codd introduced the 1NF, 2NF and 3NF. This was followed by the Boyce-Codd Normal Form (BCNF) and higher normal forms until the 6NF. However, a database does not necessarily need to go all the way to 6 NF to be normalized. If a database satisfies the 3NF, it is considered a normalized database. In fact, in some cases it may actually be beneficial not to go all the way to 6NF, especially depending on the usage of the database for queries.
The first normal form requires that the table have no repeating groups. All attributes in the table must be atomic. Any repeating data should be moved to a secondary table with relationship defined to the first one. The implication of this requirement is that there is no top-down ordering of rows in the table (so you can move rows around without any implications), no left-to-right ordering of columns (so you can change the column ordering without any implications) in a table, no duplicate rows (redundant data is not allowed) and each row-column has exactly one value.
The second normal form requires that the table is in 1NF and additionally all non-key attributes are fully dependent on the whole of the table’s candidate keys. No non-key attribute can be dependent on just a part of the candidate keys.
The third normal form requires that the table is in 2NF and additionally all non-key attributes should be fully dependent on the primary key. Any columns that are not dependent on the primary key should be removed into a separate table.
In part II of this post, we will look at an example of how to normalize a database