In creating a database, normalization is the process of organizing data into tables in such a way that the results of using the database are always unambiguous and as intended. Normalization may have the effect of duplicating data within the database and often results in the creation of additional tables. (While normalization tends to increase the duplication of data, such as key values repeated across tables, it does not introduce redundancy, which is unnecessary duplication.)

Normalization is typically a refinement process after the initial exercise of identifying the data objects that should be in the database, identifying their relationships, and defining the tables required and the columns within each table.

A simple example of normalizing data might consist of a table showing:


----------------------------------------------
Customer  |  Item Purchased  |  Purchase Price
----------+------------------+----------------
Thomas    |  Shirt           |            $40
Maria     |  Tennis shoes    |            $35
Evelyn    |  Shirt           |            $40 
Pajaro    |  Trousers        |            $25 

If this table is used for the purpose of keeping track of the price of items and you want to delete one of the customers, you will also delete a price. Normalizing the data would mean understanding this and solving the problem by dividing this table into two tables, one with information about each customer and a product they bought and the second about each product and its price. Making additions or deletions to either table would not affect the other.
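
For instance, the table above might be divided along these lines (the exact column names and layout are only one reasonable choice):

----------------------------
Customer  |  Item Purchased
----------+-----------------
Thomas    |  Shirt
Maria     |  Tennis shoes
Evelyn    |  Shirt
Pajaro    |  Trousers

------------------------
Item          |  Price
--------------+---------
Shirt         |     $40
Tennis shoes  |     $35
Trousers      |     $25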

Several degrees of normalization (normal forms) have been defined for relational database tables; they include:

  • First normal form (1NF). This is the "basic" level of normalization and generally corresponds to the definition of any database, namely: It contains two-dimensional tables with rows and columns. Each column corresponds to a sub-object or an attribute of the object represented by the entire table. Each row represents a unique instance of that object and must differ in some way from every other row (that is, no duplicate rows are possible). All entries in any column must be of the same kind. For example, in the column labeled "Customer," only customer names or numbers are permitted.
  • Second normal form (2NF). At this level of normalization, every column that is not part of the table's key must be a function of the entire key, not of just part of it. For example, in a table with three columns containing customer ID, product sold, and the price of the product when sold, the price is a function of both the customer ID (a customer may be entitled to a discount) and the specific product, so it belongs in that table.
  • Third normal form (3NF). At the second normal form, modification anomalies are still possible, because deleting or changing one row can destroy facts that nothing else in the database records. For example, using the customer table just cited, removing a row describing a customer purchase (because of a return, perhaps) also removes the fact that the product has a certain price. In the third normal form, the table would be divided into two tables so that product pricing is tracked separately.
  • Domain/key normal form (DKNF). A key uniquely identifies each row in a table. A domain is the set of permissible values for an attribute. By enforcing key and domain restrictions, the database is assured of being freed from modification anomalies. DKNF is the normalization level that most designers aim to achieve.
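
To make the key and domain idea a bit more concrete, here is a minimal Python sketch; the table, column names, and domain tests are made up purely for illustration. It checks a key constraint (the key column uniquely identifies each row) and a domain constraint (every value is drawn from its column's set of permissible values):

# A hypothetical customer table; "customer_id" is the key.
rows = [
    {"customer_id": 1, "name": "Thomas"},
    {"customer_id": 2, "name": "Maria"},
]

# Domain constraints: a test for the permissible values of each column.
domains = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "name":        lambda v: isinstance(v, str) and v != "",
}

# Key constraint: no two rows may share the same key value.
assert len({row["customer_id"] for row in rows}) == len(rows)

# Domain constraints: every value must pass its column's test.
for row in rows:
    for column, allowed in domains.items():
        assert allowed(row[column])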

In mathematics:

Normalization is the process of taking an element of a normed linear space and scaling it so that it has a specific norm. Usually one chooses to make the norm equal to one, so, without loss of generality, that will be the case discussed here. In an inner product space, this can be restated as choosing a vector which is parallel to the first one but has a length of 1. In principle this process is quite simple. A fundamental property of the normed linear space is that given an element v and scalar s and denoting the norm of an element w as ||w||:

||s w|| = |s| ||w||

Thus, to normalize v to a new element n, we simply have

n = v/||v||

The constant 1/||v|| is often referred to as the normalization constant.

Ok, so the next question is why would you want to do that? Well, when working with inner product spaces, normalizing the vector gives you a new vector that basically indicates only the direction, so that the inner product of another vector w with n gives the (signed) length of the projection of w onto n, that is, the component of w in the n direction.
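
A short numerical sketch of both steps, using NumPy and two arbitrary example vectors:

import numpy as np

v = np.array([3.0, 4.0])
w = np.array([2.0, 1.0])

# Normalize v: divide it by its norm, so that ||n|| = 1.
n = v / np.linalg.norm(v)          # n = (0.6, 0.8)

# The inner product of w with n is the component of w in the n direction.
component = np.dot(w, n)           # 2.0
projection = component * n         # (1.2, 1.6), the projection of w onto n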

Normalization can also be important in infinite dimensional vector spaces. The set of integrable functions on an interval (the classical Banach space L1) is one example. In this case, the norm of an element f is defined as ||f|| = ∫ |f(x)| dx over the interval of the space. If |f(x)| is supposed to represent the probability distribution for a random variable, then normalization becomes important, because the integral of |f(x)| over the whole interval must be equal to one, signifying that one of the possible outcomes must occur. In the case of quantum mechanics, the probability distribution for the outcome of a measurement comes from another function, the wavefunction of the quantum state. According to the Born statistical interpretation, the probability distribution is P(x) = |ψ(x)|². So the requirement that the probabilities add up to one implies that ψ must be normalized, but now using the norm defined by ||ψ||² = ∫ |ψ(x)|² dx. This means that wavefunctions are elements of a different normed space, the classical Banach space L2. Such functions are often said to be square integrable.
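
As a rough numerical illustration, here is how one might normalize a wavefunction on a finite grid; the Gaussian below stands in for ψ, and the integral is approximated by a simple Riemann sum:

import numpy as np

x = np.linspace(-10.0, 10.0, 2001)     # a finite grid standing in for the interval
dx = x[1] - x[0]
psi = np.exp(-x**2)                    # an unnormalized example "wavefunction"

# Approximate the integral of |psi(x)|^2 dx and divide out its square root.
norm = np.sqrt(np.sum(np.abs(psi)**2) * dx)
psi = psi / norm

print(np.sum(np.abs(psi)**2) * dx)     # ~1.0: the probabilities now add up to one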

In many cases normalizing an element of a normed linear space is fairly easy; however, in the case of the spaces of functions, calculating the norm of an element can become quite tricky, especially if the interval is infinite in extent.

I should also note that sometimes, if you're dealing with some normed quantity that evolves in time, that time evolution will cause its norm to change. After you get the evolved quantity, you might normalize it again. When people do this, they sometimes say they have "renormalized" the quantity. In quantum field theory there is also a procedure called renormalization, but it is entirely different, having to do with certain infinite contributions to the energy of the system that must be removed mathematically in order to get sensible results.
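
A toy illustration of that first sense of "renormalizing": the single Euler time step below (a free particle, with hbar = m = 1, on a made-up grid) is a crude scheme that does not preserve the norm exactly, which is precisely why one renormalizes afterward:

import numpy as np

x = np.linspace(-10.0, 10.0, 401)
dx = x[1] - x[0]
psi = np.exp(-x**2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)    # normalized to begin with

# One crude Euler step of the free-particle Schrodinger equation.
d2psi = (np.roll(psi, -1) - 2 * psi + np.roll(psi, 1)) / dx**2
psi = psi + 0.5j * 1e-3 * d2psi

# The step is not exactly norm-preserving, so "renormalize" the evolved state.
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)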

In politics, the word normalization is often abused as an excuse for taking "extraordinary measures" to "bring things back to normal." Alas, the term "normal" depends on who is hailing the normalization process.

Dictators, both on the left and the right end of the spectrum, have overused normalization to justify murder and genocide, taking of political prisoners, destruction of economy, imposition or prohibition of religious beliefs and rituals, redefinition of morality, etc.

I have personally experienced this kind of "normalization" after the Soviet invasion of Czechoslovakia on August 21, 1968. Before that, Czechoslovakia was implementing socialism with a human face. After that, the word "normalization" was officially used to crush any semblance of human face in the Soviet model of socialism. Since that model had been forced on Czechoslovakia 23 years earlier in Yalta, it was now "simply the matter of returning back to normal." It took another 21 years to get rid of that "normalization."

In a recording studio:

There are two kinds, or definitions, of normalization.

The first is in relation to the mixing console, or desk. Normalization is the process of returning all the knobs and faders to their "at rest" position. For most studios, this means bringing everything to "zero" (whether it be "hard left," "12 o'clock," or any place else), or, if zero doesn't exist for that item (such as a frequency selector on an in-board EQ), then 12 o'clock is the typical position. The exact state of a normalized board, however, varies slightly from studio to studio. One studio, for example, might see nothing wrong with leaving the aux sends up.

Normalization is also a digital process which analyzes a particular digital music file, and raises its gain so that the loudest peak of the file will happen at digital zero, the loudest possible volume before the onset of distortion. This is often done during mastering.
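
A minimal sketch of that digital process, assuming the audio has already been loaded as a NumPy array of floating-point samples scaled so that 1.0 corresponds to digital zero:

import numpy as np

def peak_normalize(samples):
    """Raise the gain so the loudest peak lands at full scale (digital zero)."""
    peak = np.max(np.abs(samples))
    if peak == 0.0:
        return samples             # pure silence: nothing to raise
    return samples / peak          # the loudest sample now sits at +/- 1.0

# Example: a quiet 440 Hz tone peaking at a quarter of full scale.
t = np.linspace(0.0, 1.0, 44100, endpoint=False)
quiet = 0.25 * np.sin(2 * np.pi * 440 * t)
loud = peak_normalize(quiet)       # peak raised to 1.0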

Nor`mal*i*za"tion (?), n.

Reduction to a standard or normal state.

 

© Webster 1913.
