Data Warehouse - Dimensional Vs. Normalized Approach For Storage of Data

Dimensional Vs. Normalized Approach For Storage of Data

There are two leading approaches to storing data in a data warehouse — the dimensional approach and the normalized approach.

The dimensional approach, whose supporters are referred to as “Kimballites”, believe in Ralph Kimball’s approach in which it is stated that the data warehouse should be modeled using a Dimensional Model/star schema. The normalized approach, also called the 3NF model, whose supporters are referred to as “Inmonites”, believe in Bill Inmon's approach in which it is stated that the data warehouse should be modeled using an E-R model/normalized model.

In a dimensional approach, transaction data are partitioned into "facts", which are generally numeric transaction data, and "dimensions", which are the reference information that gives context to the facts. For example, a sales transaction can be broken up into facts such as the number of products ordered and the price paid for the products, and into dimensions such as order date, customer name, product number, order ship-to and bill-to locations, and salesperson responsible for receiving the order.

A key advantage of a dimensional approach is that the data warehouse is easier for the user to understand and to use. Also, the retrieval of data from the data warehouse tends to operate very quickly. Dimensional structures are easy to understand for business users, because the structure is divided into measurements/facts and context/dimensions. Facts are related to the organization’s business processes and operational system whereas the dimensions surrounding them contain context about the measurement (Kimball, Ralph 2008).

The main disadvantages of the dimensional approach are:

  1. In order to maintain the integrity of facts and dimensions, loading the data warehouse with data from different operational systems is complicated, and
  2. It is difficult to modify the data warehouse structure if the organization adopting the dimensional approach changes the way in which it does business.

In the normalized approach, the data in the data warehouse are stored following, to a degree, database normalization rules. Tables are grouped together by subject areas that reflect general data categories (e.g., data on customers, products, finance, etc.). The normalized structure divides data into entities, which creates several tables in a relational database. When applied in large enterprises the result is dozens of tables that are linked together by a web of joins. Furthermore, each of the created entities is converted into separate physical tables when the database is implemented (Kimball, Ralph 2008). The main advantage of this approach is that it is straightforward to add information into the database. A disadvantage of this approach is that, because of the number of tables involved, it can be difficult for users both to:

  1. join data from different sources into meaningful information and then
  2. access the information without a precise understanding of the sources of data and of the data structure of the data warehouse.

It should be noted that both normalized – and dimensional models can be represented in entity-relationship diagrams as both contain joined relational tables. The difference between the two models is the degree of normalization.

These approaches are not mutually exclusive, and there are other approaches. Dimensional approaches can involve normalizing data to a degree (Kimball, Ralph 2008).

In Information-Driven Business (Wiley 2010), Robert Hillard proposes an approach to comparing the two approaches based on the information needs of the business problem. The technique shows that normalized models hold far more information than their dimensional equivalents (even when the same fields are used in both models) but this extra information comes at the cost of usability. The technique measures information quantity in terms of Information Entropy and usability in terms of the Small Worlds data transformation measure.

Read more about this topic:  Data Warehouse

Famous quotes containing the words dimensional, approach, storage and/or data:

    I don’t see black people as victims even though we are exploited. Victims are flat, one- dimensional characters, someone rolled over by a steamroller so you have a cardboard person. We are far more resilient and more rounded than that. I will go on showing there’s more to us than our being victimized. Victims are dead.
    Kristin Hunter (b. 1931)

    We have learned the simple truth, as Emerson said, that the only way to have a friend is to be one. We can gain no lasting peace if we approach it with suspicion or mistrust or with fear.
    Franklin D. Roosevelt (1882–1945)

    Many of our houses, both public and private, with their almost innumerable apartments, their huge halls and their cellars for the storage of wines and other munitions of peace, appear to me extravagantly large for their inhabitants. They are so vast and magnificent that the latter seem to be only vermin which infest them.
    Henry David Thoreau (1817–1862)

    To write it, it took three months; to conceive it three minutes; to collect the data in it—all my life.
    F. Scott Fitzgerald (1896–1940)