Dimension Table

In data warehousing, a dimension table is one of the set of companion tables to a fact table.

The fact table contains business facts (or measures), and foreign keys which refer to candidate keys (normally primary keys) in the dimension tables.

Contrary to fact tables, dimension tables contain descriptive attributes (or fields) that are typically textual fields (or discrete numbers that behave like text). These attributes are designed to serve two critical purposes: query constraining and/or filtering, and query result set labeling.

Dimension attributes should to be:

  • Verbose (labels consisting of full words)
  • Descriptive
  • Complete (having no missing values)
  • Discretely valued (having only one value per dimension table row)
  • Quality assured (having no misspellings or impossible values)

Dimension table rows are uniquely identified by a single key field. It is recommended that the key field be a simple integer because a key value is meaningless, used only for joining fields between the fact and dimension tables.

The use of surrogate dimension keys brings several advantages, including:

  • Performance. Join processing is made much more efficient by using a single field (the surrogate key)
  • Buffering from operational key management practices. This prevents situations where removed data rows might reappear when their natural keys get reused or reassigned after a long period of dormancy
  • Mapping to integrate disparate sources
  • Handling unknown or not-applicable connections
  • Tracking changes in dimension attribute values

Although surrogate key use places a burden put on the ETL system, pipeline processing can be improved, and ETL tools have built-in improved surrogate key processing.

The goal of a dimension table is to create standardized, conformed dimensions that can be shared across the enterprise's data warehouse environment, and enable joining to multiple fact tables representing various business processes.

Conformed dimensions are important to the enterprise nature of DW/BI systems because they promote:

  • Consistency. Every fact table is filtered consistently, so that query answers are labeled consistently.
  • Integration. Queries can drill into different process fact tables separately for each individual fact table, then join the results on common dimension attributes.
  • Reduced development time to market. The common dimensions are available without recreating them.

Over time, the attributes of a given row in a dimension table may change. For example, the shipping address for a company may change. Kimball refers to this phenomenon as Slowly Changing Dimensions. Strategies for dealing with this kind of change are divided into three categories:

  • Type One. Simply overwrite the old value(s).
  • Type Two. Add a new row containing the new value(s), and distinguish between the rows using Tuple-versioning techniques.
  • Type Three. Add a new attribute to the existing row.

Famous quotes containing the words dimension and/or table:

    Authority is the spiritual dimension of power because it depends upon faith in a system of meaning that decrees the necessity of the hierarchical order and so provides for the unity of imperative control.
    Shoshana Zuboff (b. 1951)

    Remember thee?
    Ay, thou poor ghost, whiles memory holds a seat
    In this distracted globe. Remember thee?
    Yea, from the table of my memory
    I’ll wipe away all trivial fond records,
    All saws of books, all forms, all pressures past
    That youth and observation copied there,
    And thy commandment all alone shall live
    Within the book and volume of my brain,
    William Shakespeare (1564–1616)