Wednesday, February 27, 2013

[DWH] Slowly Changing Dimensions



Slowly Changing Dimensions


In dimensional modeling, it is very important to determine how changes of data in the source reflect in dimension tables in data warehouse system. This phenomenon is known as slowly changing dimensions. This term comes from the reason that dimensions accumulates their changes at slow rate in compare with facts in the fact table.

There are three most common slowly change dimension types which are known as slowly changing dimension type 1, type 2 and type 3. Let’s examine each type and technique t deal with it in a greater detail.


Slowly changing dimensions type 1

In slowly changing dimensions type 1, when data in the source systems changed, the corresponding dimension attribute is overwritten. As the result, the dimension table does not represent history and historic context of existing facts is changed.

Slowly changing dimensions type 1 should be avoided as much as possible and only use to in case there is a need of correcting data in the source systems that will reflect in dimension table in data warehouse system.


Slowly changing dimensions type 1 example
 
Let’s take a look at an example of slowly changing dimensions type 1 to get a better understanding.

In our sample star schema, we take DIM_PRODUCT dimension table as an example. Below is the sample data of DIM_PRODUCT dimension table.


Figure 1. Slowly Changing Dimensions Type 1 Example


For some reasons, the product called Nikon Coolpix has a typo in the source system and that incorrect value was loaded in the dimension table in data warehouse system.  To apply slowly changing dimension type 1, value in source system changed and value in data warehouse system get overwritten.


Slowly changing dimensions type 2

In slowly changing dimension type 2, when data in the source system changed, a new version of the corresponding dimension row is created to preserve a version history and historic context of the facts. It is important to note that each time a change is made, a new record is inserted to preserve history therefore we can preserve unlimited version history with slowly changing dimension type 2.

In the dimensional design, if you cannot decide what type of slowly changing dimension that applies to a particular dimension, the type 2 is the safest way because when data in the source system changes, the dimension can use either type 1 or 2.


Slowly changing dimensions type 2 example

In the figure 2 below, for example if the product category of product Id 4 changed into Electronics, instead of overwriting the product category, we create a new record with single attribute that gets the new value so we have two records one to preserve the history and one is for current analysis.


Figure 2. Slowly Changing Dimensions Type 2 Example



Slowly changing dimensions type 3

In dimensional design, sometimes you will have requirements that allows analysts to analyze facts recorded before and after the change occurred using the old value and new value. None of both types above is suitable to address these requirements. In this case, type 3 is used.

In type 3, a pair of attributes is created to track current value and previous values. Whenever a change occurs, both attributes get updated, no row is added.  Be noted that type 3 only keeps track the current value and most recent values not all historical values like type 2.


Slowly changing dimensions type 3 example

In the product dimension table we created a pair of attributes called CAT_CURRENT and CAT_PREVIOUS to track the change. As shown in the figure 3 below, the product category of product Id 4 is kept track.


Figure 3. Slowly Changing Dimensions Type 3 Example
 


Slowly changing dimension summary


Slowly changing dimensions
Action
Effects on facts
Type 1
Overwrite attribute value in dimension table
Restate history
Type 2
Insert new row in dimension table
Preserves unlimited history
Type 3
Create a pair of attributes to keep previous and current values. No new rows are added.
Ability to analyze facts recorded before and after change happened using old value and new value.







No comments:

Post a Comment