Dimensions provide context and descriptive information about the facts. They represent the "who," "what," "where," "when," and "how" of the business.
Dimensions
are categorical attributes that are used to categorize, filter, and group the
facts for analysis. They provide the perspective through which the facts are
analyzed.
Examples
of dimensions include product, customer, time, location, salesperson, etc.
Dimensions
are stored in dimension tables, which contain records representing the various
categories or attributes, along with any hierarchies and relationships between
them.
Dimension
tables typically have a primary key that uniquely identifies each record, and
foreign keys are used in fact tables to establish relationships with dimension
tables.
Dimensions can be classified into different types. Here are
the main types of dimensions:
Conformed Dimensions:
Conformed dimensions are dimensions that are consistent and
standardized across multiple data marts or data sources within an organization.
They provide a unified view of common business entities
(such as customers, products, time, etc.) across different departments or
business units.
Conformed dimensions ensure consistency and interoperability
between different parts of the organization's data infrastructure.
Junk Dimensions:
Junk dimensions are dimensions that contain low-cardinality
attributes or flags that do not warrant their own dimension table.
They are often used to store a combination of miscellaneous
or non-standard attributes that are not significant enough to merit individual
dimensions.
Examples of attributes stored in junk dimensions include
flags for "active/inactive," "on sale/not on sale," etc.
Junk dimensions help reduce the number of dimension tables
and simplify the data model.
Degenerated Dimensions:
Degenerate dimensions are attributes from transactional fact
tables that are not dimension keys but are treated as dimensions.
They represent transactional data that does not have a
corresponding dimension table.
Examples of degenerate dimensions include transaction
numbers, invoice numbers, order numbers, etc.
Degenerate dimensions are useful for analyzing transactional
data without the need to join with separate
Roll-Playing Dimensions :
Role-playing dimensions are dimensions that are used
multiple times in a fact table, each time representing a different perspective
or role.
Examples include date dimensions (used for order date, ship
date, delivery date, etc.) and geographic dimensions (used for shipping
location, billing location, customer location, etc.).
Role-playing dimensions allow for efficient reuse of
dimension tables and simplify the data model.
Hierarchical Dimensions :
Hierarchical dimensions represent hierarchical or nested
relationships between different levels of attributes.
They are used to model parent-child relationships, such as
organizational hierarchies, product categories, or geographic hierarchies.
Hierarchical dimensions enable drill-down and roll-up
analysis to explore data at different levels of granularity.
Slowly Changing Dimensions (SCDs):
Slowly changing dimensions are dimensions that change over
time, but at a slow rate.
They capture historical information about entities such as
customers, products, or employees and their attributes.
SCDs are classified into different types based on how they
handle changes: Type 1 (overwrite), Type 2 (add new row), Type 3 (add new
attribute), etc.
Slowly changing dimensions allow for historical analysis and
tracking of changes in dimensional attributes over time.
Slowly Changing Dimensions (SCD1):
SCD1, or Type 1, is the simplest type of slowly changing
dimension.
In SCD1, when a change occurs to a dimension attribute, the
existing value is simply overwritten with the new value.
Historical data is not preserved, and only the most recent
information is retained in the dimension.
SCD1 is suitable for dimensions where historical changes are
not important or can be safely ignored.
While SCD1 offers simplicity, it does not support historical
analysis or tracking changes over time.
Slowly Changing Dimensions (SCD2):
SCD2, or Type 2, is the most commonly used type of slowly
changing dimension.
In SCD2, when a change occurs to a dimension attribute, a
new row is added to the dimension table to represent the new version of the
dimension.
The existing row is marked as inactive (or end-dated),
indicating that it is no longer the current version of the dimension.
This allows historical data to be preserved, and users can
analyze data at different points in time.
SCD2 is suitable for dimensions where historical changes are
significant and need to be tracked over time.
Slowly Changing Dimensions (SCD3):
SCD3, or Type 3, is a less common type of slowly changing
dimension compared to SCD1 and SCD2.
In SCD3, when a change occurs to a dimension attribute, both
the current and previous values are stored in the dimension table.
A separate column or set of columns is used to represent the
current and previous versions of the attribute.
SCD3 provides a compromise between SCD1 and SCD2 by allowing
limited historical analysis while maintaining a smaller dimension table size
compared to SCD2.
However, SCD3 is limited in its ability to track multiple
historical changes over time.
Comments
Post a Comment