Tag Archives: Data Modelling

Star Schema vs Snowflake Schema: differences

What is a star schema?

A star schema is a database schema used to store data in a star format. This schema consists of a central table, called the “fact table,” and a number of directly connected other tables, called “dimension tables.”

The fact table contains information about metrics or measures, while the dimension tables contain information about descriptive attributes. The star schema is very simple and easy to understand, making it ideal for cloud data warehousing and business intelligence applications.

What is a snowflake schema?

A snowflake schema is a type of database schema that is used to store data in a more complex format than the star schema. The snowflake schema consists of a central table, which is called the “fact table,” and a number of other tables, which are called “dimension tables.” As with other schemas, the fact table contains information about events or facts, while the dimension tables contain information about the dimensions of those events or facts. 

Key differences between star schema and snowflake schema:

  1. A star schema has denormalized dimension tables, while a snowflake schema has normalized dimension tables.
  2. A star schema is easier to design and implement than a snowflake schema.
  3. A star schema can be more efficient to query than a snowflake schema, because there are fewer JOINs between tables.
  4. A star schema can require more storage space than a snowflake schema, because of the denormalized data.


How to choose between star and snowflake schemas?

Deciding which schema is best for your data warehouse design is contingent on various factors, such as your data characteristics, query needs, storage constraints, and performance expectations. Generally speaking, a star schema is recommended if you have a few dimensions with low cardinality and limited levels of hierarchy. Additionally, if you need fast and simple queries with aggregated data and have ample storage space to tolerate some data redundancy, then this schema is ideal. On the other hand, a snowflake schema is best suited when you have many dimensions with high cardinality and multiple levels of hierarchy. This type of schema is also helpful when you need complex and detailed queries with granular data and want to avoid data inconsistency. Moreover, a hybrid approach that combines both schemas can be used depending on the specific data warehouse scenarios and trade-offs. For instance, a star schema can be used for simpler and more stable dimensions while a snowflake schema can be used for more complex and dynamic ones. Finally, you can also use a constellation schema that has multiple fact tables sharing some common dimension tables to support different business processes or analytical needs.