Should i choose star vs snowflake schema for our data warehouse?
#1
I’m trying to decide between a star and snowflake schema for our new sales data warehouse, and I’m stuck on how much complexity we should build in from the start. Our fact table is pretty straightforward, but the product and customer dimensions have a lot of hierarchical attributes that could be normalized out. I’m worried about query performance if we go too normalized, but also about maintenance if we keep it all in one flat dimension table.
Reply
#2
We tried a star schema from the start. The fact table was straightforward, but the product and customer dimensions had a lot of hierarchies. We kept most attributes flat for speed, and the team ran into maintenance pain when those hierarchies changed. We added a few decoupled hierarchies as mini dimension tables and found the query performance stayed reasonable while updates were easier to manage.
Reply
#3
My take is to build a lean baseline and measure. Do a small prototype with a handful of typical drill downs and a few ad hoc paths, then compare response times and maintenance cost. In practice the bottleneck is often BI tooling and indexing choices rather than the theoretical shape.
Reply
#4
Is the real issue maybe data quality and key integrity rather than how you normalize the dims? If the keys don’t line up or you have missing relations, even a simple join becomes expensive because you end up scrubbing results.
Reply
#5
We started with flat dims for speed, and later split some attributes into separate tables to support evolving categories. It slowed the ETL a bit and required more care with surrogate keys, but dashboards loaded faster when users drilled down beyond the top level.
Reply


[-]
Quick Reply
Message
Type your reply to this message here.

Image Verification
Please enter the text contained within the image into the text box below it. This process is used to prevent automated spam bots.
Image Verification
(case insensitive)

Forum Jump: