What is the best star vs snowflake schema for a warehouse with changing dims?
#1
I’m trying to decide between a star and snowflake schema for our new sales data warehouse, and I’m stuck on how to handle our slowly changing dimensions. The team wants simpler joins for reporting, but I’m worried about data redundancy and maintenance overhead as our product hierarchies evolve.
Reply
#2
Been there. We started with a flat, denormalized dimension layout to keep joins simple, and kept a handful of surrogate keys. It made reporting feel fast, but we started duplicating hierarchy attributes whenever product lines changed. The maintenance burden piled up when price bands or category trees shifted, so we ended up pushing most changes into the ETL rather than reworking the schemas.
Reply
#3
I went with a normalized approach and used a few views to flatten for the common reports. It kept the warehouse lean, but the ETL got messier and hierarchy changes required reloading dims. We tracked a jump of about 15-20% in nightly refresh time when a product line restructured, and we added a basic SCD type 2 layer to preserve history.
Reply
#4
Maybe the real problem isn't the schema choice but governance around product hierarchies. If product categories drift continuously, any single schema will fight that. Do we actually have a clear policy for how and when to evolve the dims, or are we hoping the warehouse will absorb it?
Reply
#5
Random memory: our data team kept a whiteboard with color coded arrows during a sprint, and I kept thinking about why dashboards kept asking for faster refresh. We tried a BI cache to reduce joins, which helped sometimes, but stale caches made a few dashboards look wrong for hours. Anyway, we ended up using a hybrid approach: denormalized dims for reporting hot paths, normalized ones with history for the rest. It feels workable, but it isn't perfect.
Reply


[-]
Quick Reply
Message
Type your reply to this message here.

Image Verification
Please enter the text contained within the image into the text box below it. This process is used to prevent automated spam bots.
Image Verification
(case insensitive)

Forum Jump: