How should I choose between star and snowflake schemas for a data warehouse?
#1
I’m trying to decide between a star and snowflake schema for our new data warehouse, and I keep going in circles. My fact tables are straightforward, but the dimension hierarchies are getting complex, and I’m worried a pure star schema will lead to redundant data and update anomalies.
Reply
#2
We started with a simple flat design and it felt fine at first, but as product hierarchies changed we started seeing drift and a few duplicate attributes bleed into the fact table. Reports looked ok on the surface, but the data quality nagged me in the background.
Reply
#3
We tried a more normalized dimension layout because the hierarchies were getting messy. In practice it did cut down some redundancy, but the joins got hairy and quick BI dashboards turned into pointer chasing with lots of nulls and odd edge cases.
Reply
#4
Do you see more read queries or more updates to dimensions? I keep wondering if the real issue is workload shape rather than the schema itself, and I’m not confident we’re asking the right questions.
Reply
#5
We ended up with a hybrid approach: core dimensions stayed simple and widely conformed, while the deeper levels lived in separate tables that rarely changed. Early results were hit or miss, some dashboards loaded fine while others dragged on the joins. We never settled on a single rule and moved on to experiment with how the ETL handles the changes.
Reply


[-]
Quick Reply
Message
Type your reply to this message here.

Image Verification
Please enter the text contained within the image into the text box below it. This process is used to prevent automated spam bots.
Image Verification
(case insensitive)

Forum Jump: