Should i choose a star or snowflake schema for our data warehouse?
#1
I’m trying to decide between a star and snowflake schema for our new sales data warehouse, and I keep going in circles. The raw transaction volume is high, but our reporting needs are pretty standard for now—mostly monthly summaries by region and product line. I’m worried the snowflake might add unnecessary join complexity for our analysts, but I also don’t want to paint myself into a corner if dimensions get more detailed later.
Reply
#2
I went with a star schema for our sales warehouse last year. High transaction volume is real, but we kept the fact table lean enough and denormalized the dimensions so queries for monthly region and product line were straightforward. The BI tools liked it; drill-downs were fast and no one complained about extra joins. We did add a couple of pre-aggregated tables to keep the monthly sums snappy.
Reply
#3
On the other hand, when we experimented with a more normalized layout, the analysts started chasing more keys and the number of joins ballooned. Reports slowed down even with caching, and the ETL got tangled syncing conformed dimensions. It felt like a different problem every time someone asked for a new attribute.
Reply
#4
I also wonder if the real issue is data quality and governance rather than the schema shape. If the data isn't clean, adding more tables just gives you more places to break. We started with a clean dimensional model but found some hierarchies across regions and product lines were inconsistent, which made building reliable aggregates harder.
Reply
#5
Have you considered starting with a simple structure and monitoring query latency, then adding detail only where it actually helps?
Reply


[-]
Quick Reply
Message
Type your reply to this message here.

Image Verification
Please enter the text contained within the image into the text box below it. This process is used to prevent automated spam bots.
Image Verification
(case insensitive)

Forum Jump: