What’s the best model for overdispersed count data, Poisson or negative binomial?
#1
I’m trying to decide between a Poisson regression and a negative binomial model for count data on website incidents, and I’m stuck. In my initial Poisson fit the variance is much larger than the mean, which points to overdispersion.
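One quick way to put a number on this before picking a model is the dispersion index: the sample variance divided by the sample mean, which should sit near 1 for Poisson data. A minimal sketch using only the standard library; the `incidents` list is made-up illustrative data, not the actual incident log.

```python
from statistics import mean, variance

# Hypothetical daily incident counts (illustrative only).
incidents = [0, 0, 1, 0, 3, 0, 2, 9, 0, 1, 0, 7, 1, 0, 6]


def dispersion_index(counts):
    """Sample variance divided by sample mean; near 1 under a Poisson model."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return variance(counts) / m


print(f"mean={mean(incidents):.2f}, "
      f"variance={variance(incidents):.2f}, "
      f"index={dispersion_index(incidents):.2f}")
```

An index well above 1 on the raw counts is only a first hint, since covariates in the regression can absorb part of the extra variance.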
#2
Yeah, overdispersion was the first hint I ran into too. My counts had a mean around 2 and a variance near 8, so the Poisson standard errors could not be trusted (they come out too small when the variance exceeds the mean). I switched to a negative binomial with a log link and the dispersion issue largely went away: the log-likelihood improved and the residuals looked calmer. Still not perfect, but it felt closer to something I could trust.
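To see why the extra parameter helps with numbers like these, here is a rough sketch assuming the common NB2 parameterization, where the variance is mu + alpha * mu^2. The post does not say which parameterization was fitted, so treat that as an assumption; the function names are hypothetical, and the moment estimate ignores covariates.

```python
def nb2_variance(mu, alpha):
    """Variance under the NB2 parameterization: mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


def moment_alpha(mu, var):
    """Method-of-moments estimate of alpha from a mean and a variance.

    Ignores covariates; clipped at 0, which corresponds to plain Poisson.
    """
    if mu <= 0:
        raise ValueError("mean must be positive")
    return max((var - mu) / mu ** 2, 0.0)


# Mean and variance figures quoted in the post above.
alpha = moment_alpha(2.0, 8.0)
print(alpha, nb2_variance(2.0, alpha))
```

A fitted model would estimate alpha by maximum likelihood alongside the regression coefficients, so the moment estimate is only a sanity check on the order of magnitude.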
#3
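I tried quasi-Poisson as a quick fix, but it is not based on a full likelihood, so there was no AIC to compare against, and the coefficients came out the same as the Poisson fit with only the standard errors rescaled. So I leaned toward the negative binomial and kept the extra dispersion parameter. The NB gave a better fit and more reasonable standard errors.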
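For context on what quasi-Poisson actually changes: it keeps the Poisson point estimates and multiplies the standard errors by the square root of an estimated dispersion, usually the Pearson chi-square over the residual degrees of freedom. A small sketch for an intercept-only fit; the counts are hypothetical and the function name is mine.

```python
from math import sqrt


def pearson_dispersion(counts, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    return chi2 / (len(counts) - n_params)


# Hypothetical daily incident counts (illustrative only).
counts = [0, 0, 1, 0, 3, 0, 2, 9, 0, 1, 0, 7, 1, 0, 6]

# Intercept-only Poisson fit: every fitted value equals the sample mean.
mu_hat = sum(counts) / len(counts)
phi = pearson_dispersion(counts, [mu_hat] * len(counts), n_params=1)

# Quasi-Poisson multiplies the Poisson standard errors by this factor.
se_inflation = sqrt(phi)
print(f"phi={phi:.2f}, SE inflation factor={se_inflation:.2f}")
```

With covariates, `fitted` would be the model's predicted means and `n_params` the number of estimated coefficients.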
#4
I also checked for zero inflation. A chunk of days had zero incidents, which sometimes nudges you toward zero-inflated or hurdle models. I did not end up using one, but it kept me honest about the data-generating process.
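A simple version of that check is to compare the observed share of zero days with the zero probability each model implies. The sketch below assumes the NB2 parameterization and uses illustrative values for `mu` and `alpha`; neither is taken from a fitted model.

```python
from math import exp


def poisson_p0(mu):
    """Probability of a zero count under Poisson with mean mu."""
    return exp(-mu)


def nb2_p0(mu, alpha):
    """Probability of a zero count under NB2: (1 + alpha*mu) ** (-1/alpha)."""
    return (1.0 + alpha * mu) ** (-1.0 / alpha)


# Illustrative values only (assumed, not estimated from data).
mu, alpha = 2.0, 1.5
print(f"Poisson P(0)={poisson_p0(mu):.3f}, NB2 P(0)={nb2_p0(mu, alpha):.3f}")
```

At these values the negative binomial already expects far more zeros than Poisson does (roughly 0.40 versus 0.14), so a pile of zero days only argues for a zero-inflated or hurdle model if the observed share clearly exceeds what the NB predicts.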
#5
Sometimes I wonder if the real issue is not the count distribution but exposure or heterogeneity across days. Do you have a sense of whether those spikes are tied to specific weeks or campaigns? That would suggest a latent grouping rather than a different count distribution.
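One rough way to probe this is to compute the incident rate per unit of exposure and the dispersion index within each group, then compare against the pooled figure. Everything in the sketch is hypothetical: the week labels, the counts, and the exposure column (imagined here as sessions in thousands).

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical (week_label, incidents, exposure) rows; illustrative only.
rows = [
    ("w1", 0, 10.0), ("w1", 1, 12.0), ("w1", 0, 9.0), ("w1", 1, 11.0),
    ("w2", 7, 40.0), ("w2", 9, 45.0), ("w2", 6, 38.0), ("w2", 8, 42.0),
]


def summarize_by_group(rows):
    """Per-group incident rate per unit exposure and variance/mean ratio."""
    grouped = defaultdict(list)
    for label, count, exposure in rows:
        grouped[label].append((count, exposure))
    summary = {}
    for label, pairs in grouped.items():
        counts = [c for c, _ in pairs]
        total_exposure = sum(e for _, e in pairs)
        m = mean(counts)
        summary[label] = {
            "rate": sum(counts) / total_exposure,
            "dispersion": variance(counts) / m if m > 0 else float("nan"),
        }
    return summary


all_counts = [c for _, c, _ in rows]
pooled_dispersion = variance(all_counts) / mean(all_counts)
by_week = summarize_by_group(rows)
print(f"pooled dispersion={pooled_dispersion:.2f}")
for label, stats in by_week.items():
    print(label, f"rate={stats['rate']:.3f}", f"dispersion={stats['dispersion']:.2f}")
```

If the pooled index is well above 1 while the within-group indices are not, a group effect or an exposure offset (log exposure added to the linear predictor) may account for much of the apparent overdispersion.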