I’m trying to decide between a Poisson regression and a negative binomial model for my count data on website incidents, and I’m stuck. In my initial Poisson fit the variance is much larger than the mean, which points to overdispersion.

Yeah, overdispersion was the first hint I ran into too. My counts had a mean around 2 and a variance up near 8, a long way from the equal mean and variance that Poisson assumes, so the Poisson standard errors came out too small to trust. I switched to a negative binomial with a log link and the dispersion issue mostly went away: the likelihood improved and the residuals looked calmer. Still not perfect, but it felt closer to something I could trust.
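For anyone wanting to run the same check, here is a minimal sketch of the dispersion arithmetic using only the standard library. The counts are made up for illustration, and `dispersion_summary` is a hypothetical helper name. It assumes the NB2 form Var = mu + alpha * mu^2, so a rough moment estimate is alpha = (var - mean) / mean^2. With a mean of 2 and a variance of 8 that works out to (8 - 2) / 4 = 1.5.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return mean, sample variance, variance/mean ratio, and a moment estimate of NB2 alpha."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    ratio = v / m         # roughly 1 for Poisson data, well above 1 if overdispersed
    # NB2 assumes Var = mu + alpha * mu**2, so alpha is about (v - m) / m**2.
    # Clamp at 0 because underdispersed data would give a negative value.
    alpha = max((v - m) / m**2, 0.0)
    return m, v, ratio, alpha


# Hypothetical daily incident counts, invented for this example.
counts = [0, 0, 1, 0, 2, 7, 0, 1, 3, 0, 9, 1, 0, 2, 4]
m, v, ratio, alpha = dispersion_summary(counts)
print(f"mean={m:.2f} var={v:.2f} var/mean={ratio:.2f} alpha={alpha:.2f}")
```

This is only a rough check before fitting. A fitted NB model estimates alpha by maximum likelihood and conditions on covariates, so its value will usually differ from the moment estimate.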

I tried a quasi-Poisson as a quick fix, but it has no true likelihood, so it gave me no AIC to compare, and it only rescales the Poisson standard errors while leaving the coefficients unchanged. So I leaned toward the negative binomial and kept the extra dispersion parameter. The NB gave a better fit and more reasonable standard errors.
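To show what the quasi-Poisson adjustment does, here is a small sketch under simplifying assumptions: an intercept-only model (so the fitted mean is just the sample mean) and made-up counts. The dispersion is estimated as the Pearson chi-square divided by the residual degrees of freedom, and the Poisson standard errors get multiplied by its square root. `quasi_poisson_scale` is a hypothetical helper name.

```python
import math


def quasi_poisson_scale(counts, fitted, n_params):
    """Return the Pearson dispersion estimate phi and the SE inflation factor sqrt(phi)."""
    pearson_chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    phi = pearson_chi2 / (len(counts) - n_params)
    return phi, math.sqrt(phi)


# Hypothetical daily incident counts, invented for this example.
counts = [0, 0, 1, 0, 2, 7, 0, 1, 3, 0, 9, 1, 0, 2, 4]

# For an intercept-only Poisson fit, the fitted mean is the sample mean.
mu_hat = sum(counts) / len(counts)
fitted = [mu_hat] * len(counts)

phi, se_factor = quasi_poisson_scale(counts, fitted, n_params=1)
print(f"phi={phi:.2f}, multiply Poisson SEs by {se_factor:.2f}")
```

The coefficients themselves do not change under this adjustment, which is why it cannot improve the fit the way an extra NB parameter can.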

I also checked for zero inflation. A chunk of days had zero incidents, which sometimes nudges you toward zero-inflated or hurdle models. I did not end up using either, but checking kept me honest about the data-generating process.
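One rough way to do that check is to compare the observed share of zero days with what each model would predict. The sketch below assumes an intercept-only setting, the NB2 parameterization (Var = mu + alpha * mu^2), and alpha = 1.5, which comes from matching moments to a mean of 2 and a variance of 8: (8 - 2) / 2^2. The function names are hypothetical.

```python
import math


def poisson_zero_prob(mu):
    """P(Y = 0) under a Poisson distribution with mean mu."""
    return math.exp(-mu)


def nb2_zero_prob(mu, alpha):
    """P(Y = 0) under NB2 with mean mu and Var = mu + alpha * mu**2."""
    return (1.0 + alpha * mu) ** (-1.0 / alpha)


mu, alpha = 2.0, 1.5  # assumed values, see the note above
print(f"Poisson expects {poisson_zero_prob(mu):.3f} zeros")
print(f"NB2 expects     {nb2_zero_prob(mu, alpha):.3f} zeros")
```

With these values Poisson predicts about 14% zero days and NB2 about 40%. If the observed share of zeros is close to the NB figure, the NB is already absorbing them. If it is well above that, a zero-inflated or hurdle model becomes more worth a look.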

Sometimes I wonder if the real issue is not the count model but the exposure or heterogeneity across days. Do you have a sense of whether those spikes are tied to specific weeks or campaigns, which would suggest a latent grouping rather than a different count distribution?
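A quick way to look at that before changing the model is to compute incidents per unit of exposure within each week. This is only a sketch: the records, the week labels, and the choice of exposure (sessions in thousands) are all hypothetical, as is the helper name `weekly_rates`.

```python
from collections import defaultdict


def weekly_rates(records):
    """records: iterable of (week, incidents, exposure). Returns {week: incidents per unit exposure}."""
    totals = defaultdict(lambda: [0, 0.0])
    for week, incidents, exposure in records:
        totals[week][0] += incidents
        totals[week][1] += exposure
    return {week: inc / expo for week, (inc, expo) in totals.items()}


# Hypothetical daily records: (week label, incidents, sessions in thousands).
records = [
    ("W1", 1, 10.0),
    ("W1", 0, 12.0),
    ("W2", 7, 40.0),
    ("W2", 9, 50.0),
    ("W3", 2, 11.0),
]

rates = weekly_rates(records)
for week, rate in sorted(rates.items()):
    print(f"{week}: {rate:.3f} incidents per 1k sessions")
```

If the raw counts spike but the rates stay flat, the spikes are mostly exposure, and the usual fix in a regression is to add log(exposure) as an offset. If the rates themselves jump in particular weeks, that points toward a week or campaign effect instead.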
