I’m trying to decide if I should use a Poisson or a negative binomial model for my count data on website visits per user session. The variance is about 1.8 times the mean, so it’s overdispersed, but I’m not sure if that’s enough to justify the added complexity of the negative binomial’s extra parameter.
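For reference, here is a rough sketch of how the dispersion can be quantified. It assumes the NB2 parameterization, where Var = mu + alpha * mu^2, and uses a simple moment estimate of alpha rather than a fitted one. The function name and the toy counts are made up for illustration; they are not my real data.

```python
from statistics import mean, variance

def dispersion_summary(counts):
    """Return (mean, variance, variance/mean ratio, moment estimate of NB2 alpha)."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    ratio = v / m
    # NB2 assumes Var = m + alpha * m**2, so alpha = (v - m) / m**2.
    # Clamp at 0: a negative value means no overdispersion relative to Poisson.
    alpha = max(0.0, (v - m) / m**2)
    return m, v, ratio, alpha

# Hypothetical toy counts, for illustration only.
visits = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9]
m, v, ratio, alpha = dispersion_summary(visits)
print(f"mean={m:.2f} var={v:.2f} ratio={ratio:.2f} alpha={alpha:.3f}")
```

This is only the raw, marginal ratio. Once covariates are in the model, the dispersion that matters is the one left in the residuals, which can be smaller.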

I started with Poisson and the overdispersion was obvious in the residuals. The negative binomial helped a bit, but the AIC change was small, and I spent more time worrying about interpretation than the gains were worth.

Another colleague tried NB and it gave a better log-likelihood, but the practical difference in predictions was tiny. They kept Poisson with robust SEs for inference and called it a win for simplicity. The conclusions were essentially the same, just with somewhat wider standard errors.
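To see where the wider standard errors come from, here is a minimal sketch for the simplest possible case: an intercept-only Poisson model with a log link. It is a simplification for illustration (a real analysis with covariates would use a regression library), and the function name and toy counts are hypothetical.

```python
import math

def intercept_only_poisson_ses(counts):
    """Model-based and sandwich SEs for the log-mean in an intercept-only Poisson fit."""
    n = len(counts)
    mu = sum(counts) / n  # Poisson MLE of the mean
    # Model-based: inverse Fisher information for beta = log(mu) is 1 / (n * mu).
    se_model = math.sqrt(1.0 / (n * mu))
    # Sandwich (HC0-style): bread is 1 / (n * mu), meat is the sum of squared scores.
    meat = sum((y - mu) ** 2 for y in counts)
    se_robust = math.sqrt(meat) / (n * mu)
    return se_model, se_robust

# Hypothetical toy counts, for illustration only.
visits = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9]
se_model, se_robust = intercept_only_poisson_ses(visits)
print(f"model SE={se_model:.4f} robust SE={se_robust:.4f} inflation={se_robust / se_model:.3f}")
```

In this special case the inflation factor is the square root of the variance-to-mean ratio (with an n denominator in the variance). So a ratio of about 1.8 would mean robust SEs roughly 1.34 times the naive Poisson ones, while the point estimates stay the same.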

I also started to think the issue might not be the distribution at all but a missing covariate. When I added session length and a couple of timing indicators, the variance patterns looked different, and it felt like NB had been absorbing real, explainable heterogeneity instead of accounting for it.
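A small worked example of how an omitted covariate can produce this, using the law of total variance. It assumes counts are exactly Poisson within each group, and the group means and weights are invented for illustration.

```python
def mixture_mean_var(group_means, weights):
    """Marginal mean and variance when each group is Poisson with its own mean."""
    m = sum(w * mu for mu, w in zip(group_means, weights))
    # Law of total variance: E[Var(Y | group)] + Var(E[Y | group]).
    # Within a Poisson group Var(Y | group) = mu, so the first term equals m.
    between = sum(w * (mu - m) ** 2 for mu, w in zip(group_means, weights))
    return m, m + between

# Hypothetical example: half the sessions average 1 visit, half average 5.
m, v = mixture_mean_var([1.0, 5.0], [0.5, 0.5])
print(f"marginal mean={m} variance={v} ratio={v / m:.2f}")
```

Each group is perfectly Poisson here, yet the pooled counts look overdispersed. Conditioning on the grouping variable removes the excess variance, which is the pattern I would look for after adding covariates.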

Honestly the extra parameter feels like overkill for a modest 1.8x overdispersion. I’d still sanity-check by comparing Poisson with robust SEs and NB, and see if the conclusions hold. Do you have many zeros in your data or bursts that would drive up the variance?
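On the zeros question, a quick check I'd try is comparing the observed share of zeros against what a Poisson with the same mean would predict. This is a marginal check only, ignoring covariates, and the function name and toy counts are hypothetical.

```python
import math

def zero_check(counts):
    """Return (observed zero fraction, zero fraction expected under Poisson with the same mean)."""
    n = len(counts)
    mu = sum(counts) / n
    observed = sum(1 for y in counts if y == 0) / n
    expected = math.exp(-mu)  # P(Y = 0) for Poisson(mu)
    return observed, expected

# Hypothetical toy counts, for illustration only.
visits = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9]
observed, expected = zero_check(visits)
print(f"observed zeros={observed:.3f} Poisson-expected zeros={expected:.3f}")
```

If the observed share is far above the expected one, excess zeros may be driving the variance more than general overdispersion, and that would be worth knowing before choosing between the two models.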
