I’m trying to decide if I should use a Poisson regression for my count data on website visits per day, but the variance is almost double the mean. I’ve read that this overdispersion might mean the model’s assumptions are violated, and I’m not sure if switching to a negative binomial is the right move or if I’m missing something in my predictors.

I tried a simple Poisson model on daily visits and the variance was about twice the mean. I checked the dispersion statistic and it came out around 2.0, where a value near 1 is what a Poisson model assumes. I added day-of-week and a holiday indicator, and even threw in an offset for marketing spend (a plain covariate is probably the better choice there, since an offset forces the coefficient on log spend to be exactly 1); the means shifted as expected but the dispersion stayed stubbornly high. I then switched to a negative binomial and it did improve the fit a bit; the AIC dropped a little, but the improvement felt small given the extra dispersion parameter.
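For anyone who wants to reproduce the dispersion check by hand, here is a minimal sketch. It relies on one fact: in a Poisson model whose only predictor is a day-of-week factor, the fitted mean for each day is just that weekday's sample mean, so no fitting library is needed. The counts and the function name `pearson_dispersion` are made up for illustration; they are not the data from this thread.

```python
from collections import defaultdict

# Hypothetical daily visit counts: four weeks, Monday through Sunday.
visits = [
    120, 98, 131, 105, 142, 61, 80,
    95, 112, 104, 127, 118, 83, 58,
    110, 85, 119, 93, 151, 70, 92,
    88, 121, 99, 116, 125, 52, 71,
]
day_of_week = [i % 7 for i in range(len(visits))]


def pearson_dispersion(counts, groups):
    """Pearson chi-square / residual df for a Poisson model with one factor."""
    by_group = defaultdict(list)
    for y, g in zip(counts, groups):
        by_group[g].append(y)
    # Fitted mean per group equals the group's sample mean in this model.
    fitted = {g: sum(ys) / len(ys) for g, ys in by_group.items()}
    chi2 = sum((y - fitted[g]) ** 2 / fitted[g] for y, g in zip(counts, groups))
    df_resid = len(counts) - len(fitted)
    return chi2 / df_resid


dispersion = pearson_dispersion(visits, day_of_week)
print(f"Pearson dispersion: {dispersion:.2f}")
```

A value well above 1 after conditioning on day of week says the extra variance is not just a weekday effect, which matches what I saw.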

I keep thinking maybe the issue isn’t the distribution so much as missing predictors or structure in the data. Capturing bursts from weekends, promotions, or campaigns might require interactions or a time component. Have you considered that the real problem could be structural, such as a trend or correlation between consecutive days, rather than the choice of distribution?

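I tried a quasi-Poisson (quasi-likelihood) fit to get standard errors that account for the overdispersion. The SEs widened, but the predictions were the same, which is expected: quasi-Poisson keeps the Poisson coefficient estimates and only rescales the standard errors by the square root of the dispersion. I still preferred the NB model for prediction, since it models the extra dispersion explicitly instead of just correcting the SEs after the fact.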

Before changing models again, I’d run a dispersion check and compare models with and without the additional predictors; if you still see overdispersion, consider a zero-inflated version only if you have clearly more zeros than the fitted model predicts.
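One quick way to judge "a lot of zeros" is to compare the share of zero days you observe with the share a Poisson model would expect, which is the average of exp(-mu) over the fitted means. A minimal sketch follows; the counts, the flat fitted mean, and the function names are all hypothetical.

```python
import math


def observed_zero_fraction(counts):
    """Share of observations that are exactly zero."""
    return sum(1 for y in counts if y == 0) / len(counts)


def poisson_zero_fraction(fitted_means):
    """Expected share of zeros under Poisson: average of exp(-mu)."""
    return sum(math.exp(-mu) for mu in fitted_means) / len(fitted_means)


# Hypothetical low-traffic page; the fitted means are a stand-in for
# whatever your model predicts for each day.
counts = [0, 3, 0, 5, 2, 0, 4, 0, 1, 6]
fitted = [2.1] * len(counts)

observed = observed_zero_fraction(counts)
expected = poisson_zero_fraction(fitted)
print(f"observed zeros: {observed:.2f}, Poisson-expected zeros: {expected:.2f}")
```

If the site averages dozens of visits a day and almost never records a zero, the two numbers will both be near zero and a zero-inflated model has nothing to fix.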
