I’m trying to interpret the results from my proteomics experiment, and I’m stuck on how to properly account for batch effects in my LC-MS/MS data. The variance introduced by different run dates seems to be overshadowing the biological signal I’m looking for, even after basic normalization. I’m not sure whether I should use a ComBat-like adjustment or whether there’s a statistical model better suited to my experimental design.

I tried ComBat on log2 intensities; it helped a bit but batch clustering remained after correction, and missing values made things messier. Normalization alone wasn’t enough for me either.

I leaned toward a linear mixed model with batch as a random effect; it let me see if the batch variance shrinks when you include the biology factors, rather than forcing a global match across proteins.
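For what it’s worth, here is a rough sketch of that comparison for a single protein, using `mixedlm` from statsmodels. The column names (`log2_intensity`, `condition`, `batch`) and the simulated numbers are my own placeholders, not anything from a real dataset, so treat it as an illustration of the idea rather than a recipe.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def batch_variance(df: pd.DataFrame, formula: str) -> tuple[float, float]:
    """Fit a random-intercept model with batch as the grouping factor.

    Returns (batch variance, residual variance) for one protein.
    """
    model = smf.mixedlm(formula, data=df, groups=df["batch"])
    fit = model.fit(reml=True)
    return float(fit.cov_re.iloc[0, 0]), float(fit.scale)


# Simulated log2 intensities for ONE protein (made-up values, not real data):
# 6 run dates, 8 samples each, conditions alternating within every run date.
rng = np.random.default_rng(0)
n_batches, per_batch = 6, 8
batch = np.repeat([f"day{i}" for i in range(1, n_batches + 1)], per_batch)
condition = np.tile(["ctrl", "treated"], n_batches * per_batch // 2)
batch_shift = np.repeat(rng.normal(0.0, 1.0, size=n_batches), per_batch)
treatment_effect = np.where(condition == "treated", 1.0, 0.0)
noise = rng.normal(0.0, 0.5, size=n_batches * per_batch)

df = pd.DataFrame(
    {
        "batch": batch,
        "condition": condition,
        "log2_intensity": 20.0 + batch_shift + treatment_effect + noise,
    }
)
# Real data would have missing intensities; drop them before fitting.
df = df.dropna(subset=["log2_intensity"])

# Model without biology vs. model with the biological factor as a fixed effect.
null_batch, null_resid = batch_variance(df, "log2_intensity ~ 1")
full_batch, full_resid = batch_variance(df, "log2_intensity ~ condition")

print(f"batch variance:    null={null_batch:.3f}  with condition={full_batch:.3f}")
print(f"residual variance: null={null_resid:.3f}  with condition={full_resid:.3f}")
```

In practice you would loop this over proteins and look at the distribution of the batch variance component. Dropping missing values as above ignores the MNAR problem entirely, so it is only a starting point.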

I’m wary of ComBat here because proteomics data have missing-not-at-random (MNAR) values and heavy tails, both of which can distort the correction. I did try MSstats as an alternative, since it models runs and replicates explicitly; sometimes that clarified some hits, sometimes not.

I keep wondering if the real issue isn’t batch per se but sample prep or QC drift tied to run dates. Maybe recheck instrument performance and do a quick QC run. Also, do you have a truly balanced design across days?
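A quick way to answer the balance question is to tabulate samples per run date and condition. Here is a minimal sketch in plain Python; the `run_date` and `condition` field names and the sample sheet are hypothetical, so swap in whatever your annotation file uses.

```python
from collections import Counter


def design_counts(samples):
    """Count samples per (run_date, condition) pair."""
    return Counter((s["run_date"], s["condition"]) for s in samples)


def unbalanced_dates(samples):
    """Return run dates where the conditions are not equally represented."""
    counts = design_counts(samples)
    conditions = sorted({s["condition"] for s in samples})
    dates = sorted({s["run_date"] for s in samples})
    flagged = []
    for date in dates:
        # Counter returns 0 for a condition that never ran on this date.
        per_condition = [counts[(date, c)] for c in conditions]
        if len(set(per_condition)) > 1:
            flagged.append(date)
    return flagged


# Hypothetical sample sheet: day1 is balanced, day2 is lopsided,
# day3 contains only treated samples (fully confounded with condition).
samples = (
    [{"run_date": "day1", "condition": "ctrl"}] * 2
    + [{"run_date": "day1", "condition": "treated"}] * 2
    + [{"run_date": "day2", "condition": "ctrl"}] * 3
    + [{"run_date": "day2", "condition": "treated"}] * 1
    + [{"run_date": "day3", "condition": "treated"}] * 4
)

flagged = unbalanced_dates(samples)
print("Run dates with unequal condition counts:", flagged)
```

If a run date contains only one condition, no batch correction method can separate the batch effect from the biology for those samples, which is worth knowing before comparing ComBat against a mixed model.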
