How can I fix a plateau in multi-label validation accuracy with a custom weighted loss?
#1
I’ve been trying to implement a custom loss function for a multi-label classification problem, but my model’s validation accuracy plateaus much earlier than with a standard binary cross-entropy setup. I’m wondering whether the issue is in my gradient computation, or whether the weighting I added to handle class imbalance is causing unstable updates during backpropagation.
#2
I chased this too. When I added class weights to a multi-label BCE, I saw validation stagnate fast. I ended up normalizing the weights so they sum to 1 and adding a little gradient clipping. After that, the updates stayed in bounds and I saw gradual improvement instead of a hard plateau.
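Roughly what that looks like as a sketch (assuming PyTorch; the model, weights, and data here are placeholders, not the OP's actual setup):

```python
# Sketch: normalize per-class weights so they sum to 1, then clip
# gradient norms so weighted-BCE updates stay bounded.
import torch
import torch.nn as nn

num_classes = 4
raw_weights = torch.tensor([0.5, 2.0, 8.0, 1.5])  # e.g. inverse class frequency
weights = raw_weights / raw_weights.sum()         # normalize to sum to 1

def weighted_bce(logits, targets):
    # elementwise BCE, then scale each class column by its normalized weight
    per_elem = nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    return (per_elem * weights).mean()

model = nn.Linear(16, num_classes)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(8, 16)
y = (torch.rand(8, num_classes) > 0.7).float()    # multi-hot targets

opt.zero_grad()
loss = weighted_bce(model(x), y)
loss.backward()
# cap the total gradient norm so one heavily weighted class can't dominate
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```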
#3
I logged gradient norms per class and found the rare labels got huge gradient spikes once the weights were in, especially when the predicted probability p was near 0 or 1. The derivative scales with the class weight w_c, and the log terms blow up at the extremes, so the model goes all in on a few classes and ignores the rest.
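A tiny standalone illustration of the blow-up (pure NumPy, numbers made up): for a single label with target y, probability p, and weight w, the derivative of weighted BCE with respect to p is w * (p - y) / (p * (1 - p)), which diverges as p approaches 0 or 1.

```python
# Show how the weighted-BCE gradient w.r.t. the probability explodes
# near the extremes for a heavily weighted rare class.
import numpy as np

def bce_grad_wrt_p(p, y, w):
    # derivative of -w * (y*log(p) + (1-y)*log(1-p)) with respect to p
    return w * (p - y) / (p * (1.0 - p))

y = 1.0   # a rare positive label
w = 10.0  # large class weight for the rare class
for p in [0.5, 0.1, 0.01, 0.001]:
    print(p, bce_grad_wrt_p(p, y, w))
# gradient magnitudes: 20, 100, 1000, 10000 -- each 10x step in p
# toward the extreme multiplies the gradient by roughly 10x
```

Note this is the gradient with respect to p; with respect to the logit it collapses to w * (p - y), which is bounded by w. That's one reason working on logits (e.g. BCEWithLogitsLoss) and clipping both help.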
#4
Could the real issue be a misalignment between the training loss and the validation metric? Multi-label accuracy isn't the same objective as sigmoid BCE; you might be optimizing something that doesn't move validation accuracy the way you expect.
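A minimal sketch of the mismatch, assuming a 0.5 decision threshold and exact-match (subset) accuracy; the probabilities below are made up. The loss keeps improving while accuracy doesn't move, because accuracy only changes when a probability crosses the threshold:

```python
# BCE rewards confidence on already-correct predictions, so loss can
# fall while thresholded multi-label accuracy stays flat.
import numpy as np

def bce(p, y):
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)).mean())

def subset_accuracy(p, y, thresh=0.5):
    pred = (p >= thresh).astype(float)
    return float((pred == y).all(axis=1).mean())

y = np.array([[1.0, 0.0, 1.0]])
p_before = np.array([[0.55, 0.45, 0.55]])  # barely on the right side
p_after  = np.array([[0.60, 0.40, 0.60]])  # more confident, same predictions

print(bce(p_before, y), subset_accuracy(p_before, y))  # ~0.598, 1.0
print(bce(p_after, y),  subset_accuracy(p_after, y))   # ~0.511, 1.0
```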
#5
Try a standard built-in weighted BCE if your framework has one, with proper per-class weights, and monitor gradient norms and loss curves. A smaller learning rate combined with gradient clipping also helps avoid unstable updates.
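As a hedged sketch of that advice, assuming PyTorch (its built-in `BCEWithLogitsLoss` takes a per-class `pos_weight`); the model, data, and weight values are placeholders:

```python
# Built-in weighted BCE on logits, plus per-step gradient-norm logging
# and clipping with a small learning rate.
import torch
import torch.nn as nn

num_classes = 5
# pos_weight scales the positive term per class, e.g. neg_count / pos_count
pos_weight = torch.tensor([1.0, 3.0, 3.0, 10.0, 1.5])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

model = nn.Linear(32, num_classes)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # deliberately small LR

x = torch.randn(16, 32)
y = (torch.rand(16, num_classes) > 0.8).float()

opt.zero_grad()
loss = criterion(model(x), y)
loss.backward()
# clip_grad_norm_ returns the pre-clip total norm, handy for monitoring
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
print(f"loss={loss.item():.4f}  grad_norm={grad_norm.item():.4f}")
opt.step()
```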