How reliable is a language model for parsing handwritten lab notes?
#1
I’ve been trying to use a large language model to help me parse and categorize decades of old, handwritten lab notes in my field, but I’m hitting a wall with its consistency. It will brilliantly connect a fragmented chemical notation to a known procedure one minute, then completely misinterpret a clear diagram the next. This unpredictability makes me hesitant to trust it as a research assistant, even for this preliminary sorting task.
Reply
#2
I fed a couple hundred pages into the model and kept a rough log. Recognizable diagrams and notations were the minority: roughly six in ten pages came back as something unhelpful, about a quarter looked usable, and the remainder were flat-out wrong. It felt like two steps forward, three steps back.
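For anyone who wants to keep the same kind of rough log, here's a minimal sketch. The three category names and the helper functions are my own assumptions, not anything the model outputs.

```python
from collections import Counter

# Running tally of outcomes per page; categories are assumptions
# ("unhelpful" / "okay" / "wrong") matching the rough log above.
log = Counter()

def record(page_id, outcome):
    """Log one page's outcome: 'unhelpful', 'okay', or 'wrong'."""
    log[outcome] += 1

def summarize(log):
    """Return each category's share of the total pages logged."""
    total = sum(log.values())
    return {k: round(v / total, 2) for k, v in log.items()}
```

Even this crude tally makes it easier to tell whether a change to the prompt or the pre-processing actually moves the numbers.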
Reply
#3
I tried a human-in-the-loop approach: a coder corrected the labels after every 50 items, and we retrained on the updated set. On the next pass, accuracy rose from around 40% to roughly 60%, but it still meant constant back-and-forth and slow progress.
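The loop above can be sketched in a few lines. The `model` interface (`predict`/`fit`), the `correct_batch` callback standing in for the human coder, and the batch size of 50 are all assumptions; swap in your real classifier and correction workflow.

```python
def human_in_the_loop(model, items, correct_batch, batch_size=50):
    """Predict in batches; after each batch a human corrects the labels,
    then the model is retrained on everything corrected so far."""
    corrected = []  # (item, gold_label) pairs accumulated across batches
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        preds = [model.predict(x) for x in batch]
        gold = correct_batch(batch, preds)  # human fixes the predictions
        corrected.extend(zip(batch, gold))
        model.fit(corrected)  # retrain on all corrected labels so far
    return corrected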
Reply
#4
Could it be that the real bottleneck isn't the model but the handwriting variety itself, the way sketches, text, and shorthand mix on a single page? I keep wondering if I'm barking up the wrong tree.
Reply
#5
I've considered pre-processing: batch OCR, then mapping symbols to a consistent set before the model sees anything. Not sure that fixes the core issue, but it might cut the noise.
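A sketch of the symbol-normalization half of that idea. The mapping table below is illustrative only (your notebooks' actual symbol variants would drive the real table), and it assumes OCR has already produced text.

```python
# Canonical replacements for symbol variants that OCR tends to emit
# inconsistently. Entries here are illustrative assumptions.
SYMBOL_MAP = {
    "Δ": "DELTA",   # heat symbol
    "→": "->",      # reaction arrow variants
    "⟶": "->",
    "℃": "degC",
    "µ": "u",       # micro sign to ASCII
}

def normalize(text, mapping=SYMBOL_MAP):
    """Replace known symbol variants with a single canonical token."""
    for raw, canon in mapping.items():
        text = text.replace(raw, canon)
    return text
```

Normalizing before the model sees the text at least removes one source of variance, so failures point more clearly at the model or the handwriting itself.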
Reply