What helps reduce AI hallucinations when parsing old handwritten lab notes?
#1
I've been trying to use a large language model to help me parse and categorize decades of old, handwritten lab notes in my field. The problem is, it keeps confidently generating plausible but completely fabricated chemical formulas and experimental details when the handwriting gets ambiguous. I'm worried this synthetic data generation is corrupting the dataset I'm trying to build for a proper analysis. Has anyone else hit this wall when using these tools for historical scientific text?
#2
I have been down this road. The model would invent new formulas whenever the handwriting got muddy, and it would confidently propose compounds that never existed. We tried a human-in-the-loop approach: a chemist verified each extraction and a second reviewer rejected anything questionable. It helped. On a 312-page batch we flagged 36 pages because the model's output diverged from any known lab procedure. We then restricted the model to clearly legible sections and left the ambiguous parts for manual review. It slowed things down but cut the false positives considerably.
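In case it helps anyone, the "restrict the model to clear sections" step can be done mechanically if your OCR engine reports per-segment confidence. This is a minimal sketch under assumptions: the `route_segments` name, the tuple format, and the 0.85 threshold are all mine, not anything from a real library, so tune them against your own validation pages.

```python
# Hypothetical sketch: route OCR segments by confidence so the model
# never sees ambiguous handwriting. Threshold is an assumption to tune.
CONF_THRESHOLD = 0.85

def route_segments(segments):
    """Split OCR output into a model-safe queue and a manual-review queue.

    `segments` is a list of (text, confidence) tuples, e.g. as reported
    per line by an OCR engine.
    """
    model_queue, manual_queue = [], []
    for text, conf in segments:
        if conf >= CONF_THRESHOLD:
            model_queue.append(text)   # legible enough to automate
        else:
            manual_queue.append(text)  # leave for a human reader
    return model_queue, manual_queue

segments = [
    ("2.5 g NaCl dissolved in 100 mL H2O", 0.97),
    ("heated to ??? deg for 3 h", 0.42),
]
clear, unclear = route_segments(segments)
```

The point is just that the split happens before the model ever runs, so there is nothing ambiguous for it to hallucinate over.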
#3
I tried a different tack. I ran a good OCR pass first, then asked the model to categorize the entries into reagents, apparatus, and conditions. The model still fills in missing data rather than leaving blanks, and some of the OCR misreads propagated into wrong category labels downstream. We ended up dropping the automated pass for those pages and transcribing them manually instead, saving the rest for pattern discovery later.
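One thing that helped us with the "fills in blanks" problem was a grounding check after the model runs: if a categorized field value does not literally appear in the OCR transcript, blank it out. A minimal sketch, assuming an exact-substring check (real pipelines would want normalization and fuzzier matching); the `ground_fields` helper and field names are hypothetical.

```python
# Hypothetical sketch: reject model-filled fields with no support in the
# OCR text, forcing blanks instead of plausible-looking inventions.
def ground_fields(record, ocr_text):
    """Keep a categorized field only if its value literally appears in
    the OCR transcript; otherwise set it to None for manual review."""
    grounded = {}
    for field, value in record.items():
        grounded[field] = value if value and value in ocr_text else None
    return grounded

ocr_text = "added 5 mL HCl to the flask, stirred"
record = {"reagent": "HCl", "apparatus": "flask", "conditions": "reflux 2 h"}
grounded = ground_fields(record, ocr_text)
# "reflux 2 h" never appears in the OCR text, so it gets blanked
```

A None here is a feature, not a failure: it is exactly the entry you want a human to look at.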
#4
I have wondered whether the real blocker is not the model but the handwriting and the old terminology. Some reagent names are spelled many different ways across the decades, and the model would guess a different molecule than the one actually in the note. We started extracting only the unambiguous snippets and treating the rest as images to be read later by a human or a specialized OCR pass. Is the bigger problem really the hallucination, or the chaos of the handwriting and terminology?
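For the spelling-variant problem, one cheap option before anything hits the model is to normalize reagent names against a curated canonical list with stdlib fuzzy matching, and send anything below the cutoff to manual review. A minimal sketch: the canonical list, cutoff, and `normalize` helper are assumptions for illustration, not a real vocabulary.

```python
# Hypothetical sketch: map historical spelling variants of reagent names
# to one canonical form, using Python's stdlib difflib for fuzzy matching.
import difflib

# Assumed canonical vocabulary; in practice this would be a curated list.
CANONICAL = ["sodium chloride", "sulfuric acid", "potassium permanganate"]

def normalize(name, cutoff=0.75):
    """Return the closest canonical reagent name, or None if no candidate
    is close enough (so ambiguous names go to manual review)."""
    hits = difflib.get_close_matches(name.lower(), CANONICAL, n=1, cutoff=cutoff)
    return hits[0] if hits else None
```

So an old British spelling like "sulphuric acid" maps to the canonical "sulfuric acid", while an unrecognizable scrawl returns None instead of a guess.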
#5
A quick note from a rushed mind: we pushed the model to fill in blanks under deadline pressure, and that caused trouble. A false positive about a catalyst slipped through, so we quarantined that dataset and re-checked it under stricter inclusion rules. It is still messy.