Should I prompt an LLM to turn messy data into real hypotheses?
#1
I’m trying to use a large language model to help generate hypotheses from my messy experimental data, but I’m hitting a wall where its suggestions feel more like plausible text patterns than testable scientific ideas. I’m unsure how to prompt it or structure the data to push past this pattern-matching behavior toward genuine, novel inference.
Reply
#2
I tried feeding messy experimental data into an LLM and asking for hypotheses that are falsifiable with a quick test. It often gave me plausible narratives rather than testable claims. I ended up reformatting the data into clean tables with feature names, a stated uncertainty, and a required testable prediction, then asked for three hypotheses. The output still felt pattern-y, but the predictions were more concrete, which helped me prune.
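For concreteness, here's roughly the kind of prompt-building I mean, sketched in Python. Treat it as a sketch, not my exact pipeline: the table is toy data and call_llm is just a placeholder for whatever client you actually use.

import pandas as pd

def build_hypothesis_prompt(df: pd.DataFrame, uncertainty_note: str) -> str:
    # Render a small, clean table the model can actually read.
    table = df.to_string(index=False)
    return (
        "Data (cleaned, one row per run):\n"
        f"{table}\n\n"
        f"Known uncertainty: {uncertainty_note}\n\n"
        "Propose exactly 3 hypotheses. For each, give:\n"
        "1. The hypothesis in one sentence.\n"
        "2. A testable prediction with a concrete threshold.\n"
        "3. A quick experiment that could falsify it.\n"
    )

df = pd.DataFrame({
    "temperature_C": [25, 37, 45],
    "yield_pct": [12.1, 18.4, 9.7],
})
prompt = build_hypothesis_prompt(df, "yield_pct is +/- 1.5 from assay noise")
# response = call_llm(prompt)  # call_llm is a stand-in for your actual client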
Reply
#3
I think part of the issue is how we phrase the prompt. If you just ask for explanations, you get essays; if you ask for hypotheses with specific, falsifiable predictions and a stated threshold or p-value, it helps a bit. Still not magical.
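Something like this template is what I mean; the field names and the p < 0.05 threshold are only examples, not a recommendation.

# Sketch of the phrasing; adapt the required fields to your own data.
HYPOTHESIS_PROMPT = """Do NOT write an explanatory essay.

From the data below, return 3 hypotheses. Each must include:
- a one-sentence causal claim,
- a falsifiable prediction of the form "metric X will change by at least Y",
- the statistical test you'd run and the threshold (e.g. p < 0.05).

Data:
{data_block}
"""

data_block = "temperature_C, yield_pct\n25, 12.1\n37, 18.4\n45, 9.7"  # toy rows
prompt = HYPOTHESIS_PROMPT.format(data_block=data_block)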
Reply
#4
I kept wondering if the problem is the data quality, not the model. I cleaned some rows, added missingness indicators, and split prompts into 'what would falsify this idea?' versus 'what would support it?'. The model still echoed patterns from its training data. Maybe it's predicting the next sentence more than proposing experiments.
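Here's a rough sketch of the preprocessing I mean, with made-up column names; the two prompts at the end are the falsify/support split.

import pandas as pd

# Toy frame with holes; column names are invented for illustration.
df = pd.DataFrame({"dose": [1.0, None, 2.5], "response": [0.4, 0.9, None]})

# Explicit missingness indicators, so the model sees the gaps rather than
# silently imputed values.
for col in ["dose", "response"]:
    df[f"{col}_missing"] = df[col].isna().astype(int)
df = df.fillna({"dose": df["dose"].median(), "response": df["response"].median()})

# One candidate idea, two separate questions.
hypothesis = "response saturates above a dose of 2.0"
table = df.to_string(index=False)
falsify_prompt = f"Given this data:\n{table}\nWhat result would falsify: {hypothesis}?"
support_prompt = f"Given this data:\n{table}\nWhat result would support: {hypothesis}?"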
Reply
#5
Sometimes I drift into talking about mechanisms, then realize I'm just restating the same patterns because the data invites them. I tried adding constraints, like limiting each proposed mechanism to no more than two causal steps and avoiding buzzwords, and it helped a touch, but I can't tell if that's real novelty or just friction.
Reply

