June 28, 2026

Caveat Emptor: When AlphaFold3 Confidently Gets Protein-DNA Wrong

Three things you need to know before we begin. One: AlphaFold3 is staggeringly good at predicting how proteins fold. Two: when you ask it how proteins grab onto DNA, it gets noticeably wobblier. Three: the way we're building future training data might be quietly teaching these tools to stay wrong. Ladies and gentlemen of the research community, the prosecution would like to present its case.

Exhibit A: The Star Witness Has a Blind Spot

Let me establish the defendant's credentials, because they're genuinely impressive. AlphaFold and its successors took a problem that stumped biology for fifty years - how does a string of amino acids fold into a working 3D machine? - and basically solved it (Jumper et al., 2021). AlphaFold3 went further, modeling proteins tangled up with other proteins, ligands, and yes, nucleic acids (Abramson et al., 2024). It's the kind of tool that makes structural biologists look at their old crystallography equipment the way you look at a flip phone.

But here's the thing about a witness who's brilliant 95% of the time: you start believing them the other 5% too. And that's exactly where the paper on trial today, Esler et al. (2025), plants its flag.

Exhibit B: The Minimal Degree of Difficulty

Esler and colleagues did something clever. Instead of throwing a fiendishly hard protein-DNA complex at AlphaFold3 and watching it flail, they picked a system with what they politely call a "minimal degree of difficulty" - an easy case, the structural-biology equivalent of a layup. The evidence shows that even here, on friendly territory, the predictions of how the protein actually contacts the DNA came out shaky.

Think of it this way: the model is like a brilliant exchange student who has read every book about your city but has never walked the streets. Ask it to describe the architecture and it nails the broad strokes. Ask it which specific door opens with which specific key, and it starts confidently giving you directions to a coffee shop that closed in 2019. Protein-DNA recognition lives entirely in those door-and-key details - which amino acid reaches into which groove of the double helix to read which base of the DNA sequence.
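For the code-minded members of the jury, here is one way to make "door-and-key details" concrete. This is a minimal sketch and not the analysis from the paper: it treats a contact as any protein atom lying within a distance cutoff of any DNA atom, then asks how many contacts in a reference structure a predicted structure recovers. The 3.5 angstrom cutoff, the residue labels, and every coordinate are assumptions invented for illustration; a real comparison would read atoms from structure files and would likely use a more careful contact definition.

```python
from math import dist

def contacts(protein_atoms, dna_atoms, cutoff=3.5):
    """Return the set of (protein_residue, dna_base) pairs that have
    at least one atom pair within `cutoff` angstroms of each other."""
    pairs = set()
    for p_label, *p_xyz in protein_atoms:
        for d_label, *d_xyz in dna_atoms:
            if dist(p_xyz, d_xyz) <= cutoff:
                pairs.add((p_label, d_label))
    return pairs

def contact_agreement(predicted, reference):
    """Return (precision, recall) of predicted contacts against reference contacts."""
    shared = predicted & reference
    precision = len(shared) / len(predicted) if predicted else 1.0
    recall = len(shared) / len(reference) if reference else 1.0
    return precision, recall

# Toy atoms as (label, x, y, z). All values are hypothetical.
dna = [("G5", 0.0, 0.0, 0.0), ("C6", 10.0, 0.0, 0.0)]
reference_protein = [("Arg12", 3.0, 0.0, 0.0), ("Asn15", 10.0, 3.0, 0.0)]
# In the "prediction", Asn15 has drifted 3 angstroms away from its base.
predicted_protein = [("Arg12", 3.0, 0.0, 0.0), ("Asn15", 10.0, 6.0, 0.0)]

reference_contacts = contacts(reference_protein, dna)
predicted_contacts = contacts(predicted_protein, dna)
precision, recall = contact_agreement(predicted_contacts, reference_contacts)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the overall shape of the prediction is nearly identical to the reference, yet half of the reference contacts are missing. That is the sense in which a model can nail the architecture and still hand you the wrong key.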

Exhibit C: The Self-Reinforcing Mistake

Now I submit to you the part of this paper that should make everyone sit up, because it's less a bug report and more a warning about the whole pipeline.

Here's the trap. Researchers increasingly build "hybrid models" - they take a computationally predicted structure and dock it into low-resolution experimental data (like fuzzy cryo-EM density maps), often with little refinement or quality control. Fine, sometimes that's the best you can do. But these hybrid models then get deposited, and deposited structures become training data for the next generation of predictors.

Consider the following loop: the tool makes an imperfect protein-DNA model, a human stuffs it into blurry experimental data without much scrutiny, that questionable result enters the training set, and the next model learns to confidently reproduce the same error. It's a feedback loop of mediocrity - the machine-learning version of a rumor that becomes "fact" because enough people repeated it. The AI isn't lying to you on purpose. It's just been handed a textbook that nobody proofread, by previous students who also didn't proofread it.
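The loop can be put into a deliberately crude toy model. To be clear, this is an assumption-laden sketch and not something the paper measures: suppose each new predictor inherits the average contact error of its training set and then adds a fixed error of its own, and suppose some fraction of that training set is unchecked hybrid models carrying the previous predictor's error. The function name, the recurrence, and every number below are hypothetical.

```python
def simulate_error(generations, hybrid_fraction,
                   curated_error=0.02, model_error=0.05):
    """Toy recurrence for contact error across predictor generations.

    hybrid_fraction: share of training structures that are unchecked
                     hybrid models built from the previous predictor.
    curated_error:   assumed error rate of carefully validated structures.
    model_error:     assumed error each predictor adds on top of its data.
    """
    # Generation zero is trained on curated structures only.
    error = curated_error + model_error
    history = [error]
    for _ in range(generations):
        # The training set mixes curated structures with hybrid models
        # that carry the previous generation's error.
        training_error = ((1 - hybrid_fraction) * curated_error
                          + hybrid_fraction * error)
        error = training_error + model_error
        history.append(error)
    return history

clean = simulate_error(generations=10, hybrid_fraction=0.0)
polluted = simulate_error(generations=10, hybrid_fraction=0.5)
print(f"no hybrids: {clean[-1]:.3f}, half hybrids: {polluted[-1]:.3f}")
```

Under these assumptions the error does not explode, but it settles at curated_error + model_error / (1 - hybrid_fraction), so the more unchecked hybrid models the training set contains, the higher the floor the next predictor can never get below.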

The Verdict

The defense will object that this sounds like an attack on AlphaFold3. It isn't. The honest reading is gentler and more useful: these tools are extraordinary, and because they're extraordinary, casual users hand them a level of trust the tools haven't earned in every category. Protein-DNA contacts are one of those categories - for now.

The real takeaway is about hygiene. If we're going to feed model outputs back into model inputs, somebody has to stand at the door checking IDs, or the whole training set slowly fills with plausible fiction. (For anyone trying to actually see these tangled complexes while reasoning about them, visual mapping tools like mapb2.io can help untangle which contact connects to what - though no diagram fixes bad underlying data.)

The science here isn't "AI is overrated." It's "buyer beware" - which, conveniently, is exactly what the title told you. Caveat emptor, indeed.

References

Esler, M.A., Werther, R., Doyle, L.A., et al. (2025). Caveat emptor: predicting and modeling protein-DNA recognition and binding via machine-learning computational approaches. Nucleic Acids Research. DOI: 10.1093/nar/gkag608 (PMID: 42345194)
Abramson, J., Adler, J., Dunger, J., et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630, 493-500. DOI: 10.1038/s41586-024-07487-w
Jumper, J., Evans, R., Pritzel, A., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596, 583-589. DOI: 10.1038/s41586-021-03819-2
Baek, M., DiMaio, F., Anishchenko, I., et al. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science, 373, 871-876. DOI: 10.1126/science.abj8754

Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.