That sounds like the plot of a mediocre sci-fi movie, but a team from Chalmers University of Technology and Tsinghua University just did exactly that. Their new model, Human2, is a genome-scale metabolic model (GEM) of the entire human body - and they used large language models to help build it. Published in PNAS, this paper is basically what happens when ChatGPT gets a biochemistry internship and actually takes it seriously.
Wait, What's a Genome-Scale Metabolic Model?
Think of a GEM as a giant spreadsheet that tracks every known chemical reaction happening inside your cells. Every enzyme, every metabolite, every pathway from "food goes in" to "energy comes out" (and about ten thousand steps in between). Scientists have been building these maps for years, but curating them is brutal work - imagine fact-checking a Wikipedia article that has 13,000 entries, each one referencing a different biochemistry textbook. Now imagine doing that by hand.
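To make the "giant spreadsheet" idea concrete, here is a minimal sketch of the bookkeeping behind a GEM: each reaction lists which metabolites it consumes and produces, and a set of reaction rates (fluxes) is at steady state when nothing piles up or runs out. The three-reaction network and the function names below are invented for illustration and are not taken from Human2.

```python
# Toy network (hypothetical, not taken from Human2):
#   R1:   -> A   (uptake)
#   R2: A -> B   (conversion)
#   R3: B ->     (secretion)
metabolites = ["A", "B"]
reactions = {
    "R1": {"A": 1},
    "R2": {"A": -1, "B": 1},
    "R3": {"B": -1},
}


def net_production(fluxes):
    """Return the net production rate of each metabolite for the given fluxes."""
    totals = {met: 0.0 for met in metabolites}
    for rxn, stoich in reactions.items():
        for met, coeff in stoich.items():
            totals[met] += coeff * fluxes[rxn]
    return totals


def is_steady_state(fluxes, tol=1e-9):
    """True when every metabolite is produced exactly as fast as it is consumed."""
    return all(abs(rate) <= tol for rate in net_production(fluxes).values())


balanced = {"R1": 2.0, "R2": 2.0, "R3": 2.0}
unbalanced = {"R1": 2.0, "R2": 1.0, "R3": 1.0}  # A accumulates

print(is_steady_state(balanced))    # True
print(is_steady_state(unbalanced))  # False
```

A real human GEM does the same accounting, just across thousands of reactions instead of three.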
That's where Luo et al. had a clever idea: what if GPT could do the proofreading?
The AI Doesn't Build the Model. It Quality-Checks It.
Here's the part that separates this from the usual "we slapped an LLM on it" paper. The researchers didn't ask GPT to generate biochemical reactions from scratch (that would be terrifying). Instead, they set up a three-layer quality control system: LLM-assisted review for cross-referencing literature and database descriptions, automated GitHub Actions for structural checks, and good old-fashioned human expert validation. The LLM acts less like an architect and more like that one coworker who actually reads the entire email chain before hitting Reply All.
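What might an automated structural check look like? One common example in metabolic modeling is elemental balance: the atoms going into a reaction should equal the atoms coming out. The sketch below is an assumption about the kind of check a CI job could run, not the actual Human-GEM workflow; the function names are hypothetical and the parser only handles simple formulas without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def element_imbalance(reaction, formulas):
    """Return {element: net atom count} for every element that does not balance.

    `reaction` maps metabolite name -> stoichiometric coefficient
    (negative for substrates, positive for products).
    """
    net = Counter()
    for met, coeff in reaction.items():
        for element, n in parse_formula(formulas[met]).items():
            net[element] += coeff * n
    return {element: n for element, n in net.items() if n != 0}


# Neutral formulas, used here purely for illustration.
formulas = {
    "glucose": "C6H12O6",
    "ATP": "C10H16N5O13P3",
    "G6P": "C6H13O9P",
    "ADP": "C10H15N5O10P2",
}

hexokinase = {"glucose": -1, "ATP": -1, "G6P": 1, "ADP": 1}
missing_adp = {"glucose": -1, "ATP": -1, "G6P": 1}  # a curation slip

print(element_imbalance(hexokinase, formulas))   # {} means balanced
print(element_imbalance(missing_adp, formulas))  # non-empty means flag it
```

A check like this cannot tell you whether a reaction is biologically correct, which is why the literature review and the human experts sit alongside it.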
The result? Human2 scored 81% on the MEMOTE quality benchmark and passed all FROG analysis tasks - a measurable improvement over previous human GEMs (Luo et al., 2026).
Now It Gets Personal - Literally
Human2 isn't just one generic model. The team built tissue- and organ-specific versions tailored to different sex and age groups by integrating transcriptomic, proteomic, and kinetic data. That specificity pays off: the model revealed that arachidonic acid and leukotriene metabolism - pathways tied to inflammation - differ significantly between male and female metabolic profiles. That's not a footnote; that's the kind of insight that could reshape how we think about sex-specific drug responses.
On the methods side, the team used GECKO 3.0 to layer enzyme constraints on top of the metabolic network, building what they call an enzyme-constrained dynamic model. Translation: the model doesn't just track which reactions happen, but how fast each one can go given how much enzyme is actually available.
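The core idea behind an enzyme constraint fits in one line: a reaction's flux cannot exceed the enzyme's turnover number (kcat) times the amount of enzyme present. The sketch below illustrates that arithmetic only. It is not GECKO's API, and the function names, units, and numbers are assumptions chosen for the example.

```python
SECONDS_PER_HOUR = 3600


def max_flux(kcat_per_s, enzyme_mmol_per_gdw):
    """Upper bound on flux (mmol/gDW/h) from v <= kcat * [E].

    kcat is given per second, so it is converted to per hour first.
    """
    return kcat_per_s * SECONDS_PER_HOUR * enzyme_mmol_per_gdw


def capped_flux(requested_flux, kcat_per_s, enzyme_mmol_per_gdw):
    """Flux the reaction can actually carry once the enzyme cap is applied."""
    return min(requested_flux, max_flux(kcat_per_s, enzyme_mmol_per_gdw))


# Hypothetical enzyme: kcat = 100 per second, abundance = 1e-5 mmol/gDW.
limit = max_flux(100, 1e-5)              # about 3.6 mmol/gDW/h
print(capped_flux(10.0, 100, 1e-5))      # demand exceeds the cap, so flux is limited
print(capped_flux(1.0, 100, 1e-5))       # demand is below the cap, so flux is unchanged
```

Without the cap, a model is free to push as much flux as stoichiometry allows; with it, scarce or slow enzymes become the bottleneck, which is closer to how cells behave.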
A Whole Body in a Computer
The real showpiece is the whole-body model (WBM). The team stitched their organ-specific models together into a dynamic simulation that tracks metabolite exchanges between organs under different nutritional states - from "just had lunch" to "haven't eaten since yesterday." This is the first enzyme-constrained WBM to simulate interorgan metabolism dynamically, and it opens the door to modeling things like fasting physiology, metabolic disease progression, and nutrient-drug interactions at a systems level.
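To give a feel for what "dynamic" means here, the toy below tracks a single metabolite in a shared blood pool while one organ releases it and another takes it up, under a fed and a fasted setting. It is a deliberately crude stand-in: a single differential equation stepped forward with Euler's method, where the real whole-body model uses enzyme-constrained organ models. Every parameter value and name is invented for illustration.

```python
def simulate_blood_pool(initial_pool, intake_rate, liver_release_rate,
                        muscle_uptake_coeff, hours, dt=0.1):
    """Forward-Euler simulation of one shared blood metabolite pool.

    All quantities are hypothetical and in arbitrary units:
        d(pool)/dt = intake + liver_release - muscle_uptake_coeff * pool
    Returns the pool size at every time step, including the start.
    """
    pool = initial_pool
    trajectory = [pool]
    steps = int(round(hours / dt))
    for _ in range(steps):
        uptake = muscle_uptake_coeff * pool
        pool += dt * (intake_rate + liver_release_rate - uptake)
        pool = max(pool, 0.0)  # a pool cannot go negative
        trajectory.append(pool)
    return trajectory


# "Just had lunch": dietary intake is high, the liver releases nothing.
fed = simulate_blood_pool(5.0, intake_rate=2.0, liver_release_rate=0.0,
                          muscle_uptake_coeff=0.2, hours=24)

# "Haven't eaten since yesterday": no intake, the liver supplies the pool.
fasted = simulate_blood_pool(5.0, intake_rate=0.0, liver_release_rate=0.6,
                             muscle_uptake_coeff=0.2, hours=24)

print(round(fed[-1], 1), round(fasted[-1], 1))
```

In this toy, the fed pool drifts toward a higher level than the fasted one because supply differs while uptake follows the same rule. The published model asks the same kind of question with many metabolites, many organs, and enzyme limits on every step.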
Why This Matters Beyond the Cool Factor
GEMs are already used to identify drug targets, discover biomarkers, and model host-microbiome interactions (Gu et al., 2019). But curation has always been the bottleneck - it's slow, expensive, and prone to human error. By demonstrating that LLMs can reliably assist in model curation without sacrificing accuracy, this paper points toward a future where metabolic models can be updated as fast as the literature grows. A recent review points the same way, with AI-assisted pathway reconstruction gaining traction across the field (Wei et al., 2026).
The entire Human2 project lives on GitHub (SysBioChalmers/Human-GEM), which means you can go poke around the codebase right now. Open science at its finest - or at least at its most version-controlled.
The Bottom Line
Human2 isn't an AI fever dream about replacing biologists. It's a practical demonstration that LLMs can handle the tedious, error-prone parts of scientific model building while humans focus on the creative, interpretive work. The metabolic model is more accurate, the curation pipeline is faster, and the whole-body simulation is genuinely new territory. Not bad for a collaboration between biochemists and a language model that, left unsupervised, would probably try to convince you that mitochondria is spelled with a silent 'q'.
References
- Luo, J., Wang, H., Moyer, D., Guo, Z., Robinson, J.L., Gustafsson, J., Anton, M., Chen, Y., Kerkhoven, E.J., Nielsen, J., & Li, F. (2026). Reconstruction of human metabolic models with large language models. Proceedings of the National Academy of Sciences, 123. DOI: 10.1073/pnas.2516511123 | PMID: 41950094
- Gu, C., Kim, G.B., Kim, W.J., Kim, H.U., & Lee, S.Y. (2019). Current status and applications of genome-scale metabolic models. Genome Biology, 20, 121. DOI: 10.1186/s13059-019-1730-3
- Wei, L., et al. (2026). AI revolutionizes cellular metabolic pathway reconstruction. Trends in Biochemical Sciences. DOI: 10.1016/j.tibs.2026.01.003
- Lawson, C.E., et al. (2024). Machine learning for the advancement of genome-scale metabolic modeling. Biotechnology Advances, 74, 108400. DOI: 10.1016/j.biotechadv.2024.108400
- Chen, Y., & Nielsen, J. (2024). Leveraging large language models for metabolic engineering design. bioRxiv. DOI: 10.1101/2024.09.09.612023
Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.