Diagnosing Parkinson's disease from a blood draw taken a decade before tremors start has been, until recently, a medical fantasy roughly on par with reading tea leaves - except tea leaves don't cost $44 billion a year in healthcare spending. A new study in Brain just moved that fantasy closer to reality by training machine learning on plasma proteins from people who didn't even know they were sick yet.
The 20-Year Head Start Nobody's Using
Here's what makes Parkinson's disease particularly cruel: non-motor symptoms like sleep disorders, loss of smell, and constipation can show up as much as 20 years before the classic tremor and shuffling gait that lead to a diagnosis. Your body may be sending distress signals for up to two decades, and medicine has been politely ignoring them because there's no reliable way to decode the message.
Nieves-Rodriguez and colleagues at Johnson & Johnson decided to stop ignoring those signals. They grabbed proteomics data from the UK Biobank - specifically, 2,937 plasma proteins measured using the Olink platform in participants who later developed Parkinson's - and asked a simple question: can machine learning spot the disease brewing in someone's blood up to 14 years before a neurologist would (Nieves-Rodriguez et al., 2025)?
2,937 Proteins Walk Into a Model
The team started with nearly 3,000 proteins. They ended up needing just 23.
That panel of 23 proteins achieved an AUC of 0.78 for predicting Parkinson's in people who hadn't been diagnosed yet, and 0.795 for prevalent cases. For context, an AUC of 1.0 means perfect prediction and 0.5 means you're flipping a coin. So 0.78 is solidly in "this is actually useful" territory - especially when you're predicting something over a decade in advance from a blood sample.
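To make the AUC number less abstract, here is a minimal sketch of what it measures: the chance that a randomly picked future patient gets a higher risk score than a randomly picked person who stays healthy. The function name `auc` and the risk scores are invented for illustration; they are not data or code from the study.

```python
from itertools import product


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control.

    Ties count as half a win. This is the pairwise (Mann-Whitney)
    definition of the area under the ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model risk scores, made up for this example.
future_cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2, 0.1]

print(auc(future_cases, controls))
```

With these made-up numbers, the cases win 16 of the 20 possible case-versus-control pairings, so the AUC is 0.8. An AUC of 0.78 reads the same way: in about 78 of every 100 such pairings, the person who later develops Parkinson's gets the higher score.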
They validated the findings in an independent cohort using 16 of those proteins (the ones available across both datasets), hitting an AUC of 0.76. Not a home run, but a very respectable line drive from a completely separate group of patients.
What the Proteins Are Actually Doing
The more interesting finding might be the biological pathways lighting up years before diagnosis. Pathways related to neuron death and amyloid-beta clearance showed enrichment up to 9 years before anyone received a Parkinson's diagnosis. The disease isn't just lurking - it's actively remodeling the body's protein landscape while the person feels fine enough to fill out UK Biobank questionnaires.
A co-expression network analysis revealed distinct protein modules tied to both disease risk and time-to-diagnosis. Translation: different clusters of proteins behave differently depending on how far out you are from getting diagnosed. The molecular signature of "you'll get Parkinson's in 12 years" looks different from "you'll get Parkinson's in 3 years."
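For a rough feel of what "protein modules" means, the sketch below groups proteins whose levels rise and fall together across participants. It is a deliberately simplified stand-in (threshold the pairwise Pearson correlations, then take connected groups), not the method used in the paper, and the protein names, measurements, and the 0.8 threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def coexpression_modules(levels, threshold=0.8):
    """Group proteins into modules of strongly correlated members.

    levels maps a protein name to its measurements, one per participant,
    in the same participant order for every protein. Two proteins are
    linked when |correlation| >= threshold (an assumed cutoff); modules
    are the connected groups of the resulting network.
    """
    names = sorted(levels)
    neighbours = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(levels[a], levels[b])) >= threshold:
                neighbours[a].add(b)
                neighbours[b].add(a)

    modules, seen = [], set()
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, module = [start], []
        while stack:  # depth-first walk over linked proteins
            node = stack.pop()
            module.append(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        modules.append(sorted(module))
    return modules


# Hypothetical measurements for four made-up proteins in five participants.
levels = {
    "protein_A": [1, 2, 3, 4, 5],
    "protein_B": [2, 4, 6, 8, 11],
    "protein_C": [5, 1, 4, 2, 3],
    "protein_D": [10, 2, 8, 4, 6.5],
}

print(coexpression_modules(levels))
```

In this toy data, protein_A and protein_B track each other, as do protein_C and protein_D, so they fall into two separate modules. Real co-expression analyses typically use more careful weighting and clustering, and then ask whether each module's overall level relates to disease risk or to time until diagnosis.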
They're Not Alone in This Race
This study lands in a crowded and fast-moving field. Hallqvist et al. published work in Nature Communications showing an 8-protein model that hit 79% accuracy for pre-motor Parkinson's detection up to 7 years out (Hallqvist et al., 2024). Gan et al. in Nature Aging used the same UK Biobank Olink data to build a 16-protein model achieving an AUC of 0.887 for 5-year prediction (Gan et al., 2025). You et al. combined 22 proteins with clinical measures in Neurology and reached an AUC of 0.832 (You et al., 2024).
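The convergence is striking: multiple teams, different analytical approaches, all landing on panels of 8-23 blood proteins whose reported performance (AUC in most studies, accuracy in Hallqvist et al.) falls between roughly 0.76 and 0.89. At least two of these studies draw on the same UK Biobank data, so the agreement is not fully independent, but the signal looks real.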
The Elephant in the Exam Room
There's a question nobody in these papers wants to dwell on: what do you do when you tell someone they'll likely develop Parkinson's in a decade? No disease-modifying therapy for Parkinson's has been approved yet. You'd be handing someone a prediction without a prescription.
But that framing misses the point. Early prediction enables clinical trial enrollment when interventions might actually work - before neurodegeneration becomes irreversible. It also opens the door to lifestyle modifications and monitoring strategies. And the UK Biobank Pharma Proteomics Project is about to profile all 500,000 participants with over 5,400 proteins, which should sharpen these predictive panels considerably.
The proteins are talking. Machine learning is finally learning to listen. Now we need something worth saying back.
References
- Nieves-Rodriguez, S., Hou, L., Whelan, C.D., Li, S., & Doostparast Torshizi, A. (2025). Leveraging machine learning to predict Parkinson's disease using pre-symptomatic proteomics data. Brain, 149(4), 1254-1267. DOI: 10.1093/brain/awaf303 | PMID: 40804706
- Hallqvist, J., Bartl, M., Dakna, M., et al. (2024). Plasma proteomics identify biomarkers predicting Parkinson's disease up to 7 years before symptom onset. Nature Communications. DOI: 10.1038/s41467-024-48961-3 | PMID: 38890280
- Gan, Y.-H., Ma, L.-Z., Zhang, Y., et al. (2025). Large-scale proteomic analyses of incident Parkinson's disease reveal new pathophysiological insights and potential biomarkers. Nature Aging. DOI: 10.1038/s43587-025-00818-0 | PMID: 39979637
- You, J., Wang, L., Wang, Y., et al. (2024). Prediction of Future Parkinson Disease Using Plasma Proteins Combined With Clinical-Demographic Measures. Neurology. DOI: 10.1212/WNL.0000000000209531 | PMID: 38976826
- Chaudhry, F., Kim, T.W., Elemento, O., & Betel, D. (2024). Machine learning analysis of population-wide plasma proteins identifies hormonal biomarkers of Parkinson's Disease. medRxiv. DOI: 10.1101/2024.12.21.24313256 | PMID: 39763525
Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.