The old approach to lung cancer prevention was basically a leaky roof with a bucket under it: wait until risk looks obvious, mostly through age and smoking history, then try to catch the damage before the ceiling collapses. This new Cell paper by Pandya and a frankly conference-banquet-sized author list asks a better question: can we spot the damp patch in the attic years earlier, before the tumor has moved in and started forwarding its mail?
The answer, cautiously but intriguingly, is: maybe yes. The team used machine learning on blood plasma protein data from more than 48,000 UK Biobank participants and found a 14-protein signature that predicted lung cancer risk more than five years before diagnosis. Then, because Reviewer 2 apparently owns a whistle and a clipboard, they validated it across eight cohorts around the world, including a cohort of non-smokers (Pandya et al., 2026).
The Blood Test Is Not Sniffing Out A Tiny Tumor
The cool part is not just “AI found biomarkers,” which in research press-release years is approximately as common as coffee at a poster session. The interesting bit is what the signature seems to represent.
The authors argue that these proteins are not simply leaking from a hidden tumor. Instead, they look like signals from an inflamed, tumor-friendly lung environment. Think of cancer not as one rogue cell suddenly deciding to become a Bond villain, but as a neighborhood where the zoning rules, trash pickup, and local gossip have all started favoring bad behavior.
That fits with earlier work showing that particulate air pollution can promote lung adenocarcinoma by triggering macrophages to release interleukin-1 beta, or IL-1β, a pro-inflammatory cytokine with the general vibe of someone pulling the fire alarm and then refusing to leave the building (Hill et al., 2023). Wikipedia-level background: IL-1β is one of the immune system’s inflammatory messenger proteins, and particulate matter has long been tied to lung inflammation, COPD, pulmonary fibrosis, and lung cancer risk.
Machine Learning As The Lab’s Overcaffeinated Pattern Spotter
The machine learning here is not magic, and it is not a robot doctor wearing a tiny white coat. It is a statistical pattern-finder trained to connect plasma proteins, clinical features like age and smoking status, and future cancer outcomes. The useful part is that proteins can act like a biological receipt: not just what genes you carry, but what your body is currently doing.
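To make "statistical pattern-finder" concrete, here is a toy sketch in plain Python. It is not the authors' pipeline: it fits a small logistic regression on synthetic data, and every name and number in it (protein_a_z, protein_b_z, the effect sizes, the cohort size) is invented for illustration. The only point is that a model can learn which measurements track a future outcome and which are noise.

```python
import math
import random

# Hypothetical feature names and made-up "true" effects; protein_b is pure noise.
FEATURES = ["age_z", "smoker", "protein_a_z", "protein_b_z"]
TRUE_WEIGHTS = [0.8, 1.0, 1.2, 0.0]
TRUE_BIAS = -2.0


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_cohort(n: int, seed: int = 0):
    """Simulate standardized features and a 0/1 future-diagnosis label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [
            rng.gauss(0, 1),            # age, standardized
            float(rng.random() < 0.3),  # smoker flag
            rng.gauss(0, 1),            # informative protein
            rng.gauss(0, 1),            # uninformative protein
        ]
        logit = TRUE_BIAS + sum(w * v for w, v in zip(TRUE_WEIGHTS, x))
        rows.append(x)
        labels.append(1 if rng.random() < sigmoid(logit) else 0)
    return rows, labels


def predict_risk(weights, bias, x) -> float:
    """Predicted probability of the outcome for one participant."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, labels, lr: float = 0.5, epochs: int = 300):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = predict_risk(weights, bias, x) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


rows, labels = make_synthetic_cohort(1000)
weights, bias = fit_logistic(rows, labels)
for name, w in zip(FEATURES, weights):
    print(f"{name:12s} learned weight {w:+.2f}")
```

On this made-up cohort the noise protein should end up with a weight near zero while the informative one does not. Real proteomic models deal with thousands of correlated proteins and far messier outcomes, which is exactly why they need outside validation.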
That matters because current lung cancer screening is blunt. Low-dose CT screening saves lives, but eligibility often leans heavily on smoking history. That misses never-smokers, people exposed to pollution, and anyone whose lungs are quietly running an inflammation side quest.
Recent studies have been pushing in the same direction. Plasma proteomic models have shown promise for early lung cancer prediction (Johnson et al., 2026), and broader UK Biobank work suggests sparse protein signatures can improve risk prediction across many diseases (Gadd et al., 2024). Biomarker discovery, of course, has a long history of overpromising, underdelivering, and then asking for another grant cycle. So validation across cohorts is not decorative. It is the load-bearing wall.
The Canakinumab Plot Twist
The paper gets especially interesting when it revisits CANTOS, the trial of canakinumab, a drug that blocks IL-1β. CANTOS was originally designed for cardiovascular disease, but researchers noticed fewer lung cancer cases in treated participants (Ridker et al., 2017). The problem was that treating everyone is expensive, medically messy, and statistically rude.
Pandya and colleagues re-analyzed 4,651 CANTOS participants and found that people with a high baseline 14-protein signature appeared to benefit most. Their lung cancer risk was nearly halved, and the number needed to treat to prevent one lung cancer case fell to 55. That is not “everyone gets an antibody drug because the algorithm said vibes.” It is closer to cholesterol medicine logic: identify a measurable risk state, then treat the people most likely to benefit.
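The arithmetic behind that number is worth a quick look. Number needed to treat is one divided by the absolute risk reduction, so an NNT of 55 implies an absolute reduction of roughly 1/55, or about 1.8 percentage points, over whatever follow-up period the estimate covers. The sketch below uses hypothetical baseline risks (they are not figures from the paper) to show why the same relative benefit yields a much smaller NNT in a higher-risk group.

```python
def number_needed_to_treat(control_risk: float, relative_risk: float) -> float:
    """Return NNT = 1 / absolute risk reduction.

    control_risk: probability of the event without treatment.
    relative_risk: treated risk divided by control risk (0.5 means "halved").
    """
    if not 0.0 < control_risk <= 1.0:
        raise ValueError("control_risk must be in (0, 1]")
    if not 0.0 <= relative_risk < 1.0:
        raise ValueError("relative_risk must be in [0, 1) for a beneficial treatment")
    absolute_risk_reduction = control_risk * (1.0 - relative_risk)
    return 1.0 / absolute_risk_reduction


# Hypothetical baseline risks, chosen only for illustration.
scenarios = {"unselected population": 0.005, "high-signature group": 0.04}
for label, risk in scenarios.items():
    nnt = number_needed_to_treat(risk, relative_risk=0.5)
    print(f"{label}: treat about {nnt:.0f} people to prevent one case")

# Absolute risk reduction implied by the NNT of 55 quoted above.
print(f"NNT of 55 implies an absolute risk reduction of {1 / 55:.2%}")
```

With these invented inputs, halving a 0.5% risk gives an NNT of 400, while halving a 4% risk gives an NNT of 50. Same drug, same relative effect, very different bill.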
The Caveats, Because Biology Hates Clean Narratives
This is not a screening test you can order next Tuesday. The signature needs prospective testing, clinical thresholds, cost analysis, equity checks, and a hard look at false positives. A blood test that scares thousands of healthy people for every cancer prevented would be less precision medicine and more anxiety-as-a-service.
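The false-positive worry is mostly a base-rate problem, and it can be sketched with Bayes' rule. The sensitivity, specificity, and prevalence values below are hypothetical placeholders, not measured properties of the 14-protein signature.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a person who tests positive truly has the condition."""
    inputs = (("sensitivity", sensitivity), ("specificity", specificity), ("prevalence", prevalence))
    for name, value in inputs:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    if true_positive_rate + false_positive_rate == 0.0:
        raise ValueError("no positive tests are possible with these inputs")
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test performance applied to two hypothetical populations.
for label, prevalence in [("broad screening", 0.01), ("enriched high-risk group", 0.10)]:
    ppv = positive_predictive_value(sensitivity=0.8, specificity=0.9, prevalence=prevalence)
    print(f"{label}: {ppv:.1%} of positive results are true positives")
```

With these placeholder numbers, a test that is 80% sensitive and 90% specific has a positive predictive value of about 7% at 1% prevalence and about 47% at 10% prevalence. That gap is why clinical thresholds and the choice of who gets tested matter as much as the signature itself.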
Also, blocking inflammation is not free. IL-1β exists for reasons beyond annoying grant reviewers. Immune signaling helps fight infections and repair tissue, so prevention trials will need to prove that benefits outweigh harms in carefully selected groups.
Still, the idea is powerful: lung cancer prevention may work better if we stop asking only “Who has smoked enough?” and start asking “Whose lungs are currently in a tumor-promoting state?” That is a more biological question, and possibly a more humane one.
If this result holds up, the future may look less like waiting for cancer to become visible and more like catching the conditions that let it start. Not glamorous. Very useful. Like fixing the roof before the living room becomes an indoor water feature.
References
Pandya, T., Zagorulya, M., Leung, M. M., Augustine, M., et al. (2026). Plasma signals of lung tumor promotion for molecular cancer prevention. Cell. DOI: 10.1016/j.cell.2026.05.005. PMID: 42242224
Hill, W., Lim, E. L., Weeden, C. E., Lee, C., et al. (2023). Lung adenocarcinoma promotion by air pollutants. Nature, 616, 159-167. DOI: 10.1038/s41586-023-05874-3
Johnson, M. A., Nieves-Rodriguez, S., Hou, L., Huang, B. E., Saadatpour, A., & Doostparast Torshizi, A. (2026). Machine learning-based proteogenomic data modeling identifies circulating plasma biomarkers for early detection of lung cancer. Communications Medicine. DOI: 10.1038/s43856-026-01500-1
Ridker, P. M., MacFadyen, J. G., Thuren, T., Everett, B. M., et al. (2017). Effect of interleukin-1β inhibition with canakinumab on incident lung cancer in patients with atherosclerosis. The Lancet, 390, 1833-1842. DOI: 10.1016/S0140-6736(17)32247-X
Gadd, D. A., et al. (2024). Proteomic signatures improve risk prediction for common and rare diseases. Nature Medicine. DOI: 10.1038/s41591-024-03142-z
Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.