The Problem With Eyeballing Pre-Cancer

This is not a tumor detector. It's not a lung cancer screener. It's not another "AI reads X-rays" headline. And it definitely doesn't replace your pathologist.

What it is might be more interesting: a transformer-based system that looks at tissue slides and gene expression data from your airways and tries to figure out whether those weird-looking cells are on a slow march toward lung cancer - or just sitting there, minding their own business.

Deep inside your lungs, the bronchial lining sometimes starts... misbehaving. Cells go from normal to hyperplasia to metaplasia to dysplasia, shuffling through a spectrum of increasingly ominous-sounding stages like a corporate employee climbing a ladder nobody asked them to climb. At the end of that ladder? Lung squamous cell carcinoma.

The tricky part: pathologists looking at these premalignant lesions (PMLs) under a microscope often disagree about what they're seeing. One pathologist's "moderate dysplasia" is another's "eh, maybe metaplasia." The grading is subjective, varies by subspecialty training, and the whole spectrum is broad enough that reproducibility across institutions is, let's say, aspirational (Beane et al., 2019). It's like asking ten sommeliers to rate the same wine - you'll get ten different answers and eleven arguments.

Meanwhile, some of these lesions will progress to invasive cancer. Others will regress on their own, like a New Year's resolution that quietly fades by February. We currently lack reliable tools to tell the difference, which means patients either get over-monitored or under-treated.

Enter the Transformer (No, Not Optimus Prime)

Xu, Beane, Kolachalama, and colleagues built an attention-based deep learning framework that fuses two types of data: whole slide images (WSIs) of H&E-stained biopsies and gene expression (GE) profiles from the same tissue (Xu et al., 2026, DOI: 10.1186/s13073-026-01636-8). Think of it as giving the AI both the photograph and the ingredient list for the same dish.

The model uses a transformer architecture - the same family of algorithms behind large language models - but instead of predicting the next word, it's learning which patches of a tissue slide and which gene expression patterns matter most for distinguishing dysplasia-or-worse from the benign stuff (normal, hyperplasia, metaplasia).
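To make "learning which patches matter most" a little more concrete, here is a minimal sketch of attention-weighted pooling over patch embeddings. It is an illustration of the general idea, not the authors' architecture: the function names, the toy 2-dimensional embeddings, and the fixed query vector are all assumptions made up for this example (in a real model the embeddings and the query are learned).

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(patch_embeddings, query):
    """Score each patch against a query vector, then take a weighted average.

    Hypothetical helper: scaled dot-product scores, softmax weights.
    """
    d = len(query)
    scores = [
        sum(p_i * q_i for p_i, q_i in zip(patch, query)) / math.sqrt(d)
        for patch in patch_embeddings
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * patch[j] for w, patch in zip(weights, patch_embeddings))
        for j in range(d)
    ]
    return pooled, weights

# Toy slide with three patches; the second one "looks" most like the query.
patches = [[0.1, 0.0], [2.0, 1.5], [0.2, 0.1]]
query = [1.0, 1.0]
pooled, weights = attention_pool(patches, query)
print(weights)  # the largest weight lands on the second patch
```

The weights are what people usually mean when they say an attention model can show you where it was "looking": patches with larger weights contribute more to the slide-level summary.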

Here's where it gets clever: the system is flexible. It can work with slides alone, gene expression alone, or both together. This matters because in a real clinical setting, you might have a biopsy but no transcriptomic data, or vice versa. The model doesn't throw a tantrum when one input is missing - it just works with what it's got, like a chef who can still cook when the grocery delivery is incomplete.
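The "works with what it's got" behavior can be sketched in a few lines. This is a toy stand-in, assuming each modality has already been turned into a list of same-sized token vectors; the names `fuse` and `predict_probability`, the mean pooling, and the weights are all hypothetical. The actual model uses attention rather than a plain average, so treat this only as a picture of the input handling.

```python
import math

def fuse(wsi_tokens=None, ge_tokens=None):
    """Pool tokens from whichever modalities are present; skip missing ones."""
    tokens, used = [], []
    if wsi_tokens:
        tokens.extend(wsi_tokens)
        used.append("WSI")
    if ge_tokens:
        tokens.extend(ge_tokens)
        used.append("GE")
    if not tokens:
        raise ValueError("at least one modality is required")
    dim = len(tokens[0])
    # Mean pooling is a simplification standing in for attention here.
    fused = [sum(t[j] for t in tokens) / len(tokens) for j in range(dim)]
    return fused, used

def predict_probability(fused, weights, bias=0.0):
    """Logistic output: probability of dysplasia-or-worse (toy weights)."""
    logit = sum(f * w for f, w in zip(fused, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

wsi = [[0.2, 0.4], [0.6, 0.8]]   # two made-up slide tokens
ge = [[1.0, 0.0]]                # one made-up gene expression token
toy_weights = [1.5, -0.5]        # illustrative, not learned

fused_both, used_both = fuse(wsi_tokens=wsi, ge_tokens=ge)
fused_wsi, used_wsi = fuse(wsi_tokens=wsi)
p_both = predict_probability(fused_both, toy_weights)
p_wsi = predict_probability(fused_wsi, toy_weights)
print(used_both, p_both)
print(used_wsi, p_wsi)
```

The point of the design is that the same prediction head runs regardless of which inputs showed up, so a missing modality changes what goes in, not whether an answer comes out.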

The Numbers (For the Skeptics in the Back)

Trained across four separate studies - that's important, because single-dataset models are the AI equivalent of someone who's only ever eaten at one restaurant - the flexible fusion model hit an AUROC (area under the ROC curve, where 0.5 is a coin flip and 1.0 is perfect) of 0.809 on external whole slide images and 0.903 on external gene expression data. For context, combining both data types consistently outperformed either alone: WSI+GE scored 0.761 versus 0.690 for WSI-only on external pathology images.
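If AUROC feels abstract, it has a friendly interpretation: pick one true dysplasia-or-worse case and one benign case at random, and AUROC is the chance the model scores the first one higher. The sketch below computes it exactly that way. The labels and scores are invented for illustration and have nothing to do with the paper's data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Fine for small examples; large datasets
    would use a rank-based formula instead of comparing every pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = dysplasia-or-worse, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
result = auroc(labels, scores)
print(round(result, 3))  # 8 of 9 pairs ranked correctly -> 0.889
```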

The real kicker? Despite being trained on a simple binary label (dysplasia-or-worse vs. not), the model's probability scores actually tracked with histologic grade. Higher probability meant higher-grade lesion. It essentially learned to see the spectrum without being explicitly taught it existed - like a kid who figures out the spiciness scale after only being told "hot" and "not hot."
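One common way to check whether a score "tracks with grade" is a rank correlation such as Spearman's rho. The sketch below is a generic version of that check, not the analysis from the paper: the grade encoding (0 = normal through 3 = dysplasia) and the probabilities are invented for illustration.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical encoding: 0 normal, 1 hyperplasia, 2 metaplasia, 3 dysplasia.
grades = [0, 0, 1, 1, 2, 2, 3, 3]
probs = [0.05, 0.10, 0.20, 0.15, 0.40, 0.55, 0.80, 0.90]
rho = spearman(grades, probs)
print(round(rho, 3))  # close to 1 means scores rise with grade
```

A rho near 1 would mean higher-grade lesions reliably get higher scores, even though the model only ever saw a two-way label during training.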

The model also identified specific gene expression alterations tied to bronchial dysplasia across multiple independent datasets, pointing to biologically meaningful patterns rather than just statistical noise.

Why This Matters Beyond the Lab

Lung cancer remains the leading cause of cancer death worldwide, and squamous cell carcinoma accounts for roughly 25-30% of lung cancer cases. Catching it at the premalignant stage is the dream, but only if you can reliably identify which lesions actually need intervention. Previous work using graph-based neural networks showed promise in stratifying PMLs from histopathology alone (Gindra et al., 2024, DOI: 10.1016/j.ajpath.2024.04.001), and the broader field of multimodal fusion - combining pathology images with genomic data - has been gaining momentum across oncology (Chen et al., 2022, DOI: 10.1109/TMI.2020.3021387).

What sets this work apart is the practical flexibility. Multimodal fusion sounds great in a review paper, but it's often impractical because you rarely have all data types for every patient. Building a model that degrades gracefully when data is missing? That's the kind of engineering decision that separates research prototypes from something a hospital might actually use.

If you're the type who likes mapping out how complex systems connect - how tissue morphology links to gene expression links to disease progression - tools like mapb2.io can help visualize those kinds of multi-layered relationships outside the lab, too.

The Honest Caveats

This isn't ready for your doctor's office tomorrow. The AUROCs are solid but not perfect, and external validation across even more diverse populations will be needed. The gene expression component requires transcriptomic profiling, which isn't exactly a routine clinical test (yet). And as with all attention-based models, interpretability remains a work in progress - the model can tell you what it's paying attention to, but the "why" still requires human pathologists to weigh in.

Still, mapping premalignant lesions onto a continuous disease spectrum using flexible, multi-modal AI? That's a step toward catching lung cancer before it becomes lung cancer. And given that early detection is still our best weapon, that quiet step might matter more than any headline-grabbing breakthrough.

References

Xu, L., Kefella, Y., Zhang, Y., Conrad, R.D., Anderson, K.E., Krysan, K., Liu, G., Kane, E., Pennycuick, A., Merrick, D.T., Janes, S.M., Reid, M.E., Burks, E.J., Billatos, E., Mazzilli, S.A., Kolachalama, V.B., & Beane, J.E. (2026). Attention-based deep learning for analysis of pathology images and gene expression data in lung squamous premalignant lesions. Genome Medicine. DOI: 10.1186/s13073-026-01636-8 | PubMed: 41952176
Gindra, R.H., Zheng, Y., Beane, J.E., et al. (2024). Graph Perceiver Network for Lung Tumor and Bronchial Premalignant Lesion Stratification from Histopathology. The American Journal of Pathology, 194(7), 1251-1263. DOI: 10.1016/j.ajpath.2024.04.001 | PMC11220922
Beane, J.E., Mazzilli, S.A., et al. (2019). Molecular subtyping reveals immune alterations associated with progression of bronchial premalignant lesions. Nature Communications, 10, 1856. DOI: 10.1038/s41467-019-09834-2
Chen, R.J., Lu, M.Y., Williamson, D.F.K., et al. (2022). Pathomic Fusion: An Integrated Framework for Fusing Histopathology and Genomic Features for Cancer Diagnosis and Prognosis. IEEE Transactions on Medical Imaging, 41(4), 757-770. DOI: 10.1109/TMI.2020.3021387 | PMC10339462
Davri, A., Birbas, E., Kanavos, T., et al. (2023). Deep Learning for Lung Cancer Diagnosis, Prognosis and Prediction Using Histological and Cytological Images: A Systematic Review. Cancers, 15(15), 3981. DOI: 10.3390/cancers15153981 | PMC10417369

Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.

AIb2.io - AI Research Decoded
