AIb2.io - AI Research Decoded

When the AI Finally Watched the Previous Game Tape

I’ll admit it: the part that threw me at first was almost embarrassingly simple. This paper asks whether an AI reading 3D mammograms gets better if you also hand it the patient’s earlier exams, and my first reaction was, well, obviously yes, that’s like giving the quarterback last season’s footage. Then the data jogged onto the field and said, not so fast.

The Matchup: AI vs the Double-Reader Defense

The paper looks at digital breast tomosynthesis (DBT), the 3D-style mammography method that reconstructs the breast into thin image slices so radiologists are not stuck squinting through overlapping tissue like they are trying to spot a defender in a pileup. In BreastScreen Norway, these exams are typically read by two radiologists independently, which is the gold-standard team sport this AI had to measure up against [1].

The researchers tested Transpara v2.1 on 30,724 DBT exams from women screened in 2018-2019. For 24,315 women, the AI could also look at prior screening exams. The big question: does historical context help the model call the play better?

Short answer: the AI was strong either way.

Radiologists had a sensitivity of 86.0%. The AI reached an AUC of 0.93 whether prior exams were included or not, which is a very good score for separating likely cancer from not-cancer. If AUC sounds like stats homework trying to ruin your evening, think of it as the probability that the model gives a randomly picked cancer case a higher score than a randomly picked non-cancer case. Closer to 1.0 is better. Closer to 0.5 is the machine equivalent of flipping a coin and then acting confident about it [1].
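To make that ranking idea concrete, here is a minimal sketch that computes AUC the slow, literal way: line up every cancer exam against every non-cancer exam and count how often the cancer exam gets the higher score. The scores below are made up for illustration and the function name `pairwise_auc` is hypothetical; none of this comes from the paper or from the AI vendor's software.

```python
from itertools import product

def pairwise_auc(cancer_scores, normal_scores):
    """Share of (cancer, normal) pairs where the cancer exam scores higher.

    Ties count as half a win, which matches the usual rank-based AUC.
    """
    wins = 0.0
    for cancer, normal in product(cancer_scores, normal_scores):
        if cancer > normal:
            wins += 1.0
        elif cancer == normal:
            wins += 0.5
    return wins / (len(cancer_scores) * len(normal_scores))

# ASSUMPTION: invented scores on a 1-10 scale, not data from the study.
cancer = [10, 9, 10, 7]
normal = [1, 3, 2, 8, 5, 1]

auc = pairwise_auc(cancer, normal)
print(f"AUC = {auc:.3f}")
```

In this toy lineup, the only matchup the model loses is the cancer exam scored 7 against the normal exam scored 8, so 23 of the 24 pairs go the right way.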

Plot Twist: The Prior Exams Helped, But Not by Much

Here’s the buzzer-beater twist. Prior exams did not meaningfully boost the AI’s overall discrimination. The AUC stayed at 0.93 with or without them. What prior exams did add was a slight bump in specificity, meaning the model got a little better at avoiding false alarms [1].

That matters because screening is full of ugly tradeoffs. Miss a cancer and that is a serious failure. Call too many harmless cases suspicious and you drag a lot of people into extra testing, stress, and callbacks. In other words, this is not a video game leaderboard. It is medicine, where every bad pass has a human on the receiving end.

One especially interesting stat: among exams the AI scored a 10 out of 10, the positive predictive value was 10.2%, meaning roughly one in ten of those top-scored exams turned out to be cancer. That sounds low until you remember screening populations contain far more normal exams than cancer cases. In this setting, a model that can pile suspicious cases into a small high-risk bucket is doing useful triage work, even if that bucket still includes many false positives [1].
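If a PPV around 10% still feels like a fumble, the arithmetic behind it is worth a quick look. The sketch below applies Bayes' rule to show how a rare disease drags PPV down even when a test is quite accurate. The prevalence, sensitivity, and specificity values are assumptions picked for illustration; they are not the paper's figures and they are not how the study calculated its 10.2%.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a flagged exam is truly cancer, via Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)

# ASSUMPTION: illustrative values only, not taken from the study.
PREVALENCE = 0.007   # 7 cancers per 1,000 screened women (hypothetical)
SENSITIVITY = 0.80   # share of cancers that get flagged (hypothetical)
SPECIFICITY = 0.95   # share of normal exams left alone (hypothetical)

ppv = positive_predictive_value(PREVALENCE, SENSITIVITY, SPECIFICITY)
print(f"PPV = {ppv:.1%}")
```

With these invented inputs the flagged group is still about 90% false alarms, simply because normal exams outnumber cancers by more than a hundred to one.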

Why People in Radiology Are Watching This Like a Playoff Series

This paper lands in a crowded, very active research lane. A 2023 meta-analysis found AI for screening mammography and DBT often showed higher sensitivity but lower specificity than radiologists, while also noting the DBT evidence base was still relatively thin [2]. A 2024 BreastScreen Norway study showed strong AI performance on standard screening mammograms in a real national program, which helps explain why Norway keeps showing up on this scoreboard [3]. Another 2024 reader study found AI assistance improved radiologist AUC, nudged sensitivity upward, and cut reading time from about 54.4 seconds to 48.5 seconds per exam [4]. In radiology, saving about six seconds per case does not sound sexy, but across huge screening volumes it is the kind of small edge coaches get fired for ignoring.
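For a sense of scale, here is a back-of-the-envelope sketch of what that per-exam difference adds up to. The two reading times are the ones quoted from the reader study [4]; the exam volume is a made-up number, and the calculation assumes the saving would hold in routine practice, which a reader study alone cannot promise.

```python
# Reading times per exam as quoted from the reader study [4].
SECONDS_WITHOUT_AI = 54.4
SECONDS_WITH_AI = 48.5

def hours_saved(num_exams):
    """Total reading hours saved if the per-exam difference held for every exam."""
    saved_per_exam = SECONDS_WITHOUT_AI - SECONDS_WITH_AI
    return saved_per_exam * num_exams / 3600

# ASSUMPTION: 100,000 exams is a hypothetical volume, not from any cited paper.
example_hours = hours_saved(100_000)
print(f"{example_hours:.0f} reader-hours saved")
```

At that hypothetical volume the difference works out to roughly 164 hours of reading time per reader pass.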

There is also growing infrastructure around this space. A 2023 DBTex challenge paper published a benchmark, code, and dataset for lesion detection in DBT, which is a big deal because medical AI loves to promise championships before anyone agrees on the scoreboard [5]. And a 2024 evidence-based review reported six FDA-cleared AI tools for DBT as of July 2024, so this is no longer lab-only fantasy football [6].

What This Paper Actually Changes

The main takeaway is not “AI crushes radiologists.” The paper is more practical than that. It suggests a well-trained AI system can perform about as well as double reading by radiologists in this DBT screening setting, and it does not seem to depend heavily on prior exams to get there [1].

That is useful because prior images are not always available, cleanly linked, or easy to process across sites. If the model still plays solid ball without them, deployment gets simpler. If priors add a little specificity, great, take the extra yardage. But the offense still moves without them.

The bigger challenge now is not whether AI can post a pretty AUC in a retrospective study. It is whether health systems can use these tools prospectively, safely, and fairly, while tracking interval cancers, false positives, workflow changes, and who benefits most. That is the real championship game.

References

  1. Moshina N, Larsen M, Holen ÅS, et al. Artificial Intelligence for Digital Breast Tomosynthesis Screening with and without Prior Examinations in BreastScreen Norway. Radiology: Artificial Intelligence. Preproduction, 2026. doi:https://doi.org/10.1148/ryai.250988. PubMed: https://pubmed.ncbi.nlm.nih.gov/42126307/

  2. Verburg E, et al. Standalone AI for Breast Cancer Detection at Screening Digital Mammography and Digital Breast Tomosynthesis: A Systematic Review and Meta-Analysis. Radiology. 2023;308(2):e222639. doi:https://doi.org/10.1148/radiol.222639. PubMed: https://pubmed.ncbi.nlm.nih.gov/37219445/

  3. Larsen M, et al. Performance of an Artificial Intelligence System for Breast Cancer Detection on Screening Mammograms from BreastScreen Norway. Radiology: Artificial Intelligence. 2024;6(3):e230375. doi:https://doi.org/10.1148/ryai.230375. PubMed: https://pubmed.ncbi.nlm.nih.gov/38597784/

  4. Kim SY, et al. Impact of AI for Digital Breast Tomosynthesis on Breast Cancer Detection and Interpretation Time. Radiology: Artificial Intelligence. 2024;6(3):e230318. doi:https://doi.org/10.1148/ryai.230318. PubMed: https://pubmed.ncbi.nlm.nih.gov/38568095/

  5. Samala RK, et al. A Competition, Benchmark, Code, and Data for Using Artificial Intelligence to Detect Lesions in Digital Breast Tomosynthesis. JAMA Network Open. 2023;6(2):e230524. doi:https://doi.org/10.1001/jamanetworkopen.2023.0524

  6. Lamb LR, Lehman CD, Do S, et al. Artificial Intelligence (AI)-Based Computer-Assisted Detection and Diagnosis for Mammography: An Evidence-Based Review of FDA-Cleared Tools for Screening Digital Breast Tomosynthesis (DBT). AI in Precision Oncology. 2024. doi:https://doi.org/10.1089/aipo.2024.0022. PubMed: https://pubmed.ncbi.nlm.nih.gov/40182614/

Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.