AIb2.io - AI Research Decoded

Biology Has Been Fighting This Boss Battle Since 1977

Back in 1977, scientists realized genes were not the neat, uninterrupted instruction manuals everyone hoped for. They came in pieces. By 1980, it was clear cells could remix those pieces through alternative splicing, making multiple RNA messages from one gene [1-4]. Since then, researchers have thrown a small army of methods at the problem: RT-PCR, microarrays, short-read RNA-seq, statistical inference, and enough computational guesswork to make a weather app blush. Helpful, yes. Fully satisfying, absolutely not.

That is the setup for "Tools and tactics for studying alternative splicing," a 2026 review by Rui Sousa-Luís and Maria Carmo-Fonseca in Nature Reviews Genetics [1]. The paper is basically a status report from a field that spent decades staring at biology through a keyhole and has suddenly found the light switch.

One Gene, Many Endings, Mild Panic

Alternative splicing is what lets one gene produce different RNA isoforms by stitching exons together in different combinations. Same ingredients, different recipe, occasionally different fate. It is one reason humans can get so much mileage out of roughly 20,000 protein-coding genes. On one hand, that is elegant. On the other hand, it means a tiny splicing mistake can help drive disease, from rare disorders to cancer [1,5].
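To make the "same ingredients, different recipe" idea concrete, here is a minimal Python sketch. It assumes a made-up gene with two always-included exons (E1 and E5) and three optional "cassette" exons (E2 to E4), and it assumes every include-or-skip choice is independent. Real genes are messier: many combinations never occur, and splicing also involves alternative splice sites and retained introns that this toy ignores.

```python
from itertools import combinations

def enumerate_isoforms(first_exon, cassette_exons, last_exon):
    """Return every exon chain made by including or skipping each cassette exon.

    Hypothetical toy model: exon names are placeholders, and all choices
    are treated as independent, which real genes do not guarantee.
    """
    isoforms = []
    for k in range(len(cassette_exons) + 1):
        # combinations() keeps the exons in their original genomic order
        for kept in combinations(cassette_exons, k):
            isoforms.append([first_exon, *kept, last_exon])
    return isoforms

isoforms = enumerate_isoforms("E1", ["E2", "E3", "E4"], "E5")
for chain in isoforms:
    print("-".join(chain))
print(len(isoforms))  # 2**3 = 8 possible isoforms
```

Three optional exons already give eight possible messages. The count doubles with every additional independent choice, which is a rough intuition for how a modest gene count can still yield a very large transcript catalog.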

The problem was never that scientists lacked ambition. The problem was that the transcriptome is messy in a way that feels personally disrespectful. Short-read sequencing gave us lots of little fragments, but reconstructing full isoforms from those fragments is like trying to infer an entire Netflix series from 14 random screenshots and one spoiler-filled group chat. You get the vibe. You do not get the whole plot.

Long Reads: Finally Reading the Whole Email Thread

The big technical leap here is long-read sequencing. Instead of chopping RNA into many short pieces and reconstructing the transcript afterward, long-read methods can capture much longer stretches, often the full isoform. That means researchers can directly see which exons travel together, in which cells, and sometimes where in tissue they show up [1,6]. The ENCODE4 long-read collection helped map transcript diversity at scale, and newer work continues to show how much isoform variation we were missing, especially in complex tissues like the brain [6,7].
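Why does seeing "which exons travel together" matter so much? The sketch below is one idealized way to show it, not a real analysis pipeline. It assumes a hypothetical five-exon gene, error-free sequencing, and short reads that each cover at most one exon-exon junction. Under those assumptions, two different isoform mixtures produce identical junction counts, while full-length reads tell them apart immediately.

```python
from collections import Counter

def junctions(isoform):
    """List the exon-exon junctions in one isoform (a tuple of exon names)."""
    return list(zip(isoform, isoform[1:]))

def short_read_view(sample):
    """Junction counts only: what reads spanning a single junction can report."""
    counts = Counter()
    for isoform, copies in sample.items():
        for junction in junctions(isoform):
            counts[junction] += copies
    return counts

def long_read_view(sample):
    """Full exon chains: what one read per whole transcript can report."""
    return Counter(sample)

# Two hypothetical samples: the same exons are used, but paired differently.
sample_a = {("E1", "E2", "E3", "E4", "E5"): 10, ("E1", "E3", "E5"): 10}
sample_b = {("E1", "E2", "E3", "E5"): 10, ("E1", "E3", "E4", "E5"): 10}

print(short_read_view(sample_a) == short_read_view(sample_b))  # True
print(long_read_view(sample_a) == long_read_view(sample_b))    # False
```

In sample A, exons E2 and E4 always appear together. In sample B, they never do. Junction-level data cannot distinguish the two cases, which is the kind of ambiguity that statistical isoform reconstruction has to guess its way through.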

This matters because splicing is not some decorative flourish on top of gene expression. It changes what proteins get made, whether RNAs are degraded, and how cells specialize. If gene expression tells you which songs are on the playlist, splicing tells you which remix the cell actually plays. Sometimes it is the radio edit. Sometimes it is the 11-minute techno version nobody asked for.

CRISPR Enters With a Crowbar

Seeing isoforms is one thing. Proving they matter is another. That is where CRISPR-based assays come in. The review highlights how genome editing can now test the function of specific splice isoforms directly, rather than just waving at correlations and hoping reviewers feel generous [1]. Related work on programmable splicing manipulation shows how antisense oligonucleotides, base editing, and CRISPR-linked systems are turning splicing from something we observe into something we can perturb on purpose [8].

This is the part where the wonder-dread meter starts flickering. On one hand, personalized therapies that correct harmful splicing patterns sound incredible. On the other hand, we are now editing the already-chaotic layer of biology that decides which message gets sent in the first place. The cell was barely keeping its paperwork straight before we showed up with gene-editing tools and a can-do attitude.

AI Tries to Decode the Splicing Weirdness

The AI angle is real, and the paper does not oversell it. Deep learning models such as SpliceAI and newer transformer-based approaches are getting better at predicting splice sites and variant effects from sequence context alone [1,9,10]. That is useful because splicing depends on signals spread across long stretches of sequence, plus regulatory motifs, plus cell context, plus biology’s ongoing commitment to being complicated.
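A quick way to see why sequence context matters: most introns begin with the dinucleotide GT and end with AG, but those two-letter signals turn up constantly by chance. The toy scan below is emphatically not how SpliceAI or any transformer model works. It is a deliberately naive baseline, run on an invented sequence, that shows how little the dinucleotides alone narrow things down.

```python
def candidate_sites(seq):
    """Return positions of GT (possible donor) and AG (possible acceptor) dinucleotides."""
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors

# Invented 36-nucleotide sequence, used only for illustration.
toy = "ATGGTAAGTCTGTTAGGTCAGATAGGTACCAGGTGA"
donors, acceptors = candidate_sites(toy)

# Every donor followed by a later acceptor is a naive candidate intron.
candidate_introns = [(d, a) for d in donors for a in acceptors if a > d + 1]

print("possible donors:", donors)
print("possible acceptors:", acceptors)
print("naive candidate introns:", len(candidate_introns))
```

Even in a few dozen letters there are several competing donors and acceptors, and far more ways to pair them. Choosing the real sites requires the surrounding context, which is the job the learned models take on.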

Recent work argues that the field is moving from black-box prediction toward models that connect sequence patterns to actual mechanisms [5,10]. In plain English, the goal is not just "the model guessed right," but "the model guessed right for reasons a molecular biologist would not throw across the room."

If that works, the payoff is huge. Better interpretation of disease variants. Better diagnostics. Better target selection for therapies. And maybe fewer situations where a clinician gets a genome report that basically says, "Something weird is happening near a splice site. Best of luck."

Why This Paper Sticks

What makes this review worth your time is that it captures a real transition. Alternative splicing research is moving from indirect measurement to isoform-resolved observation, from association to perturbation, and from hand-built rules to AI-assisted decoding [1]. It is not a clean victory lap. Long-read methods still have cost, throughput, and analysis challenges. Functional validation remains hard. Models still miss context. But the field finally has tools that can ask better questions instead of just making prettier guesses.

Which is comforting and unsettling in equal measure. We are getting better at reading the cell’s internal editing process. That could help explain disease with much more precision. It also means biology keeps revealing that the tidy textbook version was the demo mode.

References

  1. Sousa-Luís R, Carmo-Fonseca M. Tools and tactics for studying alternative splicing. Nature Reviews Genetics. 2026. DOI: https://doi.org/10.1038/s41576-026-00952-4
  2. Chow LT, Gelinas RE, Broker TR, Roberts RJ. An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell. 1977. DOI: https://doi.org/10.1016/0092-8674(77)90180-5
  3. Berget SM, Moore C, Sharp PA. Spliced segments at the 5′ terminus of adenovirus 2 late mRNA. PNAS. 1977. DOI: https://doi.org/10.1073/pnas.74.8.3171
  4. Early P, Rogers J, Davis M, et al. Two mRNAs can be produced from a single immunoglobulin mu gene by alternative RNA processing pathways. Cell. 1980. DOI: https://doi.org/10.1016/0092-8674(80)90374-9
  5. Rowlands A, Baralle D, Gazzara MR. From computational models of the splicing code to regulatory mechanisms and therapeutic implications. Nature Reviews Genetics. 2024. DOI: https://doi.org/10.1038/s41576-024-00774-2
  6. Reese F, et al. The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity. Nature. 2023. PubMed: https://pubmed.ncbi.nlm.nih.gov/37292896/
  7. Gandal MJ, et al. Developmental isoform diversity in the human neocortex. Science. 2024;384(6698):eadh7688. DOI: https://doi.org/10.1126/science.adh7688
  8. Adamson SI, et al. Strategies for programmable manipulation of alternative splicing. Current Opinion in Genetics & Development. 2024. PubMed: https://pubmed.ncbi.nlm.nih.gov/39471777/
  9. Nguyen TH, et al. Transformers significantly improve splice site prediction. Communications Biology. 2024. DOI: https://doi.org/10.1038/s42003-024-07298-9
  10. Gupta A, et al. Improved modeling of RNA-binding protein motifs in an interpretable neural model of RNA splicing. Genome Biology. 2024. DOI: https://doi.org/10.1186/s13059-023-03162-x
