AIb2.io - AI Research Decoded

RNA, But Make It a Product Roadmap

"Today, generative artificial intelligence (AI) models offer powerful tools for designing RNA sequences."
Sure. And behind that tidy sentence is thirty years of math, biology, and enough probabilistic bookkeeping to make your startup CFO cry.

Marsico's 2026 Nature Reviews Genetics Journal Club is basically a reminder that RNA design did not begin when someone glued "foundation model" onto a slide deck and raised money at a heroic valuation [1]. The real plot starts in 1994, when Sean Eddy and Richard Durbin formalized covariance models for RNA sequence analysis [2]. That sounds niche until you realize it was one of the first serious attempts to treat RNA as a structure-first object rather than a floppy string of letters.

That distinction matters. RNA is not just genetic middle management shuttling messages from DNA to proteins. It folds. It loops. It base-pairs with itself. It acts more like molecular origami with a chemistry degree. If you want to design useful RNA - for sensors, switches, aptamers, therapeutics, maybe future mRNA payloads with better behavior - you need sequences that land in the right shape, not just sequences that look plausible on paper.

Covariance models were good at exactly that kind of thinking. They captured how paired positions co-evolve, which is a polite way of saying they noticed when one RNA letter changed and its partner had to change too, like two coworkers covering for each other in a badly run startup. That gave researchers a way to search for structural relatives, not just sequence cousins.
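
To make the "coworkers covering for each other" idea concrete, here is a minimal sketch of the covariation signal itself: the mutual information between two columns of a sequence alignment. This is not the Eddy and Durbin covariance model, which is a full probabilistic model of sequence plus structure; it is only a simple statistic that shows why paired positions stand out. The function name and the toy alignment are made up for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    High values mean that knowing the base in column i tells you a lot
    about the base in column j, which is what compensatory changes in a
    base pair look like.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Toy alignment (hypothetical data): columns 0 and 4 always form a
# Watson-Crick pair, column 2 varies independently of column 0.
toy_alignment = [
    "GAAAC",
    "GAUAC",
    "CAAAG",
    "CAUAG",
    "AAAAU",
    "AAUAU",
    "UAAAA",
    "UAUAA",
]

print(column_mutual_information(toy_alignment, 0, 4))  # 2.0 bits: perfectly coupled
print(column_mutual_information(toy_alignment, 0, 2))  # 0.0 bits: independent
```

Neither column 0 nor column 4 is conserved in this toy alignment, so a plain sequence profile would see two noisy positions. The pairing only shows up when you look at the two columns together.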

The New Moat: Search the Giant Mess Better

Fast-forward to now, and the RNA design stack looks much more like modern AI. Deep generative models can learn families of RNA sequences, propose new members, and optimize for structure or function. Sumi and colleagues showed this nicely in 2024 with RfamGen, a variational autoencoder that generated RNA family sequences while respecting family-level constraints [3]. The pitch is obvious: instead of handcrafting candidates one by one, train a model on the sequence universe and let it explore the neighborhood at scale.

Other teams pushed on different parts of the flywheel. SamplingDesign reframed RNA design as continuous optimization over distributions of sequences rather than local one-mutation-at-a-time search, which is useful because brute-force sequence search gets ugly fast as RNA length grows [4]. In plain English: stop wandering the maze with a flashlight and start shaping the probability landscape.
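
A rough sketch of what "optimizing a distribution instead of a sequence" can look like, with loud caveats: this is not the SamplingDesign algorithm. It assumes independent per-position base probabilities (no coupled variables), a plain score-function gradient estimate, and a toy objective that only counts how many target pairs could base-pair, where a real method would use a folding-based objective. Every name and parameter here is hypothetical.

```python
import math
import random

BASES = "ACGU"
# Watson-Crick pairs plus the G-U wobble pair.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(dot_bracket):
    """Return (i, j) index pairs from a dot-bracket structure string."""
    stack, pairs = [], []
    for idx, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            pairs.append((stack.pop(), idx))
    return pairs


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(seq, pairs):
    """Fraction of target pairs whose bases can pair (toy stand-in objective)."""
    return sum((seq[i], seq[j]) in ALLOWED for i, j in pairs) / len(pairs)


def expected_score(logits, pairs):
    """Exact expected toy score under the per-position distribution."""
    probs = [softmax(row) for row in logits]
    total = 0.0
    for i, j in pairs:
        total += sum(
            probs[i][a] * probs[j][b]
            for a in range(4)
            for b in range(4)
            if (BASES[a], BASES[b]) in ALLOWED
        )
    return total / len(pairs)


def optimize(target, steps=300, samples=50, lr=2.0, seed=0):
    """Gradient ascent on per-position logits using sampled sequences."""
    rng = random.Random(seed)
    pairs = parse_pairs(target)
    logits = [[0.0] * 4 for _ in target]  # start from the uniform distribution
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        batch = [
            [rng.choices(range(4), weights=p)[0] for p in probs]
            for _ in range(samples)
        ]
        scores = [toy_score([BASES[b] for b in seq], pairs) for seq in batch]
        baseline = sum(scores) / samples  # variance reduction
        for seq, score in zip(batch, scores):
            advantage = (score - baseline) / samples
            for pos, base in enumerate(seq):
                for b in range(4):
                    # Gradient of log-probability with respect to the logits.
                    grad = (1.0 if b == base else 0.0) - probs[pos][b]
                    logits[pos][b] += lr * advantage * grad
    return logits, pairs


target = "((((...))))"
logits, pairs = optimize(target)
before = expected_score([[0.0] * 4 for _ in target], pairs)
after = expected_score(logits, pairs)
print(f"expected toy score: {before:.3f} -> {after:.3f}")
```

The thing being updated is a table of probabilities, not a single candidate. Sequences are only sampled to estimate which direction to nudge that table, which is the flashlight-versus-landscape distinction in miniature.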

Meanwhile, structure prediction got a lot better, which quietly changes the TAM for design. RhoFold+ in Nature Methods showed strong RNA 3D prediction performance at useful speed [5]. And Wong et al. used structural prediction directly for deep generative design of RNA aptamers, connecting model-generated sequences to predicted 3D shapes and then to experimental validation [6]. That is the part where the PDF stops being just vibes and starts looking like an actual product pipeline.

There is also a broader trend here: RNA is getting the language-model treatment. Foundation-style models such as LAMAR aim to learn reusable representations across RNA regulation tasks, not just one narrow benchmark [7]. The promise is not "the model understands biology" - please, let's keep everyone's feet on the floor - but that large-scale pretraining may capture statistical regularities humans would never enumerate by hand.

Why This Is Actually Interesting

The core challenge in RNA design is that the search space is absurd. For an RNA of length n, you are staring at 4^n possible sequences; at n = 100, that is about 1.6 x 10^60. That is not a haystack. That is a haystack factory with a venture arm.
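
The arithmetic is easy to check. The sketch below counts all sequences of a given length, and also the subset that could in principle form a given target structure, assuming each paired position accepts the six standard pairings (A-U, U-A, G-C, C-G, G-U, U-G) and each unpaired position accepts any base. The function names and the example structure are illustrative, not from the paper.

```python
def sequence_space_size(n):
    """Number of RNA sequences of length n over the alphabet A, C, G, U."""
    return 4 ** n


def structure_compatible_count(dot_bracket):
    """Number of sequences whose bases could form every pair in the structure.

    Assumes six allowed pairings per base pair (Watson-Crick plus G-U wobble)
    and four choices per unpaired position.
    """
    unpaired = dot_bracket.count(".")
    pairs = dot_bracket.count("(")
    return 4 ** unpaired * 6 ** pairs


for n in (20, 50, 100):
    print(n, f"{sequence_space_size(n):.3e}")

hairpin = "((((...))))"
total = sequence_space_size(len(hairpin))
compatible = structure_compatible_count(hairpin)
print(compatible, "of", total)  # 82944 of 4194304
```

Compatibility is only a necessary condition: a sequence that could form the target pairs may still prefer to fold into something else. And even this filtered set grows exponentially with length, which is why enumeration stops being an option almost immediately.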

So the field has always needed compression tricks: thermodynamics, grammars, evolutionary signals, and now neural generative models. Marsico's point is that the current AI moment did not replace the old toolbox - it was built on top of that toolbox [1]. Covariance models, stochastic grammars, and structural priors were not quaint prehistory. They were the seed round.

That matters for real-world impact. Better RNA design could mean more reliable riboswitches, improved aptamers, smarter regulatory elements, and better-engineered RNA therapeutics. It could also shorten the loop between hypothesis and experiment, which is where a lot of biotech time and money currently go to die. But there is still a large catch, because biology always keeps one in reserve.

The Part Where Biology Lawyer-Reads the Term Sheet

Generative models can produce candidate sequences. They cannot guarantee those sequences fold the same way in messy cellular conditions, survive degradation, avoid off-target effects, or scale into therapies without experimental pain. Reviews of RNA structure prediction and generative molecular design both make the same sober point: better models help, but data scarcity, benchmark quality, interpretability, and wet-lab validation remain hard constraints [8,9].

So no, RNA design is not "solved." It is more like the category just got better infrastructure. Which, to be fair, is how many real businesses get built.

Marsico's essay lands because it cuts through the amnesia of AI hype. The shiny transformer layer is real. But the moat underneath it was dug by decades of people teaching computers that RNA is a structured object with rules, not just alphabet soup in a lab coat. That is less flashy than a demo video. It is also why the field might actually work.

References

  1. Marsico A. RNA design across eras: from covariance models to modern generative AI. Nature Reviews Genetics. 2026. DOI: https://doi.org/10.1038/s41576-026-00967-x
  2. Eddy SR, Durbin R. RNA sequence analysis using covariance models. Nucleic Acids Research. 1994;22(11):2079-2088. DOI: https://doi.org/10.1093/nar/22.11.2079 PMCID: https://pmc.ncbi.nlm.nih.gov/articles/PMC308568/
  3. Sumi S, et al. Deep generative design of RNA family sequences. Nature Methods. 2024;21:435-443. DOI: https://doi.org/10.1038/s41592-023-02148-8
  4. SamplingDesign: RNA design via continuous optimization with coupled variables and Monte-Carlo sampling. Nature Communications. 2025. DOI: https://doi.org/10.1038/s41467-025-67901-3
  5. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nature Methods. 2024;21:2287-2298. DOI: https://doi.org/10.1038/s41592-024-02487-0
  6. Wong F, He D, et al. Deep generative design of RNA aptamers using structural predictions. Nature Computational Science. 2024. DOI: https://doi.org/10.1038/s43588-024-00720-6
  7. A foundation language model to decipher diverse regulation of RNAs. Genome Biology. 2025;26:301. DOI: https://doi.org/10.1186/s13059-025-03752-x
  8. Deep dive into RNA: a systematic literature review on RNA structure prediction using machine learning methods. Artificial Intelligence Review. 2024. DOI: https://doi.org/10.1007/s10462-024-10910-3
  9. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Briefings in Bioinformatics. 2024;25(4):bbae338. DOI: https://doi.org/10.1093/bib/bbae338

Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.