AI RNA Design: Generating Functional Molecules with Machine Learning!

AI models create functional RNA that actually works

AI-based protein design is one of my favorite topics. But I do have a soft spot for nucleic acids in my heart. The authors of today’s paper developed an AI model to design new functional RNA molecules, from CRISPR guides to toehold switches!

Do you want to support Plenty of Room? Share this issue!

📰 Science 🤝 Newsletters: As you might have noticed, I like newsletters! And I think there is a lot of space for thoughtful content creators helping scientists stay updated in their field. So I was excited when I found Organic Synthesis! Every week, Steven Bennet delivers short summaries of the latest literature in synthetic organic chemistry. It’s definitely worth it: I love how he structures his content!

Designing RNA with AI

AI-based RNA design could create new drugs, sensors and synthetic systems. Image credits: Synbiobeta.

RNA: The Versatile Companion to DNA and Proteins

This is another issue in my effort to write more about RNA!

RNA molecules are extremely versatile. They are no longer seen as just passive messengers of genetic information. Instead, their uses now range from therapeutics and gene regulation to nanostructure creation.

RNA is central to modern biotechnology thanks to its capacity to both encode information and perform functions. Think of toehold switches, aptamers, or guide RNAs. These tools let us sense molecular signals, regulate gene expression, and target nucleic acids.

At the same time, the design of functional RNA remains a challenge.

Why Is RNA Design So Hard?

Despite the simplicity of its building blocks (the 4 bases), RNA design is surprisingly tricky. That’s because:

  1. Function depends on both sequence and structure

    RNA’s biological activity depends not only on the nucleotide sequence but also on the structure the RNA folds into, which is shaped by base pairing and folding dynamics.

  2. Thermodynamic models are limited

    Classical tools like NUPACK or ViennaRNA rely on energy models to predict folding. And they are good at that! But they struggle to generalize across RNA classes, to handle complex RNAs, or simply to cope with limited data.

  3. Experimental design is resource-intensive

    To supplement these models and fix some of their issues, researchers often have to run large-scale screens. And, as you know, these are slow and expensive!

Merging RNA with Deep Learning

To tackle this design problem, the authors of today’s paper built a deep learning framework that can:

  1. Predict the function of structured RNAs using both their sequence and their secondary structure.

  2. Generate novel RNA sequences optimized for specific functions!

But let’s dig deeper into how they did it! It’s quite cool.

SANDSTORM: Predicting RNA Functions

First of all, super cool name.

SANDSTORM stands for Sequence AND STructure Of Rna Molecules, and it’s a dual-input convolutional neural network. This model takes two separate types of input, processes each one with its own neural network branch, and then merges the information for prediction!

In this case, one input is the RNA sequence, converted into a numerical encoding. The second is the RNA structure, represented as a base-pairing matrix. This architecture lets the network learn how sequence and structural context shape RNA behavior.
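
To make the two inputs concrete, here is a minimal sketch of how a sequence and a secondary structure could be turned into arrays. The one-hot encoding, the dot-bracket input, and the function names are assumptions for illustration; the paper's exact preprocessing may differ.

```python
# Illustrative sketch only: the encodings and names below are assumptions,
# not the paper's actual preprocessing.

BASES = "ACGU"

def one_hot(sequence):
    """Encode an RNA sequence as a list of 4-element rows (A, C, G, U)."""
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]

def pairing_matrix(dot_bracket):
    """Build an L x L matrix with 1 where positions i and j are base-paired.

    The structure is given in dot-bracket notation, e.g. "((...))".
    """
    n = len(dot_bracket)
    matrix = [[0] * n for _ in range(n)]
    stack = []  # indices of opening brackets that are still unmatched
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            i = stack.pop()
            matrix[i][j] = matrix[j][i] = 1
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return matrix

seq = "GGGAAACCC"          # made-up hairpin sequence
structure = "(((...)))"    # its assumed secondary structure
seq_input = one_hot(seq)                  # 9 rows x 4 columns
struct_input = pairing_matrix(structure)  # 9 x 9 symmetric matrix
```

Each array would then feed its own branch of a dual-input network, with the branch outputs merged before the final prediction layer.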

This new model has a few strengths:

  • Lightweight: It has relatively few trainable parameters, which results in faster training, less overfitting, and ease of use on low-powered hardware.

  • Generalizability: SANDSTORM works across RNA types: toehold switches, 5’ UTRs, ribosome binding sites, and CRISPR guide RNAs.

  • Interpretability: The model learns interpretable structural motifs, something that sequence-only models struggle to do!

But function prediction was not their end goal: they wanted to design new RNA!

GARDN: Generating RNA with AI

So, the team created GARDN: the Generative Adversarial RNA Design Network. GARDN has a less cool name than SANDSTORM, but it makes up for it in concept. GARDN is a GAN-based model, capable of producing realistic RNA sequences across different functions and design constraints. But what is a GAN?

What Is a Generative Adversarial Network (GAN)?

Yes, GAN stands for generative adversarial network. This deep learning architecture is designed to generate realistic, high-quality synthetic data. And I think it’s one of the coolest concepts in machine learning.

A GAN consists of two neural networks that compete against each other, like in a game:

  1. Generator: This network creates fake data.

  2. Discriminator: This second one tries to tell real sequences from fake ones.

So, the generator proposes new RNA sequences, and the feedback from the discriminator (or from SANDSTORM) guides it towards better designs. Over time, both networks get better. The result is RNA sequences that are structurally realistic and functionally optimized!
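
The adversarial game can be written down as two loss functions. The sketch below uses the standard binary cross-entropy GAN losses as a generic illustration; GARDN may well use a different loss variant, and the discriminator scores are made-up numbers.

```python
import math

def discriminator_loss(scores_real, scores_fake):
    """Binary cross-entropy: low when real samples score high and fakes score low.

    Scores are probabilities in (0, 1) that a sample is real.
    """
    real_term = sum(-math.log(s) for s in scores_real) / len(scores_real)
    fake_term = sum(-math.log(1.0 - s) for s in scores_fake) / len(scores_fake)
    return real_term + fake_term

def generator_loss(scores_fake):
    """Non-saturating generator loss: low when the discriminator accepts fakes."""
    return sum(-math.log(s) for s in scores_fake) / len(scores_fake)

# Hypothetical discriminator outputs early and late in training.
real = [0.90, 0.85, 0.95]
early_fake = [0.05, 0.10, 0.08]   # fakes are easy to spot
late_fake = [0.45, 0.50, 0.55]    # fakes look almost real

print(generator_loss(early_fake))  # larger: the generator is losing the game
print(generator_loss(late_fake))   # smaller: the generator has improved
```

As the generator's sequences become harder to tell apart from real ones, its loss falls while the discriminator's loss rises, which is the push and pull described above.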

GANs are a powerful tool when exploring complex design spaces with many constraints and hidden rules, like RNAs! And GARDN worked quite well. It outperformed traditional design tools and it didn’t need post-hoc fixes, showing consistency!

Experimental Validation

I like papers that have a computational part and an experimental validation. It’s just satisfying. The team tested their new models in 2 experimental scenarios, with the second one being the more interesting!

Toehold Switches

Toehold switches are synthetic RNA-based gene regulators, and they work like programmable “off/on” switches for translation:

  • OFF state: the RNA folds and hides the ribosome binding site (RBS) and start codon, blocking translation

  • ON state: when a trigger RNA binds to a specific region, the RNA unfolds, exposing the RBS and start codon and turning translation on

They are used in synthetic biology and diagnostics to detect specific RNA sequences, for example viral RNA.
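
Trigger binding relies on base-pair complementarity, so a quick sanity check for a switch design is whether it contains the reverse complement of its trigger. Here is a simplified sketch with made-up sequences and hypothetical function names. It considers only Watson-Crick pairs and ignores G-U wobble pairs and folding energetics.

```python
# Simplified illustration: sequences and function names are hypothetical.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (Watson-Crick only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))

def trigger_site(switch, trigger):
    """Return the index where the trigger can pair with the switch, or -1."""
    return switch.upper().find(reverse_complement(trigger))

trigger = "AUGGCUAC"                           # made-up trigger RNA
switch = "GG" + "GUAGCCAU" + "AACAGAGGAGA"     # made-up switch with a binding site

print(trigger_site(switch, trigger))           # position of the binding region
```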

Using GARDN, the team generated new toehold switches and tested them in E. coli. Some showed up to 11× higher ON/OFF ratios than designs from classical tools.

When GARDN and SANDSTORM were combined, they created switches that were 28× better than traditional methods, and even outperformed the best sequences in the training dataset!
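
For context, an ON/OFF ratio is the reporter signal with the trigger present divided by the signal without it. A small sketch with hypothetical fluorescence readings (not data from the paper):

```python
def on_off_ratio(on_signal, off_signal):
    """Fold change between the triggered (ON) and untriggered (OFF) states."""
    if off_signal <= 0:
        raise ValueError("OFF signal must be positive")
    return on_signal / off_signal

# Hypothetical readings in arbitrary fluorescence units.
leaky_switch = on_off_ratio(on_signal=1200.0, off_signal=400.0)
tight_switch = on_off_ratio(on_signal=1200.0, off_signal=50.0)
```

With the same ON signal, the switch that leaks less in the OFF state gets the much higher ratio, which is why hiding the RBS well matters so much.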

Aptaswitches

Aptaswitches are similar to toehold switches. They have an RNA hairpin coupled to an aptamer sequence, with the hairpin sequence complementary to a target DNA. When the target binds, the aptamer changes shape and activates fluorescence!

In this case, the researchers were working with a small dataset of only 384 training examples. They created aptaswitches to detect a SARS-CoV-2 gene. The combined GARDN-SANDSTORM model identified aptaswitches with ON/OFF ratios up to 161×! This shows that the model generalizes well even when data are scarce, and it still outperforms thermodynamic models in speed and accuracy.

Conclusion

Well, this was a dense paper! It’s clear the authors spent a lot of time on this project. And as I said, I always enjoy a paper that backs up its computational side with experimental validation.

I think the strongest parts of the new models are:

  • Dual input network: I was surprised to learn that not many models integrate structural data. I guess it makes sense: there is still a lot of basic research to do on RNA structures.

  • Generalizability: This is cool, because it makes it easier to use a single model instead of a thousand of them.

  • Low-data compatibility: The network works well with small datasets, which is very important in biology, where data is often limited (think emerging pathogens or novel RNAs).

Tools like this can open the door to more efficient and scalable RNA design in:

  • Synthetic biology: To create improved gene regulators, biosensors, and more.

  • Diagnostics: For rapid development of tests for infectious diseases.

  • Therapeutics: Of course, we could create better RNA sequences for gene therapy or other RNA drugs.

So, go and read the paper for yourself here! I liked it.

If you made it this far, thank you! What do you think about AI-based RNA design? Have you read something similar and I missed it? Reply and let me know!

P.S.: Know someone who will like this? Share it with them! They will thank you.

More Room:

  • Gold And Protein: A Match in the Lab. I kinda like gold nanoparticles: they are useful! And what about combining them with proteins? This review highlights how protein templates enable precise assembly of gold nanoparticles (AuNPs) into 1D, 2D, and 3D structures, enhancing their plasmonic, catalytic, and mechanical properties. It covers recent strategies, interaction types, and applications in sensing, energy, and biomedicine, while noting key challenges like scalability and stability. Cool!

  • Metal DNA Crystals: Maybe tomorrow’s electronics will be made from DNA, maybe not. In the meantime, this study introduces a method to tune DNA’s electronic properties using metal-mediated base pairs (mmDNA). By exchanging Ag⁺ and Hg²⁺ in DNA crystals through pH changes, the researchers demonstrate environmentally controlled transmetalation that alters DNA’s band structure and conductance. Time-resolved X-ray diffraction and computational analysis reveal significant shifts in LUMO conductance, laying the groundwork for rewritable memory and DNA-based nanoelectronics.

  • Second Life for Defective DNA Origami: Nah, not really. But this study is pretty cool. It examines how structural defects affect site accessibility on DNA origami, using a six-helix bundle (6HB) structure and variants with missing staple strands. Using super-resolution microscopy and molecular dynamics simulations, the researchers show that addressable sites remain robustly accessible despite these defects. The findings offer important design insights for using DNA origami in nanomedicine and nanophotonics.
