AI-Designed Lipid Nanoparticles: Machine Learning Boosts RNA Medicine!
Machine learning steps in to craft better lipid carriers for RNA medicines
The COVID vaccine changed the game for RNA drugs, showing their power to the world. But their success is built on lipid nanoparticles: can machine learning improve their design? Let’s find out today!
Lipids Go Digital

Researchers developed COMET, a machine learning model capable of predicting the transfection efficacy of existing lipid nanoparticles and creating new, improved formulations. Image credits: Nature.
RNA Drugs: A Success Built on Lipids
RNA drugs have exploded in the last few years.
The first successes go back to the 2010s, but the COVID-19 vaccines truly supercharged their fame. And for good reasons! They proved they worked. After the pandemic, lots of RNA drugs are going through clinical trials for cancer, infections, and rare diseases.
But RNA doesn’t come without limitations. Especially messenger RNA (mRNA, the one used for vaccines): it’s large, heavy, and easily degraded. That is why the success of RNA drugs has relied heavily on lipid nanoparticles (LNPs).
These nanoscale lipid particles can encapsulate the RNA, protecting it from enzymes and avoiding degradation. Plus, LNPs help with transfection, the process of inserting RNA into the cell, so it can be expressed.
There is a whole zoo of LNPs! You can have different lipids, different ratios between them, or even add polymers, DNA, or proteins (for targeting). These details change the function of the LNPs: which cells they target, how efficiently they deliver RNA, and how stable they are.
So, it would be great to design LNPs that target specific organs or cell types!
Just one problem: the formulation space (the possible combinations of lipids, their ratios, and other parameters) is huge. Optimizing LNP formulations through experiments alone is slow and expensive, and it only ever covers a small part of that space.
So, there is a need for better ways!
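To get a feel for how quickly the formulation space grows, here is a back-of-the-envelope count in Python. All the option counts below are made-up numbers for illustration, not figures from the paper, and the sketch assumes you pick exactly one option per design variable.

```python
from math import prod

# Hypothetical option counts, for illustration only (not from the paper).
options = {
    "ionizable_lipid": 50,
    "helper_lipid": 10,
    "sterol": 5,
    "peg_lipid": 5,
    "molar_ratio": 13,
    "process_setting": 8,
}


def formulation_space_size(option_counts):
    """Number of distinct formulations if one option is picked per variable."""
    return prod(option_counts.values())


print(f"{formulation_space_size(options):,} possible formulations")
# prints: 1,300,000 possible formulations
```

Even with these modest numbers you are already past a million candidates, and every extra variable multiplies the total again.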
Machine Learning Meets Lipids
And here is where today’s paper comes in!
The authors created 2 key tools, bringing together machine learning and lipid chemistry:
LANCE: A large, high-throughput experimental dataset of LNP formulations and their measured transfection efficacy.
COMET: A transformer-based model that represents an entire LNP formulation (molecular structures + compositions + other variables) and predicts transfection efficacy end-to-end.
Together, they aim to accelerate the discovery of high-performing LNPs!
LANCE: The Dataset
The team first built LANCE, the biggest formulation dataset ever.
It contains over 3,000 unique LNPs, each evaluated for its transfection efficacy in two mouse cell lines (DC2.4 and B16-F10, for the curious). The data were generated using automated mixing and luciferase mRNA, with bioluminescence as a readout!
The dataset captured the main design variables for LNPs:
Lipid identity: Which ionizable lipids, helper lipids, sterols, and PEG lipids are used.
Molar ratios: They used 13 different commonly used ratios.
Process parameters: Like the aqueous/organic phase ratio, the nitrogen/phosphate (N/P) ratio, and the lipid:mRNA weight ratio.
The library spans broad chemistries and mixing conditions to reveal context-dependent effects and synergies. Often, transfection efficacy is a multifactorial affair! And it’s hard to know what will influence it.
For example, some formulations were highly effective in transfecting DC2.4, but not B16-F10, and vice versa. But some were great at transfecting both! Go figure.
COMET: A Performing Model
Well, I guess that is the model’s job.
The team developed COMET (a great name short for Composite Material Transformer), a transformer-based model capable of representing different aspects of LNPs as tokens:
Molecular embeddings for the structure of the lipids.
Composition encodings represent the molar percentage for each component.
Formulation tokens represent additional formulation features, such as aqueous/organic phase or lipid:mRNA weight ratios.
The model learns to connect LNP chemistry and formulation to transfection efficacy!
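If you are wondering what "representing an LNP as tokens" could look like, here is a minimal sketch in plain Python. It is not COMET's actual architecture: the embedding function, the token width, the lipid names, and the scaling constants are all hypothetical stand-ins. It only shows the idea of one token per lipid (structure + molar percentage) plus one token per process parameter.

```python
import random

EMBED_DIM = 8  # hypothetical embedding width


def fake_molecular_embedding(name, dim=EMBED_DIM):
    """Stand-in for a learned structure embedding: a fixed pseudo-random vector per name."""
    rng = random.Random(name)  # string seeds give reproducible vectors
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def component_token(name, molar_percent):
    """One token per lipid: structure embedding plus its molar fraction."""
    return fake_molecular_embedding(name) + [molar_percent / 100.0]


def formulation_token(value, scale):
    """One token per process parameter: empty embedding slots plus a scaled value."""
    return [0.0] * EMBED_DIM + [value / scale]


def tokenize(lipids, process):
    """Turn a formulation into a list of equal-length tokens."""
    tokens = [component_token(name, pct) for name, pct in lipids.items()]
    tokens += [formulation_token(value, scale) for value, scale in process.values()]
    return tokens


# Hypothetical formulation: four lipids (molar %) and two process parameters (value, scale).
lipids = {"ionizable_lipid_A": 50.0, "helper_lipid_B": 10.0, "sterol_C": 38.5, "peg_lipid_D": 1.5}
process = {"aqueous_organic_ratio": (3.0, 10.0), "lipid_mrna_weight_ratio": (10.0, 100.0)}

tokens = tokenize(lipids, process)
print(len(tokens), len(tokens[0]))  # prints: 6 9
```

A transformer would then attend over these six tokens and output a single predicted efficacy; that part is left out here.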
And how did it perform?
COMET achieved a high correlation between predicted and measured efficacy, showing strong performance! In addition, it correctly ranked unseen “top-performing” formulations, with high reliability!
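For the curious, here is a rough sketch of how one might score a model on "correlation" and "ranking the top performers". I am assuming a Spearman rank correlation and a simple top-k overlap here; the exact metrics used in the paper may differ, and the numbers below are invented toy data.

```python
def ranks(values):
    """Rank positions (0 = smallest); assumes no ties, to keep things simple."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(predicted, measured):
    """Spearman rank correlation for tie-free data."""
    n = len(predicted)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(predicted), ranks(measured)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def top_k_overlap(predicted, measured, k):
    """Fraction of the truly best k formulations that the model also puts in its top k."""
    top_pred = set(sorted(range(len(predicted)), key=lambda i: -predicted[i])[:k])
    top_true = set(sorted(range(len(measured)), key=lambda i: -measured[i])[:k])
    return len(top_pred & top_true) / k


# Invented toy data: model scores vs. measured luminescence for five formulations.
predicted = [0.9, 0.2, 0.7, 0.4, 0.1]
measured = [8.1, 1.0, 9.5, 3.2, 2.4]

print(round(spearman(predicted, measured), 3))  # prints: 0.8
print(top_k_overlap(predicted, measured, k=2))  # prints: 1.0
```

In this toy case the model gets the order slightly wrong overall, but it still finds both of the two best formulations, which is what matters most when you can only test a few candidates.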
From Predictions to Discoveries
Now for the coolest part.
Using COMET, the team screened a virtual library of nearly 50 million LNP formulations!! They wanted to discover new, effective formulations beyond the ones present in LANCE.
From these, they selected and tested a small subset of top-ranked candidates. The results? Most of these outperformed clinically approved lipids in transfection efficacy!
And when tested on existing formulations, COMET was able to improve them, with 5/6 optimized formulations outperforming the starting ones.
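The virtual screening step boils down to "score everything, keep the best few". Here is a minimal Python sketch of that loop. The predictor is a made-up toy function standing in for a trained model, and the lipid names, molar percentages, and N/P ratios are hypothetical values, not the paper's library.

```python
import heapq
from itertools import product

# Hypothetical per-lipid base scores for the toy predictor.
BASE_SCORE = {"lipid_A": 0.2, "lipid_B": 0.5, "lipid_C": 0.9}


def toy_predictor(formulation):
    """Made-up stand-in for a trained model; returns an arbitrary efficacy score."""
    lipid, ionizable_percent, np_ratio = formulation
    return BASE_SCORE[lipid] + 0.01 * ionizable_percent - 0.05 * abs(np_ratio - 6)


def screen(library, predictor, k):
    """Return the k highest-scoring formulations without sorting the whole library."""
    return heapq.nlargest(k, library, key=predictor)


# Lazily enumerate every combination: 3 lipids x 2 molar percentages x 3 N/P ratios = 18.
library = product(BASE_SCORE, [35, 50], [4, 6, 8])

top = screen(library, toy_predictor, k=3)
for candidate in top:
    print(candidate)
```

Because `product` is lazy and `heapq.nlargest` only keeps k items in memory, the same pattern scales to much larger libraries; the candidates it returns are the ones you would then take to the bench.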
Beyond Lipids: Generalizability and in Vivo Validation
The team tested whether COMET could generalize its learnings to other tasks. They focused on 3:
Polymers: The model was retrained on a small polymer dataset (around 450 samples), and COMET identified high-performing hybrid polymer-LNPs!
New cell lines and payloads: COMET could be adapted to new cell lines or payloads (interleukin instead of luciferase), with just a bit more data.
Lyophilization stability: Lyophilization (freeze-drying) is useful for storage at room temperature, but LNPs lose efficacy after it. COMET, after training on a small dataset, was able to predict post-lyophilization efficacy loss.
But they didn’t stop here.
The team tested some of their LNP formulations in vivo. The LNPs were injected into mice, delivering mRNA encoding luciferase.
Some of the LNPs discovered by COMET showed 40x higher luminescence than one of the clinical benchmarks, and 5x higher than the other!
But some others didn’t do as well. The authors highlight that in vitro success doesn’t translate perfectly in vivo, since cell lines don’t capture the whole response!
From ML to LNPs
Cool work!
Also, a well-written paper, clear even to an ML noob like me. I think that ML shines in situations where interactions between components are non-intuitive and hard to predict.
COMET and LANCE together provide a practical and powerful tool for formulation design. They enable the rapid screening of candidates and reduce the need for experimental validation. Like with ML-based protein design, the aim is to run fewer experiments, at least for now!
I feel like the main limitation is the weak correlation between in vitro and in vivo efficacy, but that’s a general problem! With time, I’m sure COMET will get even more powerful.
So, go and read all the details here!
If you made it this far, thank you! What do you think of RNA drugs? Have you worked with LNPs? Do you think other systems are more promising? Reply and let me know!
P.S: Know someone interested in drug delivery? Share this with them!
More Room:
DNA Origami + Oligolysine = Love. DNA origami struggles with stability in biological conditions, and oligolysine has always been there to help. This study presents PEG-grafted oligolysine coatings that protect DNA nanostructures while preserving receptor binding. Screening 36 formulations revealed coatings that enhanced stability 30-fold and increased binding specificity 12-fold compared to standard materials. Optimal designs required coordinated tuning of multiple parameters, offering a strategy to create biostable, receptor-targeted DNA nanodevices for biomedical applications.
Controlling Plasmonics with DNA: I’m slowly learning more about plasmonics and DNA origami, so you should too. This work introduces a high-throughput time-resolved circular differential scattering method to monitor the real-time dynamics of single chiral plasmonic metamolecules during DNA-driven conformational switching between enantiomeric states. By tracking hundreds of individual structures simultaneously, the team measured a transition path time of 123.7 ms and showed that longer DNA linkers (8–14 nucleotides) increase structural stability. This work provides new insights into engineering and controlling dynamic plasmonic nanostructures for smart, reconfigurable systems.
Targeting bacteria with DNA Origami: Yes, full DNA origami today. This study presents a new strategy for targeting bacteria with DNA origami nanostructures (DONs) by modifying them with the antibiotic vancomycin, instead of traditional aptamers. Using click chemistry, vancomycin was attached to DON triangles, enabling specific binding to both Gram-positive (B. subtilis, S. capitis) and even Gram-negative (E. coli) bacteria, without showing antimicrobial activity. DONs modified on both sides bound more strongly to some species, highlighting the role of spatial design. This approach showcases small-molecule–based targeting as a robust, broad-spectrum alternative for antimicrobial DNA nanodevices and infection treatment.