AI speeds up enzyme engineering

Plus: mammalian transposons and more!

Welcome to Plenty of Room!

Today we are jumping back into enzyme engineering, with a machine-learning speed boost!

If you like this newsletter, don’t forget to share with your friends and colleagues: sharing is caring after all!

Plenty of Room is your guide to the cutting-edge news related to molecular machines.

New here? Just go ahead and subscribe!

Let’s get into it now.

AI speeds up enzyme engineering

Researchers combined ML-driven predictions with cell-free protein engineering to dramatically accelerate enzyme optimization. Image credits: resou.osaka-u.ac.jp

Enzymes are the catalysts of the biological world, accelerating reactions and practically allowing life to exist: without them, the metabolic reactions in a cell would be too slow to sustain life. But in the modern world, they also play a crucial role in sustainable chemistry, drug production and biotechnology. Natural enzymes are not optimized for these applications, and scientists often have to modify and fine-tune them to meet their needs. Enzyme engineering typically relies on site saturation mutagenesis, which swaps in every alternative amino acid at key positions, and directed evolution, an iterative process of creating random mutations and selecting the best ones. Unfortunately, these methods are slow and constrained by low-throughput screening techniques, so they explore only a narrow part of the possible design space and limit efficient optimization.
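
To get a feel for how big that design space is, here is a quick back-of-the-envelope sketch. It assumes 19 possible substitutions per position (the 20 standard amino acids minus the original one). The number of positions (64) is a made-up value for illustration only, and `count_variants` is a hypothetical helper name.

```python
from math import comb

# Assumption: 20 standard amino acids, minus the wild-type residue at each position.
N_ALTERNATIVES = 19


def count_variants(n_positions: int, n_mutations: int) -> int:
    """Number of distinct variants carrying exactly n_mutations substitutions.

    Choose which positions are mutated, then pick one of the 19
    alternative amino acids at each chosen position.
    """
    return comb(n_positions, n_mutations) * N_ALTERNATIVES ** n_mutations


if __name__ == "__main__":
    n_positions = 64  # hypothetical number of targeted residues
    for k in (1, 2, 3):
        print(f"{k} mutation(s): {count_variants(n_positions, k)} variants")
```

Single mutants grow linearly with the number of positions, but every extra simultaneous mutation multiplies the count again, which is why exhaustive screening stops being realistic very quickly.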

So, to address these challenges and develop better enzymes, the authors of today’s paper turned to machine learning (ML). ML has emerged as an amazing tool in protein engineering, predicting the fitness landscape of enzymes and designing optimized proteins. But there is still a key challenge: building high-quality training datasets that accurately capture enzyme function and activity across a wide sequence space. The researchers focused on McbA, an ATP-dependent amide synthetase involved in secondary metabolite biosynthesis. The goal was to engineer McbA into specialized variants capable of catalyzing a diverse set of chemical reactions, including pharmaceutically relevant amidation reactions. We can divide the new approach into two parts:

  1. Cell-free protein engineering

    A major innovation in this paper is the heavy use of cell-free protein expression systems. Traditionally, enzyme variants have to be expressed inside living cells, which takes days or weeks per iteration and is low throughput, since screening large libraries is slow. Enter cell-free expression: it allows rapid expression and screening (hours instead of days), more direct control over reaction conditions, and better scalability, since multiple enzyme variants can be tested in parallel (which also improves cost efficiency). How does it look in practice? The authors start by rationally selecting residues in the enzyme that are involved in catalysis, using structural insights, evolutionary trends and computational tools. After that, site saturation mutagenesis and PCR-based DNA synthesis generate the variability in the enzyme sequence; each variant is then synthesized in the cell-free system and tested in high-throughput functional assays. This approach allowed the researchers to generate data for over 1200 variants of McbA, screening over 10000 reactions! Crazy.

  2. ML-guided enzyme prediction

    After collecting enough data from the cell-free systems, the researchers used machine learning to speed up the engineering of McbA. The idea is to use the single-mutant data they obtained before to extrapolate to variants with two or more mutations, which will hopefully have better activity. The researchers trained a ridge regression model (a linear model with a penalty on large coefficients, useful for problems with many inputs, like the many mutations in this case) on the experimental data, and then used it to predict which combinations of mutations would create optimized enzymes!
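
For the curious, here is a minimal sketch of how single-mutant data can be extrapolated to double mutants with ridge regression. It is a simplified illustration and not the authors' actual pipeline. It assumes one-hot mutation features, no intercept, and a log fold-change over wild-type as the target. Under those assumptions, and with exactly one mutation per training variant, the usual ridge solution simplifies to a shrunken per-mutation average. The mutation names, activity values and function names are all made up.

```python
from collections import defaultdict
from itertools import combinations


def fit_ridge_single_mutants(measurements, lam=1.0):
    """Fit ridge weights from single-mutant data.

    measurements: list of (mutation, log_fold_change) pairs, e.g. ("A10G", 0.8).
    With one-hot features, no intercept and one mutation per row, X^T X is
    diagonal, so the ridge solution (X^T X + lam * I)^-1 X^T y reduces to
    w_j = sum(y for mutation j) / (n_j + lam).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for mutation, y in measurements:
        sums[mutation] += y
        counts[mutation] += 1
    return {m: sums[m] / (counts[m] + lam) for m in sums}


def position(mutation):
    """Residue index of a mutation written like 'A10G'."""
    return int(mutation[1:-1])


def rank_double_mutants(weights, top=3):
    """Score pairs of mutations at different positions with the additive model."""
    scored = []
    for a, b in combinations(sorted(weights), 2):
        if position(a) != position(b):  # one residue cannot carry two substitutions
            scored.append((weights[a] + weights[b], (a, b)))
    scored.sort(reverse=True)
    return scored[:top]


if __name__ == "__main__":
    # Hypothetical measurements: (mutation, log fold-change vs. wild type)
    data = [("A10G", 1.0), ("A10G", 0.8), ("A10S", -0.5),
            ("L25F", 0.6), ("T40V", 0.1)]
    weights = fit_ridge_single_mutants(data, lam=1.0)
    for score, pair in rank_double_mutants(weights):
        print(pair, round(score, 3))
```

The penalty `lam` pulls poorly supported mutations toward zero, so a single lucky measurement counts for less than a consistently good one. A purely additive model like this cannot capture interactions between mutations, so its top predictions still need to be tested experimentally.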

Okay, how did the workflow perform? The researchers tested it on nine different amide bond formation reactions, prioritizing reactions relevant to pharmaceuticals such as moclobemide, metoclopramide, and cinchocaine. ML-guided enzyme variants showed improvements in activity over wild-type McbA ranging from 1.6-fold to a crazy 42-fold. In addition, traditional mutagenesis failed for some of the products, while the ML-guided engineering managed to improve on the wild-type enzyme! Overall, ML-guided enzyme designs consistently outperformed traditional approaches.

This paper was very much focused on speeding up enzyme engineering, using cell-free expression systems and ML-guided engineering, while also improving the success rate. I found it very interesting, and I can see it having broader applications, with some modifications:

  • Biocatalysis: Engineered enzymes could replace chemical catalysts in sustainable pharmaceutical synthesis.

  • Materials Science: New enzyme-based polymerization reactions could be developed.

  • Synthetic Biology: Optimized biocatalysts could be integrated into engineered metabolic pathways.

So don’t miss this cool paper here! And thank you as usual for reading!

In other news:

  • Separating cells with DNA: Scientists separate cells on a daily basis, to study them or for treatments. Traditional methods rely on physical properties (size, density, deformability) or affinity-based techniques (antibodies, magnetic beads), but why not use DNA? This review highlights DNA nanomaterials as a powerful alternative for highly specific and tunable cell separation. It looks very interesting!

  • pH is calling, DNA is answering: Are you missing some DNA computing? Well then, this study presents a pH-controlled DNA switching circuit that dynamically regulates logic operations and hybridization chain reactions (HCRs) across three states. By leveraging triplex DNA structures, it simplifies control over DNA reaction networks, with applications in biosensing, disease detection, and drug delivery.

  • Transposing into mammals: I wasn’t sure there were transposons in mammals, and apparently for good reason: piggyBat is the only known active mammalian transposon, and it’s not very good at its job. This study investigates why it has such low transposition activity, which is apparently caused by subterminal inhibitory DNA sequences. Pretty cool!
