AIb2.io - AI Research Decoded

The Algorithm Found the Sulfur in the Soup

Swap one herb in a recipe and dinner gets brighter; swap the wrong one and suddenly everyone is politely “not that hungry.” Su and colleagues basically ran that kitchen experiment at molecular scale, except the dish was an artificial oxidase, the seasoning was amino acids, and the chef was an active-learning algorithm with no patience for trying every spice in the cabinet.

The paper, published in Advanced Materials, asks a deceptively simple question: can machine learning help design small, enzyme-inspired catalysts by picking peptide sequences worth testing? According to the authors, they attached Fe(III)-protoporphyrin IX - the iron-porphyrin core related to heme chemistry - onto a lysine residue inside synthetic ten-amino-acid peptides. Then they used hydrogen peroxide as the oxidant and acetophenone as the model substrate, because chemistry papers also need a test kitchen before opening the restaurant.

The headline result: after testing 233 peptide variants over 20 active-learning rounds, the system kept circling back to sulfur-containing residues, especially cysteine and methionine, near the iron coordination site. The algorithm started with random inputs. It ended up rediscovering a trick nature already likes.

That is the story. But the numbers tell a more interesting one.

The Search Space Was the Suspect

A decapeptide sounds small until you remember there are 20 common amino acids. Ten positions means a theoretical sequence space of 20^10 possibilities, which is not “let’s screen that after lunch” territory. It is more like asking a barista to taste every possible coffee order in Manhattan, including the crimes involving oat milk and six pumps of mystery syrup.
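To put a number on that, here is a quick back-of-the-envelope calculation. It uses only the figures quoted in this post (20 amino acids, 10 positions, 233 tested variants). It treats every position as free, so it is an upper bound: if any position is held fixed, such as the lysine that anchors the porphyrin, the real space is smaller.

```python
# Upper-bound size of the decapeptide sequence space versus the tested set.
N_AMINO_ACIDS = 20    # common proteinogenic amino acids
PEPTIDE_LENGTH = 10   # a decapeptide has ten positions
N_TESTED = 233        # variants tested, as quoted in this post

total_sequences = N_AMINO_ACIDS ** PEPTIDE_LENGTH
fraction_tested = N_TESTED / total_sequences

print(f"{total_sequences:,} possible sequences")   # 10,240,000,000,000
print(f"fraction tested: {fraction_tested:.2e}")   # 2.28e-11
```

Roughly ten trillion sequences, of which the campaign touched about two in every hundred billion.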

Active learning tries to avoid that mess. Instead of training a model once and hoping it becomes a tiny oracle, researchers run a loop: test some candidates, train on the results, let the model choose the next candidates, test again, repeat. Wikipedia’s plain-language definition says active learning lets an algorithm query an information source for new labels, which in chemistry means the “oracle” is usually an experiment that costs time, reagents, and someone’s afternoon.
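The loop described above can be sketched in a few lines of Python. This is a toy illustration, not the authors' pipeline: `toy_oracle` is a made-up stand-in for the wet-lab assay, the surrogate is a deliberately crude per-position average, and selection is purely greedy. All function names, the scoring rule, and the batch sizes are assumptions for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LENGTH = 10

def toy_oracle(seq):
    """Hypothetical stand-in for an experiment: rewards C or M at index 4."""
    return (1.0 if seq[4] in "CM" else 0.0) + 0.1 * seq.count("H")

def fit_additive_model(labeled):
    """Surrogate: mean measured activity for each (position, residue) pair."""
    sums, counts = {}, {}
    for seq, activity in labeled.items():
        for key in enumerate(seq):          # key is (position, residue)
            sums[key] = sums.get(key, 0.0) + activity
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

def predict(model, seq, default):
    """Average per-position estimates; unseen pairs fall back to `default`."""
    return sum(model.get(key, default) for key in enumerate(seq)) / len(seq)

def random_peptide(rng):
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(LENGTH))

def active_learning(rounds=5, batch=8, pool_size=500, seed=0):
    rng = random.Random(seed)
    labeled = {}
    # Round 0: a random starting batch, "measured" by the oracle.
    while len(labeled) < batch:
        seq = random_peptide(rng)
        labeled[seq] = toy_oracle(seq)
    for _ in range(rounds):
        # Train on everything measured so far.
        model = fit_additive_model(labeled)
        default = sum(labeled.values()) / len(labeled)
        # Propose untested candidates (deduplicated, order preserved).
        candidates = (random_peptide(rng) for _ in range(pool_size))
        pool = list(dict.fromkeys(s for s in candidates if s not in labeled))
        # Greedy choice: test the candidates the model likes best.
        pool.sort(key=lambda s: predict(model, s, default), reverse=True)
        for seq in pool[:batch]:
            labeled[seq] = toy_oracle(seq)
    return labeled

results = active_learning()
print(len(results), "peptides tested")
```

The point of the sketch is the shape of the loop: every call to the oracle is expensive, so the model's only job is to decide which few calls are worth making next. Real campaigns usually add an exploration term rather than choosing greedily.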

This is why the Su paper matters. It is not just “AI finds catalyst,” the phrase that makes every skeptical chemist reach for coffee. It is a closed-loop workflow where each experimental result changes what gets tested next.

Nature Had Left Fingerprints

The suspicious clue was sulfur.

In natural heme proteins and oxidases, the environment around the metal center can matter as much as the metal itself. Heme is not just an iron atom wearing a fancy porphyrin hat; it is a redox-active coordination complex whose nearby protein residues tune its behavior. Biology does not simply toss iron into a molecule and hope. It builds a neighborhood.

Su and colleagues report that cysteine and methionine improved catalytic activity when placed adjacent to the coordination site. Cysteine contains a thiol group. Methionine contains a thioether. Both bring sulfur, but with different personalities: cysteine is the bolder one at the party, methionine is quieter but still knows where the good snacks are.

The twist is that methionine’s thioether also promoted catalysis, extending the sulfur story beyond the most obvious natural motif. That matters because a simplified synthetic peptide scaffold is not a full enzyme. It is more like an enzyme’s studio apartment: fewer rooms, less furniture, but maybe enough layout to do useful work.

What the Machine Actually Contributed

The machine-learning claim here should be judged carefully. The model did not prove a mechanism by itself. It did not become a chemist in a lab coat whispering “try methionine” under fluorescent lights. What it did was prioritize experiments in a large sequence space and expose a pattern that statistical analysis could then interrogate.

That is the right use of active learning: not replacing chemistry, but making the search less wasteful.

Recent work points in the same direction. Yang and colleagues showed active learning-assisted directed evolution could improve an enzyme-catalyzed reaction yield from 12% to 93% in three wet-lab rounds, using uncertainty to balance exploration and exploitation. Suvarna and colleagues used active learning to navigate catalyst composition and reaction conditions for higher alcohol synthesis, reporting a sharp reduction in experiments compared with a huge search space. Schnitzer and colleagues showed that machine-learning workflows can help identify peptide catalysts from large virtual libraries, while also spelling out the limitations. In other words, this field is growing up. It has moved from “what if we sprinkled AI on chemistry?” to “which experiments should we stop wasting?”
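"Using uncertainty to balance exploration and exploitation" can sound abstract, so here is one common way to do it: an upper confidence bound (UCB) score, which adds a bonus for candidates the model is unsure about. This is a generic sketch and not necessarily the acquisition rule used in any of the papers cited here. The candidate names, the three-model ensemble, and the `kappa` values are all invented for illustration.

```python
from statistics import mean, pstdev

def ucb_scores(ensemble_predictions, kappa=1.0):
    """Upper confidence bound: mean prediction plus kappa times the spread."""
    return {
        candidate: mean(preds) + kappa * pstdev(preds)
        for candidate, preds in ensemble_predictions.items()
    }

def select_batch(ensemble_predictions, batch_size, kappa=1.0):
    """Return the `batch_size` candidates with the highest UCB score."""
    scores = ucb_scores(ensemble_predictions, kappa)
    return sorted(scores, key=scores.get, reverse=True)[:batch_size]

# Hypothetical predicted activities from a three-model ensemble.
predictions = {
    "safe_bet":  [0.60, 0.60, 0.60],  # decent, and the models agree
    "long_shot": [0.10, 0.50, 0.90],  # lower mean, but the models disagree
    "dud":       [0.10, 0.10, 0.10],  # poor, and the models agree
}

print(select_batch(predictions, 1, kappa=0.0))  # ['safe_bet']  (pure exploitation)
print(select_batch(predictions, 1, kappa=1.0))  # ['long_shot'] (uncertainty bonus wins)
```

With `kappa=0` the selector only exploits what the model already believes. Raising `kappa` makes it spend experiments where the model disagrees with itself, which is where a measurement teaches the most.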

The Catch, Because There Is Always a Catch

The careful reading is this: 233 variants is impressive for a wet-lab campaign, but tiny compared with the full peptide universe. The substrate was a model system. The oxidant was hydrogen peroxide. The scaffold was deliberately simplified. Those choices make the study interpretable, but they also mean you should not assume the best motif works everywhere.

The stronger claim is narrower and more useful: active learning helped uncover sequence-position effects in an enzyme-inspired catalyst, and those effects echoed biology while adding a methionine-flavored wrinkle.

If this holds across broader substrates and reaction classes, the impact could be real. Smaller artificial catalysts could become easier to tune than full enzymes, easier to synthesize than complex proteins, and more compatible with automated closed-loop platforms. That could matter for green oxidation chemistry, specialty synthesis, and catalyst discovery workflows where the current method is still partly “graduate student with spreadsheet meets existential dread.”

The paper does not end the investigation. It gives researchers a better map, with sulfur circled in red.

References

  1. Pengkun Su, Yue Zhan, Yuming Su, Zhiye Wang, Suyang Chen, Yibin Jiang, Huihui Hu, and Cheng Wang. “Active Learning Identifies Sulfur-Based Enhancers for Fe(III)-Protoporphyrin Catalysis: Recapitulating Features of Natural Oxidase and Beyond.” Advanced Materials 38, e18756, 2026. DOI: 10.1002/adma.202518756. PMID: 42068197.

  2. Jason Yang, Ravi G. Lal, James C. Bowden, et al. “Active learning-assisted directed evolution.” Nature Communications 16, 714, 2025. DOI: 10.1038/s41467-025-55987-8. PMCID: PMC11739421.

  3. Manu Suvarna, Tangsheng Zou, Sok Ho Chong, et al. “Active learning streamlines development of high performance catalysts for higher alcohol synthesis.” Nature Communications 15, 5844, 2024. DOI: 10.1038/s41467-024-50215-1.

  4. Tobias Schnitzer, Martin Schnurr, Andrew F. Zahrt, Nader Sakhaee, Scott E. Denmark, and Helma Wennemers. “Machine Learning to Develop Peptide Catalysts - Successes, Limitations, and Opportunities.” ACS Central Science 10(2), 367-373, 2024. DOI: 10.1021/acscentsci.3c01284. PMCID: PMC10906243.

  5. Tom H. R. Kuster and Tobias Schnitzer. “Peptide catalysis: Trends and opportunities.” Chem Catalysis 5(5), 101339, 2025. DOI: 10.1016/j.checat.2025.101339.

Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.