Title: Generative AI-Driven Accelerated Discovery of Passivation Molecules for Perovskite Solar Cells
If you've ever tried to find the one molecule that fixes a specific defect in a perovskite solar cell, you know the pain. You're sifting through a chemical library the size of a phone book, running experiments one at a time, and praying that your intuition about functional groups doesn't waste another three months. Molecular passivation - the strategy of coating perovskite surfaces with molecules that plug defect sites, where charge carriers would otherwise get trapped and recombine - is one of the most effective ways to boost solar cell efficiency. But the discovery process? It's been stuck in trial-and-error purgatory since day one.
A team led by Adroit T. N. Fajar at Kyushu University just kicked the door open with a paper in Advanced Science that essentially says: what if we stopped guessing and let a language model do the chemistry brainstorming for us? (Fajar et al., 2026)
Teaching GPT-2 to Think in Molecules
Here's where it gets wild. The researchers took GPT-2 - yes, the text-generating model that was supposed to write your emails - and fine-tuned it on SMILES strings. SMILES is basically molecular shorthand: a way of writing chemical structures as text, like CC(=O)O for acetic acid. To a language model, molecules are just another language with weird grammar rules about valence and ring closure.
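To make the "molecules as text" idea concrete, here is a minimal sketch of splitting a SMILES string into tokens and checking two of those grammar rules (balanced branches, paired ring-closure digits). The regex follows a pattern commonly used for SMILES tokenization; it is an assumption for illustration, not necessarily the tokenizer used in the paper, and the function names are hypothetical. It does not check valence, which needs a real cheminformatics toolkit.

```python
import re

# Bracket atoms first, then two-letter halogens, two-digit ring labels,
# single-letter atoms (aromatic ones are lowercase), bonds/branches, ring digits.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOSPFIbcnosp]|[=#\-+\\/().:~@?>*$]|\d"
)


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; fail if any character is left over."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


def looks_well_formed(smiles: str) -> bool:
    """Syntax-only check: parentheses balance and every ring label is closed."""
    try:
        tokens = tokenize(smiles)
    except ValueError:
        return False
    depth = 0
    open_rings: set[str] = set()
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
        elif tok.isdigit() or tok.startswith("%"):
            # A ring label opens on first sight and closes on the second.
            open_rings ^= {tok}
    return depth == 0 and not open_rings


print(tokenize("CC(=O)O"))            # acetic acid, from the example above
print(looks_well_formed("c1ccccc1"))  # benzene ring, opened and closed
print(looks_well_formed("c1ccccc"))   # ring label 1 never closes
```

A generative model has to learn exactly this kind of bookkeeping from examples: a string that opens ring 1 and never closes it is as ungrammatical as a sentence missing its verb.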
First, they built a discriminative classifier (SMILES-X) trained on literature data to predict whether a given molecule would actually work as a passivator. It hit an F1 score of 0.80 and ROC-AUC of 0.88 - not perfect, but solid enough to separate the promising candidates from the molecular equivalent of background noise.
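If those two metrics are fuzzy, here is a small sketch of how each is computed. The labels and scores below are made-up toy values, not the paper's data, and the 0.5 decision threshold is an assumption. F1 balances precision against recall at one threshold; ROC-AUC asks how often a true passivator gets a higher score than a non-passivator, regardless of threshold.

```python
def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: list[int], scores: list[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = effective passivator, 0 = not (invented for illustration).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(f"F1 = {f1_score(labels, preds):.3f}")
print(f"ROC-AUC = {roc_auc(labels, scores):.3f}")
```

On this toy set the classifier makes one false positive and one false negative, which is roughly the flavor of "not perfect, but solid" that the reported numbers describe.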
Then came the generative model. Their fine-tuned GPT-2 churned out over 100,000 novel molecules, and the classifier predicted that over 80% of them would be effective passivators. That's not a typo. The model wasn't just randomly assembling atoms like a toddler with LEGO - it appears to have learned which molecular features actually matter for passivation.
From 100,000 Down to 3 (The Filtering Gauntlet)
Generating 100,000 candidates sounds impressive until you realize someone has to actually make these things in a lab. The team applied multi-criteria filtering - synthetic accessibility, drug-likeness scores, novelty checks - and whittled the pool down to roughly 8,000 high-quality candidates. Clustering analysis then identified ten structurally diverse representatives, and three molecules (including a surrogate analog) were selected for experimental validation.
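The funnel described above can be sketched in a few lines. Everything specific here is an assumption: the field names, the thresholds, and the toy scores are invented, and the diversity step uses greedy max-min selection over character bigrams as a simple stand-in for the paper's clustering on real chemical descriptors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    smiles: str
    p_passivator: float  # classifier probability (hypothetical field)
    sa_score: float      # synthetic accessibility, assumed 1 (easy) to 10 (hard)
    qed: float           # drug-likeness, assumed to lie in [0, 1]


def passes_filters(c: Candidate, known: set[str], *, min_p: float = 0.5,
                   max_sa: float = 4.0, min_qed: float = 0.5) -> bool:
    """Keep novel molecules that clear every threshold (thresholds are illustrative)."""
    return (c.smiles not in known and c.p_passivator >= min_p
            and c.sa_score <= max_sa and c.qed >= min_qed)


def bigrams(smiles: str) -> set[str]:
    """Crude structural fingerprint: the set of adjacent character pairs."""
    return {smiles[i:i + 2] for i in range(len(smiles) - 1)}


def tanimoto(a: set[str], b: set[str]) -> float:
    """Set overlap in [0, 1]; 1.0 means identical fingerprints."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_diverse(cands: list[Candidate], k: int) -> list[Candidate]:
    """Greedy max-min: start from the top-scored molecule, then repeatedly
    add the candidate least similar to anything already chosen."""
    if not cands or k <= 0:
        return []
    feats = {c.smiles: bigrams(c.smiles) for c in cands}
    chosen = [max(cands, key=lambda c: c.p_passivator)]
    while len(chosen) < min(k, len(cands)):
        remaining = [c for c in cands if c not in chosen]
        chosen.append(min(
            remaining,
            key=lambda c: max(tanimoto(feats[c.smiles], feats[s.smiles])
                              for s in chosen),
        ))
    return chosen


# Toy pool with invented scores, for illustration only.
pool = [
    Candidate("NCCCC(=O)O", 0.90, 2.0, 0.6),
    Candidate("NCCCCC(=O)O", 0.85, 2.1, 0.6),
    Candidate("OC(=O)c1ccccc1", 0.80, 1.5, 0.7),
    Candidate("CC(=O)O", 0.95, 1.2, 0.6),      # already known, so not novel
    Candidate("C1CC2CC1C2N", 0.70, 6.5, 0.6),  # fails the accessibility cutoff
]
kept = [c for c in pool if passes_filters(c, known={"CC(=O)O"})]
picks = pick_diverse(kept, k=2)
print([c.smiles for c in picks])
```

The point of the diversity step is to avoid spending lab time on near-duplicates: the two amino acids in the toy pool look almost identical to the fingerprint, so only one of them makes the cut.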
All three worked. Every single one showed a clear passivation effect when tested in inverted perovskite solar cells. The standout performer, 4-maleimidobutyric acid, bumped the average open-circuit voltage from 1.08 V to 1.12 V and pushed power conversion efficiency from 19.3% to 22.2%, while also reducing that annoying hysteresis that makes perovskite cells behave differently depending on which direction you scan the voltage.
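A quick back-of-the-envelope check on those numbers, using the standard relation PCE = Voc x Jsc x FF / Pin. The summary above does not report short-circuit current (Jsc) or fill factor (FF), so the last line is an inference from the averages, not a measured value, and averages of products only approximately factor this way.

```python
voc_before, voc_after = 1.08, 1.12  # average open-circuit voltage, V
pce_before, pce_after = 19.3, 22.2  # power conversion efficiency, %


def relative_gain(before: float, after: float) -> float:
    """Fractional change relative to the starting value."""
    return (after - before) / before


voc_gain = relative_gain(voc_before, voc_after)
pce_gain = relative_gain(pce_before, pce_after)

# Since PCE is proportional to Voc * Jsc * FF, whatever the Voc gain does not
# explain must come from the combined Jsc * FF term (rough estimate only).
jsc_ff_gain = (pce_after / pce_before) / (voc_after / voc_before) - 1

print(f"Voc:      +{voc_gain:.1%}")
print(f"PCE:      +{pce_gain:.1%} relative (+{pce_after - pce_before:.1f} points)")
print(f"Jsc x FF: +{jsc_ff_gain:.1%} (inferred)")
```

The voltage bump is under 4%, while efficiency rises about 15% in relative terms, so most of the gain has to come from current and fill factor rather than voltage alone.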
For context, perovskite-silicon tandems are now flirting with 35% efficiency (LONGi, 2025), and single-junction perovskites have cracked 27.3%. Every fraction of a percent matters, and passivation is one of the biggest levers researchers have.
Why This Isn't Just Another "AI Does Science" Paper
The AI-for-materials-discovery space is getting crowded. A recent Nature paper demonstrated a fully autonomous closed-loop framework for perovskite cells that discovered its own passivation molecule and hit 27.22% PCE (Nature, 2026). Wu et al. used Bayesian optimization to design hole-transport materials, reaching 26.2% PCE (Wu et al., Science, 2024; DOI: 10.1126/science.ads0901). And earlier ML-assisted screening work showed you could get meaningful predictions from as few as 19 data points (ACS Energy Lett., 2023; DOI: 10.1021/acsenergylett.2c02818).
What makes this paper stand out is the generative piece. Most prior work uses ML to screen existing libraries - basically a smarter search through a known catalog. Fajar et al. are generating entirely new molecules that don't exist in any database, then validating them experimentally. It's the difference between searching Netflix for something to watch and writing your own show.
The three-for-three experimental hit rate is also quietly remarkable. In pharmaceutical drug discovery, AI-generated candidates often have hit rates in the single digits. Getting 100% on your first experimental batch - even if it's a small batch - suggests the discriminative model is doing real work, not just statistical hand-waving.
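How much can three out of three actually tell you? A minimal sketch, assuming the three tests are independent draws with one underlying hit rate (a simplification, since the molecules were hand-picked after filtering). For the all-successes case, the exact Clopper-Pearson lower bound has a closed form, so no statistics library is needed. The function names are hypothetical.

```python
def prob_all_hits(true_rate: float, n: int) -> float:
    """Chance of n hits in n independent trials at a given true hit rate."""
    return true_rate ** n


def lower_bound_all_hits(n: int, confidence: float = 0.95) -> float:
    """Two-sided exact (Clopper-Pearson) lower bound on the hit rate
    when all n trials succeed: solve p**n = alpha / 2 for p."""
    alpha = 1.0 - confidence
    return (alpha / 2.0) ** (1.0 / n)


n = 3
print(f"P(3/3 | true rate 5%)  = {prob_all_hits(0.05, n):.6f}")
print(f"P(3/3 | true rate 50%) = {prob_all_hits(0.50, n):.3f}")
print(f"95% lower bound on hit rate = {lower_bound_all_hits(n):.3f}")
```

So a single-digit hit rate is very hard to square with three straight successes, but the data are still compatible with a true rate as low as roughly 30%. Remarkable, yes; proof of 100%, no.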
The Bigger Picture (It's Not Just Solar Cells)
The framework here is generalizable. Anywhere you need molecules with specific functional properties - battery electrolytes, catalysts, polymer additives - the same SMILES-to-GPT pipeline could apply. If you're the kind of person who likes visualizing how all these molecular design strategies connect and branch out, tools like mapb2.io can help you map the relationships between different approaches in this rapidly expanding field.
The honest limitation? The dataset was curated from literature, which means the model inherits whatever biases exist in published research. Molecules that were tested and failed rarely get published, so the training data is skewed toward winners. The team acknowledges this, and it's a known challenge across ML-for-science (Lou et al., Adv. Funct. Mater., 2025).
Still, going from "let me try this molecule because it has an amine group and a hunch" to "let me ask a generative model to explore chemical space I never would have considered" is a genuine shift. The molecules aren't just effective - they're synthetically accessible, which means you can actually buy or make them without a Nobel laureate's lab.
Perovskite solar cells are already closing in on silicon's decades-long head start. With AI playing molecular matchmaker, the gap might close even faster than anyone expected.
References
- Fajar, A. T. N., Lambard, G., Manopo, J., et al. (2026). Generative AI-Driven Accelerated Discovery of Passivation Molecules for Perovskite Solar Cells. Advanced Science. DOI: 10.1002/advs.202523042. PMID: 41926641.
- Autonomous closed-loop framework for reproducible perovskite solar cells. (2026). Nature.
- Wu, F., et al. (2024). Inverse design workflow discovers hole-transport materials tailored for perovskite solar cells. Science. DOI: 10.1126/science.ads0901. PMID: 39666797.
- Machine-Learning-Assisted Screening of Interface Passivation Materials for Perovskite Solar Cells. (2023). ACS Energy Letters. DOI: 10.1021/acsenergylett.2c02818.
- Lou, Y., et al. (2025). Prediction and Fine Screening of Small Molecular Passivation Materials via Enhanced ML Workflow. Advanced Functional Materials. DOI: 10.1002/adfm.202511549.
Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.