Ariane Mora - Starting Principal Investigator
Ariane Mora = Impact-Driven + Climber + Australian + Open-Source Enthusiast + Collaborative
Ariane Mora is a computational biologist and machine learning researcher specializing in the development of AI-driven methods for enzyme engineering. Her interdisciplinary career integrates synthetic biology, open-source software development, and high-throughput experimentation to uncover and reprogram biological function across diverse systems. As a Schmidt Science Fellow at the California Institute of Technology, her research focused on harnessing machine learning to accelerate enzyme discovery and design for bioremediation. As a PI at AITHYRA, she combines her PhD work using unsupervised models to understand development and disease with her postdoctoral work in enzyme engineering to build an interdisciplinary research group at the interface of ML and biology.
Dr. Mora completed her Ph.D. at The University of Queensland (UQ), Australia, in 2023. Her thesis developed machine learning frameworks to model the dynamics of chromatin state and cell fate transitions, including a research collaboration with the MRC Cancer Unit at the University of Cambridge. She holds a First-Class Honours degree in Electrical and Computer Engineering (B.E., UQ) and completed an exchange at the National University of Singapore. Her training bridges molecular biology, AI, and software engineering.
ORCID: https://orcid.org/0000-0003-1331-8192
LinkedIn: https://www.linkedin.com/in/ariane-mora-786894b3/
GitHub: https://github.com/ArianeMora
AI Driven Enzyme Discovery
The research group of Ariane Mora aims to understand how enzymes function and thus how to design and engineer them for maximum health and environmental impact. Specifically: 1) Can we predict which enzymes will interact with a new drug or chemical? 2) Can we use this knowledge to accelerate the design of new antibiotics? While motivated by these biomedical questions, the team will solve problems through the creation of large datasets, open-source software, ML models, AI agents, and robotics feedback loops.
Functional classification: To build reliable predictive models of enzyme function, systematic, reproducible, and high-quality experimental data are needed. The initial focus will be to extend functional knowledge about natural enzymes, starting with quantifying the promiscuity of enzymes in E. coli. While manually reproducing known reactions (accounting for cofactors, buffers, and other conditions) would be prohibitively laborious, the Mora group will automate the process by combining literature-mining agents, targeted analytical methods, and robotic systems for reaction setups. This approach allows the group to test a wide range of enzymes across many substrates and conditions, generating a dataset that includes both positive and negative results. Such balanced, high-quality experimental data are currently missing from most databases, yet they are critical for training accurate ML models. Once established, this approach can be scaled across systems or adapted for new-to-nature chemistries.
Mechanism elucidation: Building from broad functional classification, the Mora group will extend enzyme ML models to mechanistic prediction. They will begin by collecting data on enzyme classes that rely on well-defined catalytic residues, e.g., where a catalytic dyad or triad residue is essential for activity. By developing a robotics setup, they will systematically mutate predicted catalytic residues and test for loss of function across both divergent natural sequences and de novo designs, and hence validate mechanistic roles at scale. This platform will be expanded to increasingly complex reaction types in collaboration with biochemists to target chemically diverse mechanisms. With predicted structures and high-confidence residue-level functional data, they can build new generative models that are sensitive to single amino-acid mutations. This work will provide new perspectives on the intricate relationships among amino acid residues within a protein scaffold, enabling scientists to improve directed evolution strategies and the design of de novo enzymes.
Automated optimization: Directed evolution is a powerful engineering strategy for optimizing enzymes; however, it is laborious, taking months to years. The group of Ariane Mora will automate this process by using AI agents to autonomously conduct directed evolution campaigns. In each iteration, agents will integrate tools via APIs, including structure prediction, literature mining, functional annotation, and mechanistic prediction, to design the next generation of variants. By leveraging these capabilities, agents will not only generate protein designs but also simulate and assess their system-level consequences. Over time, these agents will evolve from optimizing single objectives (e.g., catalytic activity) to handling multi-objective goals (e.g., stability, selectivity, and scalability), allowing directed evolution to address increasingly complex biological contexts, from human health to environmental systems.
Publication Highlights
Squidly (under review, eLife 2025): Developed an ML approach to predict catalytic residues in enzyme sequences.
Rieger, W. J., Boden, M., Arnold, F. H., & Mora, A. (2025). Squidly: Enzyme Catalytic Residue Prediction Harnessing a Biology-Informed Contrastive Learning Framework. bioRxiv, 2025–06. https://doi.org/10.1101/2025.06.13.659624
LevSeq (ACS Synthetic Biology, 2025): Introduced a high-throughput method for generating sequence-function data for ML in enzyme evolution.
Long, Y.*; Mora, A.*; Li, F.-Z.; Gürsoy, E.; Johnston, K. E.; Arnold, F. H. LevSeq: Rapid Generation of Sequence-Function Data for Directed Evolution and Machine Learning. ACS Synth. Biol. 2025, 14 (1), 230–238. https://doi.org/10.1021/acssynbio.4c00625.
SiRCle (Genome Medicine, 2024): Developed a clustering model to infer phenotype-regulatory programs in renal cancer.
Mora, A.*; Schmidt, C.*; Balderson, B.; Frezza, C.; Bodén, M. SiRCle (Signature Regulatory Clustering) Model Integration Reveals Mechanisms of Phenotype Regulation in Renal Cancer. Genome Med. 2024, 16 (1), 144. https://doi.org/10.1186/s13073-024-01415-3.