ICML 2026 Workshop

3rd AI for Math Workshop
Toward Self-Evolving Scientific Agents

July 11, 2026 · 8:00 a.m. to 5:00 p.m. Room HALL D1, COEX Convention & Exhibition Center, Seoul, South Korea

News & Announcements

  • May 26, 2026 Paper submission deadline has passed. We received 346 submissions (double last year's volume). Review process is now underway.
  • May 2, 2026 Received official confirmation from ICML 2026, our workshop will be held on July 11, 2026.
  • May 1, 2026 Challenge tracks are now live. Join the active Codabench competitions and start submitting your results.
  • April 3, 2026 We are recruiting reviewers. Fill out the reviewer application form!
  • April 1, 2026 Paper submission portal opens on OpenReview.
  • March 31, 2026 We are actively seeking sponsorship. See more.
  • March 21, 2026 Our workshop has been accepted at ICML 2026! See you in July.

About the Workshop

Mathematics has long served as a foundation for scientific discovery and a benchmark for reasoning systems. Recent advances in LLMs and formal methods have enabled AI agents to achieve IMO-level performance in theorem proving and demonstrate strong capabilities in end-to-end natural language mathematical reasoning.

Why This Workshop

We focus on the next generation of automated research agents that reason across mathematics and adjacent scientific domains.

Core Goal

Our central question is how such systems can achieve self-evolution while remaining verifiable, reliable, and scientifically useful.

We invite contributions from academia and industry across the following themes:

  • Formal theorem proving: How can LLM theorem provers transcend Olympiad questions to support real-world mathematics research and teaching, and self-evolve to propose and solve innovative conjectures?
  • Precise autoformalization: How to close the gap between formal and informal mathematical reasoning? How can natural language mathematics be reliably translated into formal languages?
  • Automated mathematics in natural language: How to achieve frontier mathematical reasoning performances with a pure natural language pipeline, including data, generation, and verification?
  • Scientific problem solving: How do the advances of mathematical reasoning benefit broader scientific fields, e.g., theoretical computer science and physics?
  • Multimodal reasoning: How do current reasoning systems use visual information? How can we develop methods to tackle problems in multimodal mathematical and scientific reasoning?
Also welcome:

verification and measurement, human-AI collaboration, and scientific agents for systems science, causality, finance, bioinformatics, and related areas.

Key Dates

  • Challenge Open May 1, 2026
  • Challenge Deadline June 15, 2026
  • Paper Submission Open April 1, 2026
  • Paper Submission Deadline May 25, 2026
  • Paper Decision Notification June 15, 2026
  • Camera-ready Deadline June 25, 2026
  • Workshop Date July 11, 2026

Workshop History

Invited Speakers & Panelists

Jia Li

Jia Li

Project Numina

Jia Li is the Founder of Project Numina, an open-source organization dedicated to advancing AI for mathematics. He previously served as a researcher at Mistral AI and was the Founder and CTO of Cardiologs, an AI-driven medical diagnostics company founded in 2014 and acquired by Philips in 2022. At Project Numina, he leads the development of large-scale language models for mathematical reasoning and formal theorem proving, including Kimina-Prover, the first large language model designed specifically for formal reasoning in proof assistant environments. Numina's models won the first AIMO Progress Prize awarded by the AIMO Committee. His work focuses on building open, verifiable AI systems for formal mathematics and scientific discovery.

Dawn Song

Dawn Song

UC Berkeley

Dawn Song is a Professor in Computer Science at UC Berkeley and Co-Director of Berkeley Center for Responsible Decentralized Intelligence. Her research interests lie in AI safety and security, Agentic AI, deep learning, and security and privacy. She is the recipient of the MacArthur Fellowship and the Guggenheim Fellowship, and has been recognized as Most Influential Scholar in computer security. She is an ACM Fellow and IEEE Fellow.

Leonardo de Moura

Leonardo de Moura

AWS / Lean FRO

Leonardo is a Senior Principal Applied Scientist in the Automated Reasoning Group at AWS and the Chief Architect of the Lean FRO, which he co-founded with Sebastian Ullrich. Before AWS, he spent 17 years as a Senior Principal Researcher in the RiSE group at Microsoft Research, and earlier was a Computer Scientist at SRI International. His work spans automated reasoning, theorem proving, decision procedures, and SAT and SMT, and he is the main architect of Lean, Z3, Yices 1.0, and SAL. His contributions have been recognized with the CAV, Haifa, and Herbrand awards and the ACM Programming Languages Software Award.

Emily First

Emily First

Rutgers University

Emily First is an Assistant Professor in Computer Science at Rutgers University. Her research focuses on AI for software engineering and programming languages, particularly leveraging AI for theorem proving. She works on creating tools to automatically generate proofs and lemmas in proof assistant languages such as Rocq, Isabelle/HOL, and Lean, and designs effective interfaces for human-AI interaction. She completed her PhD at UMass Amherst and was a postdoctoral researcher at UC San Diego.

Zheng Yuan

Zheng Yuan

ByteDance Seed

Zheng Yuan is a researcher in ByteDance Seed. He leads the development of Seed Prover, which proves 5/6 problems on IMO 2025. His research interest is training long-horizon agents for formal and informal mathematics. He earned his Ph.D. degree from Tsinghua University.

Yinya Huang

Yinya Huang

ETH AI Center, ETH Zurich

Yinya Huang is a Postdoc Fellow at the ETH AI Center, ETH Zurich. She received her Ph.D. in Computer Science from Sun Yat-sen University. Her research focuses on AI for mathematics and science, with emphasis on mathematical reasoning, formal theorem proving with LLMs, and reliable machine learning. She has published in top venues including TPAMI, NeurIPS, ICLR, ACL, and EMNLP, and is a recipient of the 2025 ETH Career Seed Award and the 2024 ACM SIGCSE China Excellent Doctoral Dissertation Award.

Sorrachai Yingchareonthawornchai

Sorrachai Yingchareonthawornchai

ETH Zurich

Sorrachai Yingchareonthawornchai is a Junior Fellow at the Institute for Theoretical Studies, ETH Zurich and Tech Lead of CSLib. His research interests lie in the design of fast graph algorithms, combinatorial optimization, and combinatorics of computation. He earned his Ph.D. in Computer Science from Aalto University in 2023, and his doctoral work was recognized with the Doctoral Thesis Award 2023 from Aalto University's School of Science. Previously, he was a postdoctoral researcher at the Hebrew University of Jerusalem and a Simons-Berkeley postdoctoral fellow at UC Berkeley.

Siyuan Guo

Siyuan Guo

Prior Labs

Siyuan Guo is an AI scientist at Prior Labs. She earned her PhD jointly from the University of Cambridge and the Max Planck Institute for Intelligent Systems. Her research focuses on the physics of learning, causal inference, and AI for scientific discovery. She has published at top-tier venues with 1 oral and 2 spotlight presentations. Her research has been recognized with honors including MIT EECS Rising Star and the G-Research PhD Prize.

Program Schedule

July 11, 2026  ·  8:00 a.m. to 5:00 p.m.  ·  Room HALL D1

Morning
Opening 8:00 – 8:15
Opening Remarks Organizing Committee & Dawn Song
Invited Talk 8:15 – 8:45
Jia Li Project Numina Title to be announced
Invited Talk 8:50 – 9:20
Leonardo de Moura AWS / Lean FRO Virtual · Title to be announced
Invited Talk 9:25 – 9:55
Siyuan Guo Prior Labs Title to be announced
Break 9:55 – 10:10
Coffee Break
Awards 10:10 – 10:25
Award Ceremony & Sponsor Acknowledgements
Oral 10:25 – 11:05
Oral Presentations 2 × 20 min
Posters 11:05 – 12:20
Poster Session
Break 12:20 – 13:00
Lunch Break
Afternoon
Challenge 13:00 – 14:00
Challenge Presentations 4 tracks · 15 min each
Invited Talk 14:00 – 14:30
Sorrachai Yingchareonthawornchai ETH Zurich Title to be announced
Invited Talk 14:35 – 15:05
Yinya Huang ETH AI Center Title to be announced
Break 15:05 – 15:20
Coffee Break
Invited Talk 15:20 – 15:50
Emily First Rutgers University Virtual · Title to be announced
Invited Talk 15:55 – 16:25
Zheng Yuan ByteDance Seed Title to be announced
Panel 16:25 – 16:55
Panel Discussion
Closing 16:55 – 17:00
Closing Remarks

This program is a tentative plan and subject to change. All talks will be delivered in person and live-streamed via Zoom.

Call for Papers

🏆 2× Best Paper Award ($1,500) · 🎖️ 3× Honorable Mention ($1,000)

Topics

We invite submissions on, but not limited to, the following topics:

Formal Theorem Proving

LLM-based theorem provers, tactic generation, proof search, interactive proof assistants (Lean, Rocq, Isabelle).

Autoformalization

Translating natural language mathematics into formal languages, semantic alignment, and faithfulness verification.

NL Mathematical Reasoning

Large language models for math, chain-of-thought reasoning, synthetic data generation, and self-improvement.

Scientific Problem Solving

Transfer of math AI techniques to TCS, physics, and other scientific domains; formal methods in science.

Multimodal Reasoning

Vision-language models for mathematical and scientific reasoning; diagrams, figures, and geometry.

Verification & Evaluation

Benchmarks, metrics, and methods for verifying AI-generated mathematical solutions and proofs.

Human-AI Collaboration

Effective methods for scientific human-AI collaboration, interactive tools, and AI-assisted discovery.

Scientific Agents

AI agents for systems science, causality, finance, bioinformatics, and adjacent areas.

Submission Details

  • Papers should be 2–8 pages (excluding references and supplementary materials)
  • All submissions must follow the ICML 2026 format
  • Review process via OpenReview, double-blind
  • This workshop is non-archival, submissions to or accepted at other venues are welcome
  • Accepted papers will be presented as posters; select papers invited for contributed talks

Important Dates (deadlines in AOE)

  • Paper Submission Open April 1, 2026
  • Paper Submission Deadline May 25, 2026
  • Decision Notification June 15, 2026
  • Camera-ready Deadline June 25, 2026

Best Paper Award

We will present a Best Paper Award and Best Paper Runner-Up at the workshop.

Past winners:

Challenge Tracks

Four competitive tracks on Codabench. Challenge period: May 1 – June 15, 2026.

🥇 $1,000 · 🥈 $600 · 🥉 $400 · for winners of each track
Track 1

Semantic Alignment Evaluation for Autoformalization

Evaluate whether autoformalization preserves mathematical intent, using the FormalRx-Test dataset built on the SCI Error Taxonomy (28 error categories).

  1. Verdict: Is the formalization semantically aligned?
  2. Categorization: Identify the error type
  3. Localization: Find the erroneous code segment
  4. Correction: Provide a corrected statement

Metric: Macro F1 (Tasks 1–2); exact match + LLM-as-judge (Tasks 3–4)

Track 2

Theoretical Computer Science Proving in Lean

Formal theorem proving in TCS using CSLib, an open-source Lean 4 library covering problems that go beyond Olympiad-style competition math.

  • Graph theory & combinatorics
  • Computational complexity
  • Algorithm correctness

Cheating prevention is enforced via leanprover/comparator.

Metric: Pass Rate (fraction of verified proofs)

Track 3

Visual Grounded Physics Problem Solving

Physics reasoning with genuine visual dependency using SeePhys Pro — 1,000 problems from middle school to research level, including real hand-drawn images.

  • Given a problem + diagram, derive the correct solution
  • Pure text-based reasoning will fail — visual perception is required

Metric: Exact Match + LLM-as-Judge

Track 4

End-to-End Autoformalization and Proof Generation in Lean 4

Full autoformalization pipeline using the ShadowBench — 126 problems across 8 mathematical areas, tiered L2/L3/L4.

  1. Given a Natural Language informal proof, generate the formal proof in Lean 4.

Metric:SA (Semantic Alignment) pass: Rate verified using the hidden theorems that are implied by the target theorem.

We strongly encourage participants to accompany their submission with a short technical report. Please follow the ICML 2026 paper format.

Committee

Workshop Organizers

Haocheng Wang

Haocheng Wang Contact

PhD Student

HKUST(GZ) / ETH Zurich

Kun Xiang

Kun Xiang Contact

PhD Student

Sun Yat-sen University / Alibaba

Zhizhen Qin

Zhizhen Qin

Applied Scientist

Amazon Web Services

Leni Aniva

Leni Aniva

PhD Student

Stanford University

Donghoon Hyeon

Donghoon Hyeon

Professor

Seoul National University

Xiaodan Liang

Xiaodan Liang

Professor

MBZUAI / Sun Yat-sen University

Bartosz Piotrowski

Bartosz Piotrowski

AI Researcher

Axiom Math

Zhijiang Guo

Zhijiang Guo

Assistant Professor

HKUST(GZ) / HKUST

Workshop Contributors

Rasmus Kyng

Rasmus Kyng

Associate Professor

ETH Zurich

Steve Liu

Steve Liu

Professor

MBZUAI

Marina Vinyes

Marina Vinyes

Team Lead

Project Numina

Soonho Kong

Soonho Kong

Principal Applied Scientist

Amazon Web Services

Zaiwen Wen

Zaiwen Wen

Professor

BICMR / Peking University

Tao Luo

Tao Luo

Associate Professor

SJTU

Junxian He

Junxian He

Assistant Professor

HKUST

Yilun Zhou

Yilun Zhou

Senior Research Scientist

Salesforce

Huajian Xin

Huajian Xin

PhD Student

University of Edinburgh / ByteDance

Zhicheng Yang

Zhicheng Yang

PhD Student

HKUST(GZ) / ByteDance

Weihao Zeng

Weihao Zeng

PhD Student

HKUST / Kimi

Yuzhen Huang

Yuzhen Huang

PhD Student

HKUST / DeepSeek

Huilin Zhen

Huilin Zhen

Researcher

Huawei

Mingxuan Yuan

Mingxuan Yuan

Researcher

Huawei

Xiaoyang Liu

Xiaoyang Liu

PhD Student

SJTU

Baiyu Huang

Baiyu Huang

MPhil Student

HKUST(GZ)

Bokai Zhou

Bokai Zhou

Student

Sun Yat-sen University

Competition Chairs

For any questions regarding the challenge tracks, please feel free to contact our competition chairs.

Soonho Kong

Soonho Kong

Principal Applied Scientist

Amazon Web Services

Haocheng Wang

Haocheng Wang

PhD Student

HKUST(GZ) / ETH Zurich

Sorrachai Yingchareonthawornchai

Sorrachai Yingchareonthawornchai

Junior Fellow

ETH Zurich

Kun Xiang

Kun Xiang

PhD Student

Sun Yat-sen University / Alibaba

Donghoon Hyeon

Donghoon Hyeon

Professor

Seoul National University

Sponsors & Partners

We are actively seeking sponsors to support our workshop. View our sponsorship prospectus or contact us if you are interested in sponsoring.

Sponsors

Amazon Web Services ByteDance Harmonic Huawei Pramaan Labs Snorkel AI

Partners

Axiom Math DeepSeek The University of Edinburgh ETH Zurich HKUST HKUST Guangzhou MBZUAI Peking University Prior Labs Project Numina Salesforce Seoul National University Shanghai Jiao Tong University Stanford University Sun Yat-sen University