Last Update: December 20, 2024.
Many have asked about intern/collaboration possibilities (details). Short answer: no, unless you're from Meta and willing to work with me in your spare time (20+ hrs/week), or you're an early-year grad student from UCB/NYU/CMU/UW.
If you previously had my WeChat, please add my new account by replacing the '-' symbol in my WeChat ID with '_'.
Incredibly honored and humbled by the overwhelming response to my tutorial, and thank you to everyone who attended in person. Truly heartwarming to hear how much you enjoyed it. Many have been asking for a recording, and I prepared one with my own subtitles https://t.co/RjTm9ZHpId https://t.co/PFi2elHnsi pic.twitter.com/hBy1aPzIFU
— Zeyuan Allen-Zhu (@ZeyuanAllenZhu) July 25, 2024
My current research focuses on the physics of language models, and of AI in a broader sense. This involves designing experiments to elucidate the universal principles governing how LLMs learn to accomplish diverse AI tasks. By probing individual neurons, one can uncover intricate (and sometimes surprising!) mechanisms behind how these models function. The ultimate goal is to provide theoretical guidance and practical suggestions for how we can ultimately achieve AGI. This line of work was featured in my ICML 2024 tutorial.
Before that, I worked on the mathematics of deep learning: developing rigorous theoretical proofs of the learnability of neural networks, in idealized, theory-friendly settings, to explain certain mysterious phenomena observed in deep learning. In this area, our paper on ensemble / knowledge distillation received an award at ICLR'23, although I am most proud of our COLT'23 result that provably shows why deep learning is actually deep: it outperforms shallow learners such as layer-wise training, kernel methods, etc.
In my past life, I also worked in machine learning, optimization theory, and theoretical computer science.
In algorithm competitions, I was fortunate to win a few awards, including two IOI gold medals, a USACO world-champion title, an ACM/ICPC World Finals gold medal, a Google Code Jam world runner-up finish, and a USA MCM top prize.
In research, I was previously supported by a Microsoft Young Fellow Award, a Simons Student Award, and a Microsoft Azure Research Award.
For a full list, click here.