Last Update: May 23, 2023.
(I've recently moved my homepage to this location, so please bear with me as I work out any bugs or issues.)
My current research focuses on the physics of language models and, more broadly, AI. This involves designing experiments to elucidate the fundamental principles governing how transformers/GPTs learn to accomplish diverse AI tasks. By probing the neurons of pre-trained transformers, I aim to uncover and understand the intricate (and sometimes surprising!) physical mechanisms behind large language models.
Before that, I worked on the mathematics of deep learning. This involved developing rigorous theoretical proofs of the learnability of neural networks, in idealized, theory-friendly settings, to explain certain mysterious phenomena observed in deep learning.
In my past life, I also worked on machine learning, optimization theory, and theoretical computer science.
In algorithm competitions, I was fortunate to win a few awards, including two IOI gold medals, a USACO world championship, an ACM/ICPC World Finals gold medal, a Google Code Jam world runner-up finish, and a USA MCM Top Prize.
In research, I have been supported by a Microsoft Young Fellow Award, a Simons Student Award, and a Microsoft Azure Research Award.
For a full list, click here.