doi.bio/zeyuan_allen-zhu


Zeyuan Allen-Zhu

Zeyuan Allen-Zhu is a research scientist at Meta, specialising in the physics of language models and AI.

Education

Zeyuan Allen-Zhu received his Doctor of Science in Computer Science from the Massachusetts Institute of Technology (MIT), where he was advised by Jon Kelner and Silvio Micali. He also holds a Master's degree in Computer Science and a Bachelor's degree in Mathematics and Physics, both from MIT, summa cum laude. As an undergraduate, he was awarded the Chi-Sun Yeh prize for his physics major.

Career

Zeyuan Allen-Zhu is currently an AI research scientist at Meta/FAIR Labs, a position he has held since 2022. Prior to this, he joined Microsoft Research Redmond as a senior researcher in 2017 and later became a principal researcher there. From 2015 to 2017, he was a postdoc at Princeton and IAS, hosted by Elad Hazan and Avi Wigderson.

Research

Zeyuan Allen-Zhu's research focuses on investigating the physics of language models and AI, designing experiments to uncover the fundamental principles governing how transformers/GPTs learn to perform various tasks. He aims to understand the intricate physical mechanisms behind large language models by probing the neurons of pre-trained transformers.

Previously, he worked on the mathematics of deep learning, developing theoretical proofs to explain the learnability of neural networks and certain phenomena observed in deep learning. He has also worked in machine learning, optimisation theory, and theoretical computer science.

Awards and Recognition

Zeyuan Allen-Zhu has received several awards and recognition for his work. His paper on ensemble/knowledge distillation received an award from ICLR'23.

Publications

Zeyuan Allen-Zhu has numerous publications.

YouTube Videos

YouTube Title: Theory of accelerated methods - Zeyuan Allen-Zhu

YouTube Channel Name: Institute for Advanced Study

YouTube Channel Link: https://www.youtube.com/@videosfromIAS

YouTube Title: Accelerated stochastic gradient ..first-order optimization - Zeyuan Allen-Zhu

YouTube Channel Name: Institute for Advanced Study

YouTube Channel Link: https://www.youtube.com/@videosfromIAS

YouTube Title: Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Three ICML 2016 Talks on Optimization

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: ICML 2017 Tutorial: Recent Advances in Stochastic Convex and Non-Convex Optimization

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Using Optimization to Solve Positive LPs Faster in Parallel

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: ICML 2017 Tutorial: Recent Advances in Stochastic Convex and Non-Convex Optimization (audio fixed)

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Linear Coupling of Gradient and Mirror Descent

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Optimal Black-Box Reductions Between Optimization Objectives

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Optimal Experimental Design via A New Regret Minimization Framework

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: First Efficient Convergence for Streaming k-PCA: a Global, Gap-Free, and Near-Optimal Rate

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: How to Swing By Saddle Points: Faster Non-Convex Optimization Than SGD

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Backward Feature Correction: How Deep Learning Performs Deep Learning (May 2020 by Yuanzhi Li)

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: 03 - Allen-Zhu - Linear Coupling of Gradient and Mirror Descent

YouTube Channel Name: ITCS Conference

YouTube Channel Link: https://www.youtube.com/@itcsconference6649

YouTube Title: Nearly-Linear Time Positive LP Solver with Faster Convergence Rate (STOC 2015)

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan

YouTube Title: Knightian Self Uncertainty in the VCG Mechanism for Unrestricted Combinatorial Auctions

YouTube Channel Name: Zeyuan Allen-Zhu

YouTube Channel Link: https://www.youtube.com/@zhuzeyuan