Stable Asynchrony: Variance-Controlled Off-Policy RL for LLMs
Luke J. Huang, Zhuoyang Zhang, Qinghao Hu, Shang Yang, Song Han
lukh23@mit.edu Physics + CS @ MIT
Hey! I'm Luke. I'm a Physics + CS student at MIT and a Member of Technical Staff Resident at OpenAI. My work focuses on reasoning, efficient AI computing, reinforcement learning for LLMs, and fast generative models.
Zhuoyang Zhang*, Luke J. Huang*, Chengyue Wu, Shang Yang, Kelly Peng, Yao Lu, Song Han
Zhuoyang Zhang, Shang Yang, Qinghao Hu, Luke J. Huang, Jianing Hou, Yifu Sun, Yao Lu, Song Han
Shiekh Zia Uddin*, Sachin Vaidya*, Sarthak Choudhary, Zhuo Chen, Ronald K. Salib, Luke J. Huang, Dirk R. Englund, Marin A. Soljacic
Member of Technical Staff Resident, San Francisco
Working on pretraining.
Research Scientist Intern, San Francisco
Built large-scale RL training infrastructure for out-of-distribution reasoning tasks, including an asynchronous RL framework and a Torchtitan/FSDP backend for 100B+ MoE models.
Researcher, Cambridge
Researching robust asynchronous RL, efficient autoregressive image generation, and low-latency generative models for VLA systems.
Physics + CS, 5.0/5.0 GPA
Distributed Systems, Mathematical Statistics, Deep Learning, Information Theory, Inference and Information, Design and Analysis of Algorithms, Algebra I, Quantum Physics III.
Putnam Top 200, 2024 U.S. IPhO Team and Gold Medalist, Regeneron STS Finalist, Math Olympiad Program attendee, USA(J)MO winner, and RSI Scholar.