Feng Yao

Hi! My name is Feng (姚峰 in Chinese).
I am a second-year Ph.D. student in Computer Science at UCSD, advised by Prof. Jingbo Shang and Prof. Vish Krishnan. Previously, I received my master’s degree from Tsinghua University, advised by Prof. Zhiyuan Liu and Prof. Weixing Shen.
My research interests generally lie at the intersection of Natural Language Processing and Deep Learning. Recently, I have been focusing on training Mixture-of-Experts (MoE) models and improving the efficiency of large-scale Reinforcement Learning.
Feel free to reach out if you want to collaborate with me. :)
News
| Date | Update |
|---|---|
| Jul 14, 2024 | Invited by Cohere Labs to give a talk on DenseMixer. [Slides] |
| Jul 02, 2024 | Invited by the Qwen Team to give a talk on DenseMixer. [Slides] |
| Jun 30, 2024 | Released DenseMixer for MoE post-training! Check out the blog, code and X. |
Selected Publications [Full]
- Preprint
Experience
- Amazon Rufus Foundation Model Team | Jun 2025 – Sep 2025
Topic: Post-Training for LLM Agent
Hosts: Zheng Li, Xinyang Zhang, Changlong Yu, Shuowei Jin
- Microsoft Research & GenAI | Jun 2024 – Mar 2025
Topic: MoE Pretraining / Reinforcement Learning
Hosts: Liyuan Liu, Yelong Shen, Shuohang Wang
Talks
- DenseMixer: Improving MoE Post-Training with Precise Router Gradient.
- @ Cohere Labs, July 10, 2024 [slides] [recording]
- @ Alibaba Qwen Team, July 09, 2024 [slides]