Feng Yao

fengyao[AT]ucsd.edu

Hi! My name is Feng (姚峰 in Chinese).

I am a second-year Ph.D. student in Computer Science at UCSD, advised by Prof. Jingbo Shang and Prof. Vish Krishnan. Previously, I received my master’s degree from Tsinghua University, advised by Prof. Zhiyuan Liu and Prof. Weixing Shen.

My research interest generally lies at the intersection of Natural Language Processing and Deep Learning. Recently, I have been focusing on training Mixture-of-Experts (MoE) models and improving the efficiency of large-scale Reinforcement Learning.

Feel free to reach out if you would like to collaborate. :)

News

Aug 25, 2025

Invited by MiniMax and Sea AI Lab to give a talk on DenseMixer, TIS & FlashRL.

Aug 24, 2025

Invited by TsinghuaNLP and ModelBest to give a talk on TIS & FlashRL. [Slides]

Aug 17, 2025

Invited by QingkeAI to give a talk on TIS & FlashRL.

Aug 12, 2025

Invited by Kuaishou Klear Team to give a talk on DenseMixer, TIS & FlashRL.

Jul 14, 2025

Invited by Cohere Labs to give a talk on DenseMixer. [Slides]

Jul 02, 2025

Invited by the Qwen Team to give a talk on DenseMixer. [Slides]

Jun 30, 2025

Released DenseMixer for MoE post-training! Check out the blog, code, and X.

Selected Publications [Full]

  1. Preprint
    FlashRL: 8Bit Rollouts, Full Power RL
    Liyuan Liu*, Feng Yao*, Dinghuai Zhang, Chengyu Dong, Jingbo Shang, and Jianfeng Gao
    Preprint 2025
  2. Preprint
    Your Efficient RL Framework Secretly Brings You Off-Policy RL Training
    Feng Yao*, Liyuan Liu*, Dinghuai Zhang, Chengyu Dong, Jingbo Shang, and Jianfeng Gao
    Preprint 2025
  3. Preprint
    DenseMixer: Improving MoE Post-Training with Precise Router Gradients
    Feng Yao, Junxia Cui, Ruohan Zhang, Liyuan Liu, Shibo Hao, Li Zhang, Chengyu Dong, Shuohang Wang, Yelong Shen, Jianfeng Gao, and Jingbo Shang
    Preprint 2025
  4. Preprint
    Training Language Models to Generate Quality Code with Program Analysis Feedback
    Feng Yao*, Zilong Wang*, Liyuan Liu, Junxia Cui, Li Zhong, Xiaohan Fu, Haohui Mai, Vish Krishnan, Jianfeng Gao, and Jingbo Shang
    Preprint 2025
  5. Preprint
    Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
    Zhoujun Cheng*, Shibo Hao*, Tianyang Liu*, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, and others
    Preprint 2025
  6. EMNLP 2024
    Data Contamination Can Cross Language Barriers
    Feng Yao*, Yufan Zhuang*, Zihao Sun, Sunan Xu, Animesh Kumar, and Jingbo Shang
    EMNLP 2024
  7. Preprint
    Configurable Foundation Models: Building LLMs from a Modular Perspective
    Chaojun Xiao, Zhengyan Zhang, Chenyang Song, Dazhi Jiang, Feng Yao, Xu Han, Xiaozhi Wang, Shuo Wang, Yufei Huang, Guanyu Lin, Yingfa Chen, Weilin Zhao, Yuge Tu, Zexuan Zhong, Ao Zhang, Chenglei Si, Khai Hao Moo, Chenyang Zhao, Huimin Chen, Yankai Lin, Zhiyuan Liu, Jingbo Shang, and Maosong Sun
    Preprint 2024

Experience

Invited Talks

  • On the Rollout-Training Mismatch in Modern RL Systems
    • @ SeaAI Lab, September 03, 2025 [slides]
    • @ MiniMax, September 01, 2025 [slides]
    • @ QingkeAI, August 29, 2025 [slides]
    • @ TsinghuaNLP & ModelBest, August 27, 2025 [slides]
    • @ Kuaishou Klear Team, August 19, 2025 [slides]
  • DenseMixer: Improving MoE Post-Training with Precise Router Gradients