Feng Yao

fengyao[AT]ucsd.edu

Hi! My name is Feng (姚峰 in Chinese).

I am a second-year Ph.D. student in Computer Science at UCSD, advised by Prof. Jingbo Shang and Prof. Vish Krishnan. Previously, I received my master’s degree from Tsinghua University, advised by Prof. Zhiyuan Liu and Prof. Weixing Shen.

My research interest generally lies at the intersection of Natural Language Processing and Deep Learning. Recently, I have been focusing on training Mixture-of-Experts (MoE) models and improving the efficiency of large-scale Reinforcement Learning.

Feel free to reach out if you would like to collaborate. :)

News

Aug 25, 2025

Invited by MiniMax and Sea AI Lab to give a talk on DenseMixer, TIS & FlashRL.

Aug 24, 2025

Invited by TsinghuaNLP and ModelBest to give a talk on TIS & FlashRL. [Slides]

Aug 17, 2025

Invited by QingkeAI to give a talk on TIS & FlashRL.

Aug 12, 2025

Invited by Kuaishou Klear Team to give a talk on DenseMixer, TIS & FlashRL.

Jul 14, 2025

Invited by Cohere Labs to give a talk on DenseMixer. [Slides]

Jul 02, 2025

Invited by the Qwen Team to give a talk on DenseMixer. [Slides]

Jun 30, 2025

Released DenseMixer for MoE post-training! Check out the blog, code, and X.

Selected Publications [Full]

  1. Preprint
    FlashRL: 8Bit Rollouts, Full Power RL
    Liyuan Liu*, Feng Yao*, Dinghuai Zhang, Chengyu Dong, Jingbo Shang, and Jianfeng Gao
    Preprint 2025
  2. Preprint
    Your Efficient RL Framework Secretly Brings You Off-Policy RL Training
    Feng Yao*, Liyuan Liu*, Dinghuai Zhang, Chengyu Dong, Jingbo Shang, and Jianfeng Gao
    Preprint 2025
  3. Preprint
    DenseMixer: Improving MoE Post-Training with Precise Router Gradients
    Feng Yao, Junxia Cui, Ruohan Zhang, Liyuan Liu, Shibo Hao, Li Zhang, Chengyu Dong, Shuohang Wang, Yelong Shen, Jianfeng Gao, and Jingbo Shang
    Preprint 2025
  4. Preprint
    Training Language Models to Generate Quality Code with Program Analysis Feedback
    Feng Yao*, Zilong Wang*, Liyuan Liu, Junxia Cui, Li Zhong, Xiaohan Fu, Haohui Mai, Vish Krishnan, Jianfeng Gao, and Jingbo Shang
    Preprint 2025
  5. Preprint
    Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
    Zhoujun Cheng*, Shibo Hao*, Tianyang Liu*, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, and others
    Preprint 2025
  6. EMNLP 2024
    Data Contamination Can Cross Language Barriers
    Feng Yao*, Yufan Zhuang*, Zihao Sun, Sunan Xu, Animesh Kumar, and Jingbo Shang
    EMNLP 2024
  7. Preprint
    Configurable Foundation Models: Building LLMs from a Modular Perspective
    Chaojun Xiao, Zhengyan Zhang, Chenyang Song, Dazhi Jiang, Feng Yao, Xu Han, Xiaozhi Wang, Shuo Wang, Yufei Huang, Guanyu Lin, Yingfa Chen, Weilin Zhao, Yuge Tu, Zexuan Zhong, Ao Zhang, Chenglei Si, Khai Hao Moo, Chenyang Zhao, Huimin Chen, Yankai Lin, Zhiyuan Liu, Jingbo Shang, and Maosong Sun
    Preprint 2024

Experience

Invited Talks

  • On the Rollout-Training Mismatch in Modern RL Systems
    • @ SeaAI Lab, September 03, 2025 [slides]
    • @ MiniMax, September 01, 2025 [slides]
    • @ QingkeAI, August 29, 2025 [slides]
    • @ TsinghuaNLP & ModelBest, August 27, 2025 [slides]
    • @ Kuaishou Klear Team, August 19, 2025 [slides]
  • DenseMixer: Improving MoE Post-Training with Precise Router Gradients