Feng Yao

fengyao[AT]ucsd.edu

Hi! My name is Feng (姚峰 in Chinese).

I am a second-year Ph.D. student in Computer Science at UCSD, advised by Prof. Jingbo Shang and Prof. Vish Krishnan. Previously, I received my master’s degree from Tsinghua University, advised by Prof. Zhiyuan Liu and Prof. Weixing Shen.

My research interests generally lie at the intersection of Natural Language Processing and Deep Learning. Recently, I have been focusing on training Mixture-of-Experts (MoE) models and improving the efficiency of large-scale Reinforcement Learning.

Feel free to reach out if you want to collaborate with me. :)

News

Jul 14, 2024

Invited by Cohere Labs to give a talk on DenseMixer. [Slides]

Jul 02, 2024

Invited by the Qwen Team to give a talk on DenseMixer. [Slides]

Jun 30, 2024

Released DenseMixer for MoE post-training! Check out the blog, code, and X post.

Selected Publications [Full]

  1. Preprint
    DenseMixer: Improving MoE Post-Training with Precise Router Gradients
    Feng Yao, Junxia Cui, Ruohan Zhang, Liyuan Liu, Shibo Hao, Li Zhang, Chengyu Dong, Shuohang Wang, Yelong Shen, Jianfeng Gao, and Jingbo Shang
    Preprint
  2. Preprint
    Training Language Models to Generate Quality Code with Program Analysis Feedback
    Feng Yao*, Zilong Wang*, Liyuan Liu, Junxia Cui, Li Zhong, Xiaohan Fu, Haohui Mai, Vish Krishnan, Jianfeng Gao, and Jingbo Shang
    Preprint
  3. Preprint
    Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
    Zhoujun Cheng*, Shibo Hao*, Tianyang Liu*, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, and others
    Preprint
  4. EMNLP 2024
    Data Contamination Can Cross Language Barriers
    Feng Yao*, Yufan Zhuang*, Zihao Sun, Sunan Xu, Animesh Kumar, and Jingbo Shang
    EMNLP 2024
  5. ACL 2024
    Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs
    Shang Zhou*, Feng Yao*, Chengyu Dong, Zihan Wang, and Jingbo Shang
    Findings of ACL 2024
  6. Preprint
    Configurable Foundation Models: Building LLMs from a Modular Perspective
    Chaojun Xiao, Zhengyan Zhang, Chenyang Song, Dazhi Jiang, Feng Yao, Xu Han, Xiaozhi Wang, Shuo Wang, Yufei Huang, Guanyu Lin, Yingfa Chen, Weilin Zhao, Yuge Tu, Zexuan Zhong, Ao Zhang, Chenglei Si, Khai Hao Moo, Chenyang Zhao, Huimin Chen, Yankai Lin, Zhiyuan Liu, Jingbo Shang, and Maosong Sun
    Preprint
  7. CIKM 2023
    MUSER: A Multi-View Similar Case Retrieval Dataset
    Qingquan Li, Yiran Hu, Feng Yao, Chaojun Xiao, Zhiyuan Liu, Maosong Sun, and Weixing Shen
    CIKM 2023 (Best Resource Paper Honorable Mention)

Experience

Talks

  • DenseMixer: Improving MoE Post-Training with Precise Router Gradients.