Feng Yao

fengyao[AT]ucsd.edu

Hi! My name is Feng (姚峰 in Chinese). I am a second-year Ph.D. student in Computer Science at UCSD, advised by Prof. Jingbo Shang and Prof. Vish Krishnan. Previously, I received my master’s degree from Tsinghua University, where I was advised by Prof. Zhiyuan Liu and Prof. Weixing Shen.

My research interests lie broadly in Natural Language Processing. Since summer 2024, I have been a student researcher at Microsoft Research & GenAI, working on Sparse MoE Pretraining with Dr. Liyuan Liu and Dr. Yelong Shen. Currently, I am interested in:

  • Developing modular and scalable LLMs (e.g., Mixture-of-Experts)
  • Understanding the mechanistic interpretability of LLMs
  • Building interdisciplinary (e.g., business, law) applications

Selected Publications [Full]

  1. Preprint
    Configurable Foundation Models: Building LLMs from a Modular Perspective
    Chaojun Xiao, Zhengyan Zhang, Chenyang Song, Dazhi Jiang, Feng Yao, Xu Han, Xiaozhi Wang, Shuo Wang, Yufei Huang, Guanyu Lin, Yingfa Chen, Weilin Zhao, Yuge Tu, Zexuan Zhong, Ao Zhang, Chenglei Si, Khai Hao Moo, Chenyang Zhao, Huimin Chen, Yankai Lin, Zhiyuan Liu, Jingbo Shang, and Maosong Sun
    Preprint
  2. EMNLP 2024
    Data Contamination Can Cross Language Barriers
    Feng Yao*, Yufan Zhuang*, Zihao Sun, Sunan Xu, Animesh Kumar, and Jingbo Shang
    EMNLP 2024
  3. ACL 2024
    Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs
    Shang Zhou*, Feng Yao*, Chengyu Dong, Zihan Wang, and Jingbo Shang
    Findings of ACL 2024
  4. ACL 2024
    Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph
    Xiaochen Gao*, Feng Yao*, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, and Jingbo Shang
    ACL 2024 (Oral)
  5. AAAI 2023
    Unsupervised Legal Evidence Retrieval via Contrastive Learning with Approximate Aggregated Positive
    Feng Yao, Jingyuan Zhang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Yun Liu, and Weixing Shen
    AAAI 2023 (Oral)
  6. ACL 2023
    The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation
    Hao Peng*, Xiaozhi Wang*, Feng Yao*, Kaisheng Zeng, Lei Hou, Juanzi Li, Zhiyuan Liu, and Weixing Shen
    Findings of ACL 2023
  7. CIKM 2023
    MUSER: A Multi-View Similar Case Retrieval Dataset
    Qingquan Li, Yiran Hu, Feng Yao, Chaojun Xiao, Zhiyuan Liu, Maosong Sun, and Weixing Shen
    CIKM 2023 (Best Resource Paper Honorable Mention)
  8. ACL 2022
    LEVEN: A Large-Scale Chinese Legal Event Detection Dataset
    Feng Yao*, Chaojun Xiao*, Xiaozhi Wang, Zhiyuan Liu, Lei Hou, Cunchao Tu, Juanzi Li, Yun Liu, Weixing Shen, and Maosong Sun
    Findings of ACL 2022