Dongyang Fan
Hi! Thanks for stopping by :) I'm a passionate 4th-year PhD student in the Machine Learning and Optimization Lab at EPFL, supervised by Prof. Martin Jaggi. My name is pronounced Don-Young.
My research interests include:
- Data-Efficient Language Modeling
- Mixture-of-Experts architectures
- Decentralized training methods
- Accelerating LLM pretraining through metadata conditioning
- Responsible Language Modeling
- Data-compliant pretraining by respecting owners’ opt-out choices
- Designing compensation frameworks for data contributors
- Understanding and mitigating model hallucinations
I am also happy to branch out beyond these topics. If you would like to reach out, do not hesitate to drop me an email!
Email /
Google Scholar /
Semantic Scholar /
Twitter /
Github /
LinkedIn
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
Apertus team (as a member of the pretraining team)
preprint, 2025
arXiv
TiMoE: Time-Aware Mixture of Language Experts
Robin Faro*,
Dongyang Fan*,
Tamar Alphaidze,
Martin Jaggi
XTempLLMs workshop @ COLM (oral 🏆), 2025
arXiv
/
code
/
slides
URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training
Dongyang Fan,
Vinko Sabolčec,
Martin Jaggi
Conference on Neural Information Processing Systems (NeurIPS), 2025
arXiv
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan,
Vinko Sabolčec,
Matin Ansaripour,
Ayush Kumar Tarun,
Martin Jaggi,
Antoine Bosselut,
Imanol Schlag
Conference on Language Modeling (COLM, oral 🏆), 2025
arXiv
/
code
/
project page
/
slides
Do Data Valuations Make Good Data Prices?
Dongyang Fan,
Tyler J. Rotello,
Sai Praneeth Karimireddy
ICLR Workshop Data Problems, 2025
arXiv
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan*,
Bettina Messmer*,
Nikita Doikov,
Martin Jaggi
International Conference on Machine Learning (ICML), 2025
code
/
arXiv
Towards an empirical understanding of MoE design choices
Dongyang Fan*,
Bettina Messmer*,
Martin Jaggi
ICLR ME-FoMo Workshop, 2024
arXiv
Personalized Collaborative Fine-Tuning for On-Device Large Language Models
Nicolas Wagner,
Dongyang Fan,
Martin Jaggi
Conference on Language Modeling (COLM), 2024
code
/
arXiv
Ghost Noise for Regularizing Deep Neural Networks
Atli Kosson,
Dongyang Fan,
Martin Jaggi
Association for the Advancement of Artificial Intelligence (AAAI), 2024
arXiv
Collaborative Learning via Prediction Consensus
Dongyang Fan,
Celestine Mendler-Dünner,
Martin Jaggi
Conference on Neural Information Processing Systems (NeurIPS), 2023
code
/
arXiv
/
poster
Academic Service
-
Reviewer for NeurIPS 2025 (Top Reviewer ⭐️) and 2024; ICLR 2025 and 2023; COLM 2025; and multiple workshops.
-
Supervision of student projects: projects I have supervised have resulted in a COLM paper and an oral presentation at a COLM workshop.
Miscellaneous
In general, I enjoy art and culture. I am also an outdoorsy person who likes hiking, skiing, and sailing.
I paint scenes from my hiking trips. For example...
The source code of this website is from here.