|
Dongyang Fan
Hi! Thanks for stopping by :) I'm a passionate 4th-year PhD student at the Machine Learning and Optimization Lab at EPFL, supervised by Prof. Martin Jaggi. My name is pronounced Don-Young.
My research interests include:
- Data-Efficient Language Modeling
  - Mixture-of-Experts architectures
  - Decentralized training methods
  - Accelerating LLM pretraining through metadata conditioning
- Responsible Language Modeling
  - Data-compliant pretraining that respects owners’ opt-out choices
  - Designing compensation frameworks for data contributors
  - Understanding and mitigating model hallucinations
I am also happy to branch out beyond these topics. If you want to reach out, do not hesitate to drop me an email!
Email /
Google Scholar /
Semantic Scholar /
Twitter /
Github /
LinkedIn
|
|
Research
Note: WS denotes Workshop
|
|
preprint 2026
|
HalluHard: A Hard Multi-Turn Hallucination Benchmark
Dongyang Fan*,
Sebastien Delsad*,
Nicolas Flammarion,
Maksym Andriushchenko
arXiv
/
code
/
website
Hallucination
Benchmark
LLM Evaluation
|
|
ICLR 2026
|
Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining
Dongyang Fan*,
Diba Hashemi*,
Sai Praneeth Karimireddy,
Martin Jaggi
arXiv
LLM Pretraining
Metadata
Data Efficiency
|
|
preprint 2025
|
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
Apertus team (as a member of the pretraining team)
arXiv
Open LLMs
Data Compliance
Multilingual
|
|
COLM 2025 WS (oral 🏆)
|
TiMoE: Time-Aware Mixture of Language Experts
Robin Faro*,
Dongyang Fan*,
Tamar Alphaidze,
Martin Jaggi
arXiv
/
code
/
slides
Mixture-of-Experts
Temporal Modeling
LLM Architecture
|
|
NeurIPS 2025
|
URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training
Dongyang Fan,
Vinko Sabolčec,
Martin Jaggi
arXiv
/
code
LLM Training
Metadata
Data Efficiency
Steering
|
|
COLM 2025 (oral 🏆)
|
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan,
Vinko Sabolčec,
Matin Ansaripour,
Ayush Kumar Tarun,
Martin Jaggi,
Antoine Bosselut,
Imanol Schlag
arXiv
/
code
/
project page
/
slides
Data Compliance
Ethics
Memorization
Responsible AI
|
|
ICLR 2025 WS
|
Do Data Valuations Make Good Data Prices?
Dongyang Fan,
Tyler J. Rotello,
Sai Praneeth Karimireddy
arXiv
Data Valuation
Game Theory
Data Markets
|
|
ICML 2025
|
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan*,
Bettina Messmer*,
Nikita Doikov,
Martin Jaggi
code
/
arXiv
Collaborative Learning
Mixture-of-Experts
On-Device
Federated Learning
|
|
ICLR 2024 WS
|
Towards an empirical understanding of MoE design choices
Dongyang Fan*,
Bettina Messmer*,
Martin Jaggi
arXiv
Mixture-of-Experts
Architecture Design
Expert Specialization
|
|
COLM 2024
|
Personalized Collaborative Fine-Tuning for On-Device Large Language Models
Nicolas Wagner,
Dongyang Fan,
Martin Jaggi
code
/
arXiv
Collaborative Learning
Personalization
On-Device
Fine-Tuning
|
|
AAAI 2024
|
Ghost Noise for Regularizing Deep Neural Networks
Atli Kosson,
Dongyang Fan,
Martin Jaggi
arXiv
Regularization
Deep Learning
Generalization
|
|
NeurIPS 2023
|
Collaborative Learning via Prediction Consensus
Dongyang Fan,
Celestine Mendler-Dünner,
Martin Jaggi
code
/
arXiv
/
poster
Collaborative Learning
Consensus
Knowledge Distillation
|
Academic Service
- Reviewer for NeurIPS 2025 (Top Reviewer ⭐️) and 2024; ICLR 2025 and 2023; COLM 2025; and multiple workshops.
- Supervision of student projects: projects I have supervised have led to a COLM paper and an oral presentation at a COLM workshop.
|
Miscellaneous
In general, I like arts and culture. I am also an outdoorsy person: I enjoy hiking, skiing, and sailing.
I paint scenes from my hiking trips. For example...
|
The source code of this website is from here.
|
|