I'm a PhD candidate at Stanford CS. I go by Chen, the second segment of my first name (Xuechen).
My research revolves around machine learning, deep learning, and NLP.
My current focus is large language model training, data collection/curation, and human evaluation.
Here are some specific topics of interest:
Data and Evaluation: Recent improvements in AI systems have been partially driven by training with high-quality human data.
I am interested in alternative, scalable data collection paradigms that leverage novel incentive structures.
As models become more capable, their evaluation becomes increasingly challenging.
I am also interested in developing new evaluations that measure not only capability but also usability.
Learning From Human Feedback: Human feedback has become a primary driver for recent successes in AI, such as ChatGPT.
However, collecting and training on such data can be costly and cumbersome. Some questions I have recently been interested in
are: How can we efficiently elicit high-quality feedback? How can we augment feedback data when it comes in limited quantities?
How should we aggregate this feedback signal without marginalizing the minority voices and views?
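To make the aggregation question concrete, here is a minimal sketch of the most common aggregation recipe today: fitting a Bradley-Terry reward model to pairwise comparisons. Averaging the preference likelihood over annotators is precisely where minority views can get washed out. The class and variable names below are illustrative, not from any particular codebase.

```python
# Minimal Bradley-Terry reward-model sketch: aggregate pairwise human
# preferences into a scalar reward. Names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a fixed-size response embedding to a scalar reward."""
    def __init__(self, dim: int):
        super().__init__()
        self.head = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(x).squeeze(-1)

def bradley_terry_loss(model, chosen, rejected):
    """Negative log-likelihood of P(chosen > rejected) = sigmoid(r_c - r_r)."""
    margin = model(chosen) - model(rejected)
    return -F.logsigmoid(margin).mean()

# Toy usage: random "embeddings" stand in for annotated comparison pairs.
dim = 16
model = RewardModel(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
chosen, rejected = torch.randn(8, dim), torch.randn(8, dim)
loss = bradley_terry_loss(model, chosen, rejected)
opt.zero_grad()
loss.backward()
opt.step()
```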
Red Teaming and Auditing: Despite the rapid progress in capability research, machine learning models still exhibit systematic flaws. I am interested in developing automated tools to assist humans in identifying and rectifying these flaws.
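To sketch what "automated" can mean here, below is a toy red-teaming loop: propose candidate prompts, score how badly a target model behaves on each, and surface the worst offenders for human review. Both the proposal and scoring functions are stand-in stubs; the interface is hypothetical rather than a specific tool of mine.

```python
# Toy automated red-teaming loop. attacker_propose and failure_score are
# stubs; real systems would use an LM to mutate prompts and a safety
# classifier on the target model's outputs.
import random

def attacker_propose(seed_prompts, n=32):
    # Stub: generate candidate prompts by perturbing the current pool.
    return [random.choice(seed_prompts) + f" (variant {i})" for i in range(n)]

def failure_score(prompt):
    # Stub: query the target model and score how undesirable its response
    # is; here, a random placeholder score.
    return random.random()

def red_team(seed_prompts, rounds=3, keep=5):
    pool = list(seed_prompts)
    for _ in range(rounds):
        candidates = attacker_propose(pool)
        # Keep the most failure-inducing prompts for the next round
        # and, ultimately, for human inspection.
        pool = sorted(candidates, key=failure_score, reverse=True)[:keep]
    return pool

print(red_team(["Summarize this contract.", "Translate this email:"]))
```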
Memorization and Privacy: Large models can memorize training data. This not only poses privacy risks but also raises emergent sociotechnical questions (e.g., on copyright and intellectual property). I am interested in understanding this memorization phenomenon and in developing tools to mitigate its undesirable consequences. Here is an outdated statement I
wrote on privacy and security in machine learning. Some of my research has seen growing adoption in industry.
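As one concrete mitigation, relevant to the two differential-privacy papers below, here is a minimal sketch of the per-example gradient clipping and noising at the core of DP-SGD (Abadi et al., 2016). The function signature and hyperparameters are illustrative, not a specific implementation of mine.

```python
# Minimal DP-SGD step: clip each example's gradient to bound its influence,
# then add Gaussian noise so no single training example dominates the update.
import torch

def dp_sgd_step(params, per_example_grads, clip_norm=1.0, noise_mult=1.0, lr=0.1):
    """params: list of parameter tensors.
    per_example_grads: one tensor per parameter, shaped (batch, *param.shape).
    """
    batch = per_example_grads[0].shape[0]
    # Per-example gradient norm, taken globally across all parameters.
    norms = torch.sqrt(sum(g.flatten(1).pow(2).sum(1) for g in per_example_grads))
    scale = (clip_norm / (norms + 1e-12)).clamp(max=1.0)  # clip factor per example
    for p, g in zip(params, per_example_grads):
        clipped = g * scale.view(-1, *([1] * (g.dim() - 1)))
        noise = noise_mult * clip_norm * torch.randn_like(p)
        p.data -= lr * (clipped.sum(0) + noise) / batch

# Toy usage: one weight matrix with random per-example gradients (batch of 8).
w = torch.zeros(4, 3)
dp_sgd_step([w], [torch.randn(8, 4, 3)])
```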
Selected and Recent Research (see Google Scholar for the full list)
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
Yann Dubois*, Xuechen Li*, Rohan Taori*, Tianyi Zhang*, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, and Tatsunori B. Hashimoto
Alpaca: A Strong, Replicable Instruction-Following Model
Rohan Taori*, Ishaan Gulrajani*, Tianyi Zhang*, Yann Dubois*, Xuechen Li*, Carlos Guestrin, Percy Liang, and Tatsunori B. Hashimoto
Foundation Models and Fair Use
Peter Henderson*, Xuechen Li*, Dan Jurafsky, Tatsunori Hashimoto, Mark A. Lemley, and Percy Liang
Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping
Jiyan He*, Xuechen Li*, Da Yu*, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, and Jiang Bian
International Conference on Learning Representations, 2023
Large Language Models Can Be Strong Differentially Private Learners
Xuechen Li, Florian Tramer, Percy Liang, and Tatsunori Hashimoto
International Conference on Learning Representations, 2022
NeurIPS Privacy in Machine Learning Workshop, 2021
When Does Preconditioning Help or Hurt Generalization?
Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, and Ji Xu
International Conference on Learning Representations, 2021
12th OPT Workshop on Optimization for ML
[Best student paper]
Scalable Gradients for Stochastic Differential Equations
Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud
International Conference on Artificial Intelligence and Statistics, 2020
2nd Symposium on Advances in Approximate Bayesian Inference
Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond
Xuechen Li, Denny Wu, Lester Mackey, and Murat A. Erdogdu
Advances in Neural Information Processing Systems, 2019