Deqing Fu

I’m Deqing Fu, a fourth-year Ph.D. candidate in Computer Science at the University of Southern California (USC). My main research interests are deep learning theory, natural language processing, and the interpretability of AI systems. I’m co-advised by Prof. Vatsal Sharan of the USC Theory Group and Prof. Robin Jia of the Allegro Lab within the USC NLP Group, and I work closely with Prof. Mahdi Soltanolkotabi and Prof. Shang-Hua Teng. During my Ph.D. studies, I have spent time at Google and Meta as a student researcher. Before USC, I completed my undergraduate degree in Mathematics (with honors) and my master’s degree in Statistics at the University of Chicago.

My research focuses on understanding large language models from algorithmic and theoretical perspectives, as well as developing practical methods in interpretability, synthetic data generation, and multimodal learning. You can find my publications on Google Scholar and my recent CV here.

Algorithmic Perspectives on Large Language Models

Can Transformers learn algorithms simply from data? (NeurIPS 2024, ICML 2026)
Arithmetic in pretrained LLMs: memorization vs. mechanisms? (NeurIPS 2024, ICLR 2026)
What distinguishes Transformers from other architectures? (ICLR 2025)

Interpretability and Alignment

Decision theory for LLM reasoning under uncertainty (ICLR 2025 Spotlight, arXiv 2026)
Steering vectors for improved visual understanding (ACL 2026), and for efficient and privacy-preserving synthetic data generation (ICML 2026)
Mechanistic interpretability via SAEs and transcoders (arXiv 2025, Tech Report)

Multimodal Models and Applications

Multimodal rewards for improving generation quality: token-level hallucination reduction (ICLR 2025) and text-to-image alignment (NAACL 2025)
Modality sensitivity in multimodal LLMs (COLM 2024)
Large-scale dataset for visual reasoning with images (ICLR 2026)

News

Apr 30, 2026	Two papers (EPSVec and Transformers Learn Graph Connectivity) accepted to ICML 2026.
Apr 22, 2026	New preprint: Convergent Evolution: How Different Language Models Learn Similar Number Representations. See the website and blog post.
Jan 31, 2026	New preprint: EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors.
Jan 26, 2026	Two papers (FoNE and Zebra-CoT) accepted to ICLR 2026.
Oct 22, 2025	New preprint: Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data.

Selected Publications

See full list or Google Scholar for all publications.

2026

ICML

Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data

Qilin Ye^*, Deqing Fu^*, Robin Jia, and Vatsal Sharan

In International Conference on Machine Learning (ICML), 2026

*Equal Contribution

📄 Paper PDF
arXiv

Convergent Evolution: How Different Language Models Learn Similar Number Representations

Deqing Fu, Tianyi Zhou, Mikhail Belkin, Vatsal Sharan, and Robin Jia

arXiv preprint, 2026

📄 Paper Blog Website
ICLR

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Ang Li^*, Charles Wang^*, Deqing Fu^*, Kaiyu Yue^*, Zikui Cai^*, Wang Bill Zhu^*, Ollie Liu^*, Peng Guo^*, Willie Neiswanger, Furong Huang, Tom Goldstein, and Micah Goldblum

In International Conference on Learning Representations (ICLR), 2026

*Equal Contribution

📄 Paper Dataset
ICLR

FoNE: Precise Single-Token Number Embeddings via Fourier Features

Tianyi Zhou, Deqing Fu, Mahdi Soltanolkotabi, Robin Jia, and Vatsal Sharan

In International Conference on Learning Representations (ICLR), 2026

📄 Paper Website
ACL

Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models

Woody Haosheng Gan^*, Deqing Fu^*, Julian Asilis^*, Ollie Liu^*, Dani Yogatama, Vatsal Sharan, Robin Jia, and Willie Neiswanger

In Annual Meeting of the Association for Computational Linguistics (ACL), 2026

*Equal Contribution

📄 Paper Blog

2025

ICLR

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

Deqing Fu, Tong Xiao, Rui Wang, Wang Zhu, Pengchuan Zhang, Guan Pang, Robin Jia, and Lawrence Chen

In International Conference on Learning Representations (ICLR), 2025

📄 Paper
ICLR

Transformers Learn Low Sensitivity Functions: Investigations and Implications

Bhavya Vasudeva^*, Deqing Fu^*, Tianyi Zhou, Elliot Kau, You-Qi Huang, and Vatsal Sharan

In International Conference on Learning Representations (ICLR), 2025

*Equal Contribution

📄 Paper
ICLR

DeLLMa: Decision Making Under Uncertainty with Large Language Models

Ollie Liu^*, Deqing Fu^*, Dani Yogatama, and Willie Neiswanger

In International Conference on Learning Representations (ICLR), 2025

Spotlight (Top 5.1%), *Equal Contribution

📄 Paper Code Website
NAACL

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

Jiao Sun^*, Deqing Fu^*, Yushi Hu^*, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd Steenkiste, Ranjay Krishna, and Cyrus Rashtchian

In Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025

*Equal Contribution

📄 Paper

2024

NeurIPS

Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression

Deqing Fu, Tian-Qi Chen, Robin Jia, and Vatsal Sharan

In Conference on Neural Information Processing Systems (NeurIPS), 2024

SoCalNLP Symposium 2023 Best Paper Award

📄 Paper Code
NeurIPS

Pre-trained Large Language Models Use Fourier Features to Compute Addition

Tianyi Zhou, Deqing Fu, Vatsal Sharan, and Robin Jia

In Conference on Neural Information Processing Systems (NeurIPS), 2024

📄 Paper
COLM

IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations

Deqing Fu^*, Ruohao Guo^*, Ghazal Khalighinejad^*, Ollie Liu^*, Bhuwan Dhingra, Dani Yogatama, Robin Jia, and Willie Neiswanger

In Conference on Language Modeling (COLM), 2024

*Equal Contribution

📄 Paper Website