Research

I work on the science of deep learning. Most recently I have been thinking about synthetic data.

I flit between two largely disjoint communities that care about neural networks.

The first sees foundation models as oracles that will automate the enterprise. This community thinks in terms of KV caches and fast CUDA kernels, of prefill and decoding and keeping latency low for billions of users around the world encountering the magic of next-token prediction for the first time. This one might say a neural network is, roughly, a composition of matrix multiplications executed on tensor cores. This community speaks to me because of the year I spent in the Bay Area working on software before college.
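
To make that caricature concrete, here is a minimal sketch of the view, a two-layer network written as nothing but composed matrix multiplications (NumPy, ReLU, and the toy shapes are my own illustrative choices, not anyone's production stack):

```python
import numpy as np

# A two-layer MLP as a composition of matrix multiplications plus a
# pointwise nonlinearity. Shapes are arbitrary toy choices.
rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 32, 4

W1 = rng.standard_normal((d_in, d_hidden)) / np.sqrt(d_in)
W2 = rng.standard_normal((d_hidden, d_out)) / np.sqrt(d_hidden)

def mlp(x):
    # On real hardware these matmuls are what get tiled onto tensor cores.
    h = np.maximum(x @ W1, 0.0)  # matrix multiply + ReLU
    return h @ W2                # matrix multiply

x = rng.standard_normal((2, d_in))  # a batch of two inputs
print(mlp(x).shape)                 # (2, 4)
```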

The second is composed mostly of physicists and mathematicians who care about neural and cognitive systems. This community sees neural networks as new Platonic objects. It thinks in terms of mean-field approximations and kernel methods, treating a neural network as coupled units of computation whose high-dimensional loss (energy) landscape is the core object of inquiry. This one would say neural networks are an unexpectedly expressive parametric function class, and it speaks to me because the experience of using simple mathematical models to make non-obvious yet true predictions about systems at scale is tantalizing.
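
To make the second caricature concrete, the canonical object in the mean-field line of work is a two-layer network in the 1/N scaling, where the sum over neurons becomes an integral over a distribution of parameters (this is the standard textbook setup, not anything specific to my own work):

$$
f_N(x) \;=\; \frac{1}{N}\sum_{i=1}^{N} a_i\,\sigma\big(\langle w_i, x\rangle\big)
\;\xrightarrow{\;N\to\infty\;}\;
f_\rho(x) \;=\; \int a\,\sigma\big(\langle w, x\rangle\big)\,\mathrm{d}\rho(a, w),
$$

and gradient descent on the finite network corresponds, in this limit, to a gradient flow on the measure $\rho$: a high-dimensional energy landscape made tractable by a mean-field approximation.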

I feel immensely lucky to have been mentored by people in both.