Research

I work on the science of deep learning. For now, I am focused less on publishing research and more on open source deep learning.

I'm interested in understanding what systems centered around foundation models will look like in five years, and how they will touch our lives in unthinkable ways. Reasoning models and computer use/software agents are two nascent examples, but I think and hope that AI systems will play a more transformative role in our day-to-day lives than mere automation. My research style usually involves studying these models in a scientific way, running experiments across the stack: from pretraining to evals and beyond.

My path into deep learning was a meandering one. As an undergraduate, I was adopted by multiple research communities broadly studying artificial intelligence. Two prominent ones were physicists studying the brain -- who think of neural networks as freshly discovered platonic objects -- and computer systems people who like to hack and think of neural networks as graphs of differentiable tensor operations nestled inside GPU cores. I am immensely fortunate to have been a part of both communities, and to have been mentored by folks in each.