I work at an early-stage startup. Previously, I was a senior researcher at Microsoft Research, and before that, a Google research fellow at the Simons Institute (UC Berkeley).

I work on language models (especially the Phi series) and the theory of diffusion models. Previously, my research focused mostly on sampling, optimization, and proximal methods. Here is a paper at the intersection of these topics.

CV

News

Some papers

LLMs

Diffusion models