About

About me

I'm a software engineer and computer architect with 10+ years of experience in memory and computing system architecture, parallel programming, and system resilience. My recent focus is LLM inference and training.

Today

I'm a Senior Software Engineer at Microsoft (Azure Hardware Architecture, AI Frameworks), where I build kernels, runtime libraries, and the LLM serving stack for the Maia ASIC accelerators. I designed Maia's host/device programming model, delivered core SDK components, integrated Maia into PyTorch and ONNX Runtime, and partnered with OpenAI to ship the Maia-powered GitHub Copilot demo at Ignite 2023.

Background

I earned my PhD in Electrical and Computer Engineering from the University of Texas at Austin (advised by Mattan Erez), where I built the Containment Domains resilience runtime for high-performance and GPU-dense computing. Along the way I interned at NVIDIA Research, Intel's Open Source Technology Center, and Lawrence Livermore National Laboratory.

Areas of interest

Professional focus

I care about correctness-first performance work: measurable wins, honest benchmarks, and clear write-ups that others can learn from. This site is both a portfolio and a working notebook of that practice.

Contact

The best ways to reach me: