Building PyTorch from Source

Feb 5, 2026 · maintained

Summary

Notes and scripts for building PyTorch from source with CUDA support, primarily targeting a WSL2 development environment.

Motivation

Working on PyTorch internals — custom kernels, autograd tweaks, profiling — means you can’t rely on the prebuilt wheels. A fast, reproducible source build loop is essential.

Technical Details

Pinning CUDA toolkit and driver versions that are actually compatible with each other.
USE_CUDA=1, MAX_JOBS, and ccache for sane incremental builds.
Editable installs so Python-side changes don’t trigger a full rebuild.
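
The three points above can be sketched as a small helper that prepares the build environment. This is an illustration under stated assumptions, not the exact scripts behind these notes. `parse_nvcc_release` and `build_env` are hypothetical names. The ccache wiring assumes CMake 3.17 or newer, which reads the `CMAKE_<LANG>_COMPILER_LAUNCHER` environment variables. The install command is assumed to be the pip editable form; older checkouts may use `python setup.py develop` instead.

```python
import os
import re
import shutil
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[tuple[int, int]]:
    """Extract (major, minor) from the text printed by `nvcc --version`.

    Returns None when no "release X.Y" marker is present.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def build_env(
    max_jobs: int,
    base_env: Optional[dict[str, str]] = None,
    use_ccache: bool = True,
) -> dict[str, str]:
    """Return a copy of the environment with the build knobs from the notes set."""
    env = dict(os.environ if base_env is None else base_env)
    env["USE_CUDA"] = "1"            # build with CUDA support
    env["MAX_JOBS"] = str(max_jobs)  # cap parallel compile jobs
    if use_ccache:
        # Assumption: CMake >= 3.17 picks these up as compiler launchers.
        for lang in ("C", "CXX", "CUDA"):
            env[f"CMAKE_{lang}_COMPILER_LAUNCHER"] = "ccache"
    return env


# Assumed editable-install command, run from the PyTorch checkout.
INSTALL_CMD = ["python", "-m", "pip", "install", "--no-build-isolation", "-v", "-e", "."]

if __name__ == "__main__":
    # Only prints the plan; pass `env` to subprocess.run(INSTALL_CMD, env=env)
    # inside a checkout to actually build.
    env = build_env(max_jobs=8, use_ccache=shutil.which("ccache") is not None)
    for key in ("USE_CUDA", "MAX_JOBS", "CMAKE_CUDA_COMPILER_LAUNCHER"):
        print(f"{key}={env.get(key, '<unset>')}")
    print(" ".join(INSTALL_CMD))
```

`parse_nvcc_release` takes the captured output of `nvcc --version`, so the toolkit release can be compared against whatever version is pinned before a long build starts. The value of 8 for `max_jobs` is arbitrary; on WSL2 it is usually bounded by the memory assigned to the VM rather than by core count.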

See the related notes on WSL2 setup and CUDA debugging.

Results

Cold build and warm incremental build times are tracked locally; ccache turns a 40-minute rebuild into seconds for header-only changes.
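
One way to track cold and warm build times locally is to wrap the build command in a timer and append each run to a CSV file. The sketch below is a guess at such a helper, not the tooling actually used for these numbers. `time_command`, `log_build`, and the column names are hypothetical.

```python
import csv
import subprocess
import sys
import time
from datetime import datetime, timezone
from pathlib import Path


def time_command(cmd: list[str]) -> tuple[float, int]:
    """Run a command and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, check=False)
    return time.perf_counter() - start, result.returncode


def log_build(log_path: Path, label: str, seconds: float, returncode: int) -> None:
    """Append one timing row to a CSV log, writing the header on first use."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(["timestamp_utc", "label", "seconds", "returncode"])
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        writer.writerow([stamp, label, f"{seconds:.1f}", returncode])
```

A typical call would be `log_build(Path("build_times.csv"), "warm", *time_command(cmd))`, where `cmd` is the build command and the label separates cold runs from warm ones. Comparing the two labels over time shows whether the cache is still doing its job.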

Lessons Learned