Projects

Engineering projects, experiments, and systems work.

Vectorizing 64-bit Integer Division

Emulating 64-bit integer division using 32-bit SIMD lanes for a measurable speedup.

active
SIMD, Integer arithmetic, Performance

Building PyTorch from Source

A reproducible workflow for building PyTorch from source with CUDA on WSL2.

maintained
PyTorch, CUDA, Build systems