Notes-to-self on finding and fixing GPU kernel bugs.

Tools

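  • compute-sanitizer: the first thing to reach for. The default memcheck tool catches out-of-bounds and misaligned accesses; --tool racecheck finds shared-memory races; --tool initcheck finds reads of uninitialized global memory.
  • cuda-gdb: step through kernels (build with -g -G); make breakpoints conditional on threadIdx/blockIdx and switch focus with "cuda block ... thread ...".
  • Nsight Compute (ncu) / Nsight Systems (nsys): per-kernel profiling and occupancy analysis with ncu, whole-application timelines with nsys.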
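
A toy program for trying the sanitizer, as a minimal sketch. The file name scale.cu, the kernel scale, and the helper blocks_for are made up for illustration; the build and run lines in the comments assume nvcc and compute-sanitizer are on PATH.

```cuda
// scale.cu: hypothetical toy kernel for exercising compute-sanitizer.
// Build: nvcc -lineinfo -o scale scale.cu
// Run:   compute-sanitizer ./scale
//        compute-sanitizer --tool initcheck ./scale
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of blocks needed to cover n elements.
constexpr int blocks_for(int n, int threads) {
    return (n + threads - 1) / threads;
}

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Dropping this guard lets the tail threads of the last block touch
    // memory past the end of data, which memcheck reports as an invalid
    // global access.
    if (i < n) {
        data[i] *= factor;
    }
}

int main() {
    const int n = 1000;      // deliberately not a multiple of the block size
    const int threads = 256;
    const int blocks = blocks_for(n, threads);

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    // Skipping this memset leaves the buffer uninitialized, so the kernel's
    // reads would be flagged by --tool initcheck.
    cudaMemset(d_data, 0, n * sizeof(float));

    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

As written it should run clean under both tools; removing the guard or the memset is a quick way to see what each report looks like. -lineinfo is there so reports can point at source lines.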

Habits that save hours

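  • Check cudaGetLastError() after every launch during development; it reports launch failures such as a bad grid configuration. Errors raised while the kernel runs surface only at the next synchronizing call, so check the result of cudaDeviceSynchronize() too in debug builds.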
  • Reproduce on the smallest grid that still fails.
  • Suspect memory before suspecting math.
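
The first habit can be sketched roughly as below. CUDA_CHECK and the kernel fill are hypothetical names, not part of the CUDA runtime; the runtime calls themselves (cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString) are real. The synchronize after the launch is an assumption about what is acceptable in a debug build, since it serializes host and device.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// CUDA_CHECK is a hypothetical helper name: print the failing call site
// and abort on any CUDA runtime error.
#define CUDA_CHECK(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,    \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Hypothetical kernel: write the same value into every element.
__global__ void fill(int *out, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

int main() {
    const int n = 64;  // small on purpose: one block is enough to reproduce
    int *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_out, n * sizeof(int)));

    fill<<<1, n>>>(d_out, n, 7);
    CUDA_CHECK(cudaGetLastError());       // launch-time errors (bad config)
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    int h_out[n];
    CUDA_CHECK(cudaMemcpy(h_out, d_out, n * sizeof(int),
                          cudaMemcpyDeviceToHost));
    printf("first = %d, last = %d\n", h_out[0], h_out[n - 1]);

    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

Wrapping every runtime call, not just the launch, means a failure is reported at the line where it first shows up rather than at some later unrelated call.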

Related: wsl2-setup, simd-integer-arithmetic.