CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers

device → threadblock Relation

File in include/cutlass/gemm/device           Includes file in include/cutlass/gemm/threadblock
device/gemm_batched.h                         gemm/threadblock/threadblock_swizzle.h
device/gemm_splitk_parallel.h                 gemm/threadblock/threadblock_swizzle.h
include/cutlass/gemm/device/gemm.h            gemm/threadblock/threadblock_swizzle.h
include/cutlass/gemm/device/gemm_complex.h    gemm/threadblock/threadblock_swizzle.h