Mentions | Stars | Project | Description |
---|---|---|---|
112 | 7,191 | HVM | A massively parallel, optimal functional runtime in Rust |
5 | 18,963 | llm.c | LLM training in simple, raw C/CUDA |
5 | 1,041 | ThunderKittens | Tile primitives for speedy kernels |
1 | 874 | gsplat | CUDA accelerated rasterization of gaussian splatting |
1 | 195 | CGBN | CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups |
1 | 139 | cuda-checkpoint | CUDA checkpoint and restore utility |
1 | 25 | simpleGEMM | The simplest but fast implementation of matrix multiplication in CUDA. |
1 | 1 | jaxsplat | 3D Gaussian Splatting in JAX |
Latest Mentions
Latest mentioned CUDA repos
Stars | Project |
---|---|
7,191 | HVM |
1,041 | ThunderKittens |
1 | jaxsplat |
25 | simpleGEMM |
18,963 | llm.c |
195 | CGBN |
874 | gsplat |
139 | cuda-checkpoint |
6 | cuda-1brc |
294 | dietgpu |
438 | flash-attention-minimal |
0 | blog-code |
623 | raft |
5 | tuna |
287 | NATTEN |
5 | build-nccl-tests-with-pytorch |
24 | GPUODEBenchmarks |
194 | RWKV-CUDA |
182 | causal-conv1d |
67 | ABMGPU |
Latest Discoveries
Latest discovered CUDA repos
Stars | Project |
---|---|
1 | jaxsplat |
1,041 | ThunderKittens |
25 | simpleGEMM |
195 | CGBN |
874 | gsplat |
139 | cuda-checkpoint |
6 | cuda-1brc |
18,963 | llm.c |
438 | flash-attention-minimal |
0 | blog-code |
5 | tuna |
287 | NATTEN |
5 | build-nccl-tests-with-pytorch |
24 | GPUODEBenchmarks |
182 | causal-conv1d |
67 | ABMGPU |
7 | gpu-desktop-calculator |
57 | gdlog |
22 | Harmonia_for_B_plus_trees |
0 | MandelbrotExplorer |
Recently updated posts
- Jaxsplat: 3D Gaussian Splatting for Jax
- Welcome to the Parallel Future of Computation
- Bend a Parallel Language
- Bend: A higher order language for the GPU
- ThunderKittens: Tile Primitives for Speedy Kernels