4/2/2025
Creating our own LLM and integrating it with Hugging Face.
2/4/2025
Introduction to, and implementation of, an optimized RoPE kernel in CUDA.
1/27/2025
Softmax kernel in CUDA that beats PyTorch
1/24/2025
Forward pass for Flash Attention
12/8/2024
Math and Code for RWKV 5.2. More to come.