7/26/2025
The math behind alignment of LLMs and how we can build our own method
4/2/2025
Creating our own LLM and integrating it with Hugging Face.
2/4/2025
Introduction to and implementation of an optimized RoPE kernel in CUDA.
1/27/2025
SoftMax Kernel in CUDA beats PyTorch
1/24/2025
Forward pass for Flash Attention