Blog Posts

Creating Custom LLMs with Hugging Face

4/2/2025

Creating our own LLM and integrating it with Hugging Face.

2/4/2025

Introduction to, and implementation of, an optimized RoPE kernel in CUDA.

1/27/2025

SoftMax Kernel in CUDA beats PyTorch

1/24/2025

Forward pass for Flash Attention

12/8/2024

Math and Code for RWKV 5.2. More to come.