
Blog Posts

Alignment introduction in LLMs

7/26/2025

The math behind LLM alignment and how we can build our own alignment method.

Reasoning, Alignment, LLM

Creating Custom LLMs with Hugging Face

4/2/2025

Creating our own LLM and integrating it with Hugging Face.

PyTorch, HuggingFace, LLM

RoPE Kernel Optimization in CUDA

2/4/2025

Introduction to and implementation of an optimized RoPE kernel in CUDA.

CUDA, RoPE, Optimization

CUDA Softmax Kernel

1/27/2025

A softmax kernel in CUDA that beats PyTorch.

CUDA, SoftMax, Optimization

CUDA Flash Attention Kernel

1/24/2025

Forward pass for Flash Attention in CUDA.

CUDA, Attention, Optimization
