Ankit Dahal

Building AI systems from first principles

Engineer focused on LLMs: pretraining, post-training, fine-tuning, and optimizations.

Current Focus

Learning: reasoning, reinforcement learning, RLVR, reward modeling
Recently finished: instruction tuning with LoRA, quantization experiments, IFEval benchmarking

Projects

Legal RAG System

RAG / Agents

Production-grade agentic RAG system for legal documents. Full pipeline from data sourcing through retrieval, reranking, and agent orchestration. Hybrid search combining dense embeddings (bge-m3) with sparse retrieval (BM25).

bge-m3 · ChromaDB · Elasticsearch · bge-reranker · Gemini-2.5 · Docker · GKE

Embeddings: bge-m3, gemini-embedding-001

Indexing: ChromaDB (HNSW), Elasticsearch (BM25)

Retrieval: Hybrid search with RRF and convex combination of scores (see the RRF sketch below)

Agents: Conversational + search agents, planning, self-triage

Training: fine-tuning a 560M-parameter model

Full code available on request
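
A minimal sketch of the reciprocal rank fusion step used to merge the dense and sparse result lists (the function name, doc IDs, and the k constant are illustrative assumptions, not the project's actual code):

```python
# Reciprocal Rank Fusion (RRF): merge a dense (bge-m3) and a sparse (BM25) ranking.
# Illustrative sketch only; doc IDs and k=60 are assumptions.

def rrf_fuse(dense_ranking, sparse_ranking, k=60):
    """Combine two ranked lists of doc IDs into a single RRF-scored ranking."""
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_3", "doc_1", "doc_7"]   # top hits from the dense index
sparse = ["doc_1", "doc_9", "doc_3"]  # top hits from BM25
print(rrf_fuse(dense, sparse))        # doc_1 and doc_3 rise to the top
```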

Transformer from Scratch

Fundamentals

Clean PyTorch implementation of "Attention Is All You Need" for deep understanding of transformer mechanics. Extended with modern architectural improvements used in current LLMs.

PyTorch · Original sinusoidal positional encoding · SwiGLU/ReLU · MHA · LayerNorm/RMSNorm

Components: encoder/decoder blocks, multi-head attention, positional encoding, layer normalization
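
As a flavor of the implementation style, a minimal PyTorch sketch of the original sinusoidal positional encoding (shapes and the usage line are assumptions):

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Build the (seq_len, d_model) sinusoidal PE table from 'Attention Is All You Need'."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)        # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))                    # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
    return pe

# Added to the token embeddings before the first encoder/decoder block.
pe = sinusoidal_positional_encoding(seq_len=128, d_model=512)
```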

LLM VRAM Calculator

Tooling

Comprehensive tool for estimating GPU memory requirements for LLM training and inference. Supports dense transformers and MoE architectures with detailed breakdowns of weights, gradients, optimizer states, activations, and KV cache.

Gradio · HuggingFace · MoE · Mixed Precision
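
As a rough illustration of what the tool automates, a back-of-the-envelope training estimate (assuming bf16 weights and gradients with fp32 Adam states; activations and KV cache excluded, and all byte counts are assumptions):

```python
def estimate_training_vram_gb(n_params: float,
                              bytes_per_param: int = 2,                       # bf16 weights
                              optimizer_bytes_per_param: int = 12) -> float:  # fp32 master + Adam m/v
    """Rough training-memory estimate: weights + gradients + optimizer states."""
    weights = n_params * bytes_per_param
    grads = n_params * bytes_per_param
    optimizer = n_params * optimizer_bytes_per_param
    return (weights + grads + optimizer) / 1e9

# Example: a 7B-parameter dense model, before activations.
print(f"{estimate_training_vram_gb(7e9):.0f} GB")  # ~112 GB
```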

Open to Opportunities

Looking to join teams building interesting AI systems.