About

I have five years of experience working primarily in backend systems, and I played a central role in Brownells' modernization, scalability, and product development during a pivotal period, supporting millions of customers and many internal teams. I left my role in 2025 to pivot into AI/ML engineering, with a focus on building a strong understanding of modern AI from first principles.

Full-stack competency

Modern AI engineering is multifaceted:

  • Compute and distributed systems
  • Model architecture, pretraining, and post-training
  • Tooling & environment harnesses for building usable agents
  • Agent reliability in real environments
  • Evaluations & optimizations at every step

A common idea expressed by leaders in AI is that the best engineers can reason across multiple layers and work effectively with different teams, whether that means tracing a broken eval down to a data-mix decision or a flaky agent back up to a sampling parameter.

The projects below map onto each of these layers, building on key developments in the space from 2018 to 2026: distributed GPU pretraining of my own transformer implementation, instruction tuning with LoRA/PEFT, agentic RAG, and most recently building and training multi-modal agents complete with a custom environment harness, efficient inference serving, high-quality seed-data gathering, and synthetic data generation.

Projects

Browser use agents & harnessing

Feb 2026 – Current

Training browser-use agents to complete web tasks and solve CAPTCHAs

View details

Training browser-use agents to complete web tasks and solve CAPTCHAs, a setting where modern models have a near-0% success rate on difficult CAPTCHA tasks.

Technology: Qwen 3.5 (multi-modal/VLM), Playwright browser harness, SFT & reinforcement learning, Python

Llama 3.1 8B Instruction Tuning

Fine-tuning Jan 2026 – Feb 2026

LoRA fine-tuning with +52% IFEval improvement using ~$10 compute.

View details

LoRA fine-tuning of Llama 3.1 8B for instruction following. Achieved +52% IFEval improvement (200 → 305/834) with ~$10 of compute across 4-bit, 8-bit, and BF16 quantization experiments.

+52% IFEval gain
~$10 Compute
8B Base model
LoRA Unsloth Quantization IFEval W&B
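The core of the LoRA approach above is to freeze the pretrained weights and learn only a low-rank update. A minimal NumPy sketch of that idea, using hypothetical dimensions (not the actual Llama 3.1 8B shapes), looks like this:

```python
import numpy as np

# Hypothetical dimensions for illustration only.
d_out, d_in, r, alpha = 64, 64, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero-initialized: no change at step 0

def lora_forward(x):
    # Base path plus the low-rank update B @ A, scaled by alpha / r.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0, the adapted layer exactly matches the frozen layer.
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B are trained (r × (d_in + d_out) parameters per adapted matrix instead of d_in × d_out), which is what makes the ~$10 compute budget plausible.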

Large Language Model Pretraining

Pretraining Dec 2025 – Jan 2026

450M parameter transformer inspired by the GPT and Llama papers, pretrained on 10B tokens using 8x A100 GPUs.

View details

End-to-end pretraining of a custom 450M parameter transformer on FineWeb-Edu (10B tokens). The implementation was inspired by the original GPT and Llama papers. Trained on 8x A100 GPUs using Distributed Data Parallel with custom training loops and memory optimizations.

450M Parameters
10B Tokens
8x A100 GPUs
PyTorch, DDP Custom training loop Chinchilla scaling law RoPE, RMSNorm, SwiGLU, GQA/MHA, KV cache Flash attention
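The token budget above can be sanity-checked against the Chinchilla scaling rule of thumb of roughly 20 training tokens per parameter; the exact figures here are back-of-the-envelope arithmetic, not run logs:

```python
# Chinchilla rule of thumb: compute-optimal pretraining uses ~20 tokens per parameter.
params = 450e6                      # model size from the project above
tokens = 10e9                       # FineWeb-Edu tokens in the run
tokens_per_param = tokens / params  # actual ratio for this run
optimal_tokens = 20 * params        # Chinchilla-optimal budget

print(f"{tokens_per_param:.1f} tokens/param")      # 22.2 tokens/param
print(f"{optimal_tokens / 1e9:.0f}B optimal")      # 9B optimal
```

At ~22 tokens per parameter, the 10B-token run sits just past the compute-optimal point, which is a reasonable place to be when the trained model (not training compute) is the artifact of interest.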

Transformer from Scratch

Fundamentals Nov 2025 – Dec 2025

Clean PyTorch implementation with modern LLM improvements.

View details

Clean PyTorch implementation of "Attention Is All You Need," extended with modern LLM architectural improvements: RoPE, SwiGLU, RMSNorm.

PyTorch MHA SwiGLU/ReLU RMSNorm
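Of the modern components listed, RMSNorm is the simplest to show in isolation: unlike LayerNorm, it skips mean subtraction and rescales by the reciprocal root-mean-square alone. A short PyTorch sketch of the Llama-style variant (shapes and eps chosen for illustration):

```python
import torch

class RMSNorm(torch.nn.Module):
    """Llama-style RMSNorm: rescale by reciprocal RMS, no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize each feature vector to unit RMS, then apply a learned gain.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight

x = torch.randn(2, 4, 16)          # (batch, seq, dim), illustrative shapes
y = RMSNorm(16)(x)
# With the default unit gain, each output vector has RMS ~= 1.
```

Dropping the mean-centering step saves a reduction per layer with essentially no quality cost, which is why RMSNorm replaced LayerNorm in Llama-family models.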

Legal RAG System

RAG / Agents Jul 2025 – Oct 2025

Production-grade agentic RAG for legal documents with hybrid search (semantic + lexical).

View details

Production-grade agentic RAG system for legal documents: a full pipeline from data sourcing through retrieval, reranking, and agent orchestration. Hybrid search combines semantic embeddings (bge-m3) with lexical retrieval (BM25), plus web search. Strong evaluation performance across retrieval, agentic search, citations/trustworthiness, and more; evaluations can be found in the writeup below.

bge-m3 ChromaDB Elasticsearch bge-reranker Gemini-2.5 Docker GKE

Embeddings: bge-m3, gemini-embedding-001

Indexing: ChromaDB (HNSW), Elasticsearch (BM25)

Retrieval: Hybrid search with RRF, convex combination

Agents: Conversational + search agents, planning, self-triage

Training: 560M parameter model fine-tuning
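The RRF fusion step in the retrieval layer above is small enough to sketch in full. Reciprocal Rank Fusion scores each document by summing 1/(k + rank) over every ranked list it appears in; the example lists and the k = 60 default (a common choice from the RRF literature) are illustrative, not the project's actual configuration:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score each doc by sum of 1/(k + rank) per ranker."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]   # e.g. dense (bge-m3) retrieval order
lexical  = ["d1", "d4", "d3"]   # e.g. BM25 retrieval order
fused = rrf_fuse([semantic, lexical])
# d1 wins: it ranks well in both lists, which RRF rewards over a single top spot.
```

Because RRF only uses ranks, it needs no score normalization between the dense and BM25 retrievers, which is what makes it a convenient default next to the convex-combination alternative.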

Writing

Technical Writeup

Building a Production-Grade Legal RAG System

End-to-end design notes covering hybrid retrieval architecture, embedding strategies, reranker integration, agentic orchestration, and lessons learned building a RAG pipeline for legal documents.

Open to Opportunities

Looking to join teams building interesting AI systems.