About

I have five years of experience, primarily in backend systems, and played a central role in Brownells' technology efforts during a pivotal period of modernization, scaling, and product development serving millions of customers and many internal teams. In 2025 I left my role to pivot into AI/ML engineering, focusing on building a strong understanding of modern AI from first principles.

Full-stack competency

Modern AI engineering is multifaceted:

  • Optimal compute serving & distributed systems
  • Model architecture & training
  • Agents, harnesses, and reliability
  • Evaluations & optimizations at every step

Today, the best engineers are expected to reason across multiple layers and work efficiently with different teams, whether that means tracing a broken eval down to a data-mix decision or a flaky agent back up to a sampling parameter.

The projects below map onto these layers and are tied to key developments from 2018 to 2026. Along the way, I have contributed to open-source software used by tens of thousands of users, including frontier labs.

Projects

  • Reinforcement learning with vision language models

    Current RL · VLM · Agents

    Currently in progress: teaching Qwen 3.6 to solve CAPTCHAs and other agentic browser-based tasks using a custom browser harness and RL gym. Future plans include coordinated multi-agent web scraping.

  • Lambda Cloud MCP

    MCP · ML infra · Agents

    MCP surface for Lambda Cloud: auto-provision GPUs, orchestrate ML environments and training jobs, and steer agents by text message. Code

  • Legal Agentic RAG System

    RAG · Agents

    Agentic RAG over 250k+ legal pages with hybrid search (bge-m3 + BM25), reranking, and multi-agent orchestration. ~6,000 Hugging Face dataset downloads. Code · Training infra · Writeup

  • LoRA Targeting for Persona SFT

    Fine-tuning · MoE · VLM

    Gradient-guided LoRA targeting for persona SFT on qwen3.5-35b-a3b; lmms-eval OSS contribution (eval harness used by frontier labs & thousands of users). Accelerated evals by 30x with vLLM. Code · Writeup

  • Llama 3.1 8B Instruction Tuning

    Fine-tuning

    LoRA SFT yielding a +52% relative IFEval improvement (200→305 of 834) on ~$10 of compute, plus multiple quantization experiments. Code

  • Large Language Model Pretraining

    Pretraining · DDP

    Pretrained a 450M-parameter transformer on FineWeb-Edu (10B tokens) across 8xA100 GPUs using DDP, with RoPE, SwiGLU, FlashAttention, GQA/MHA, KV cache, etc. Code · Checkpoint

  • Transformer from Scratch

    Fundamentals

    PyTorch implementation of "Attention Is All You Need" with encoder-only, decoder-only, and encoder-decoder architectures, along with modern extensions (SwiGLU, RMSNorm, RoPE). Code

Writing

Technical Writeup

LoRA Targeting for Discord-style SFT

Deep-dive on gradient attribution-guided LoRA targeting, dataset curation from raw Discord exports, training/eval loops, and checkpoint selection for authentic persona style transfer without catastrophic forgetting.

Technical Writeup

Agentic RAG with production-level accuracy

End-to-end design notes covering hybrid retrieval architecture, embedding strategies, reranker integration, agentic orchestration, and lessons learned building a RAG pipeline on real legal documents with production-level accuracy.

Open to Opportunities

Looking to join teams building interesting AI systems.