About
I have five years of experience, primarily in backend systems, and played a central role in Brownells' technology during a pivotal period of modernization, scaling, and product development serving millions of customers and many internal teams. I left my role in 2025 to pivot into AI/ML engineering, focusing on building a strong understanding of modern AI from first principles.
Full-stack competency
Modern AI engineering is multifaceted:
- Optimal compute serving & distributed systems
- Model architecture & training
- Agents, harnesses, and reliability
- Evaluations & optimizations at every step
Today, the best engineers are expected to reason across multiple layers and work efficiently with different teams, whether that means tracing a broken eval down to a data-mix decision or a flaky agent back up to a sampling parameter.
The projects below map onto each layer the same way, tied to key developments from 2018 to 2026. Along the way, I have contributed to open-source projects used by tens of thousands of users, including frontier labs.
Projects
-
Reinforcement learning with vision language models
Currently in progress: teaching Qwen 3.6 to solve CAPTCHAs with a custom browser harness and RL gym, along with other agentic browser-based tasks. Future plans include coordinated, multi-agent web scraping.
-
Lambda Cloud MCP
MCP surface for Lambda Cloud — auto-provision GPUs, orchestrate ML environments and training jobs, steer agents by text message. Code
-
Legal Agentic RAG System
Agentic RAG on 250k+ legal pages with hybrid search (bge-m3 + BM25), reranking, multi-agent system & orchestration. ~6,000 HuggingFace dataset downloads. Code · Training infra · Writeup
-
LoRA Targeting for Persona SFT
Gradient-guided LoRA targeting for persona SFT on qwen3.5-35b-a3b; lmms-eval OSS contribution (eval harness used by frontier labs & thousands of users). Accelerated evals by 30x with vLLM. Code · Writeup
-
Llama 3.1 8B Instruction Tuning
LoRA SFT with a +52% relative IFEval improvement (200→305 of 834) on ~$10 of compute, plus multiple quantization experiments. Code
-
Large Language Model Pretraining
Pretrained a 450M-parameter transformer on FineWeb-Edu (10B tokens) across 8×A100 GPUs using DDP, with RoPE, SwiGLU, FlashAttention, GQA/MHA, KV caching, and more. Code · Checkpoint
-
Transformer from Scratch
PyTorch implementation of "Attention Is All You Need" with encoder-only, decoder-only, and encoder-decoder architectures, plus modern extensions (SwiGLU, RMSNorm, RoPE). Code
Writing
LoRA Targeting for Discord-style SFT
Deep dive on gradient-attribution-guided LoRA targeting, dataset curation from raw Discord exports, training/eval loops, and checkpoint selection for authentic persona style transfer without catastrophic forgetting.
Agentic RAG with production-level accuracy
End-to-end design notes covering hybrid retrieval architecture, embedding strategies, reranker integration, agentic orchestration, and lessons learned from building a RAG pipeline on real legal documents with production-level accuracy.
Open to Opportunities
Looking to join teams building interesting AI systems.