About

I have five years of experience working primarily in backend systems, and I played a central role in Brownells' modernization, scalability, and product development during a pivotal period, supporting millions of customers and many internal teams. I left my role in 2025 to pivot into AI/ML engineering, with a focus on building a strong understanding of modern AI from first principles.

Full-stack competency

Modern AI engineering is multifaceted:

  • Compute and distributed systems
  • Model architecture, pretraining, and post-training
  • Tooling & environment harnesses for building usable agents
  • Agent reliability in real environments
  • Evaluations & optimizations at every step

A common idea expressed by leaders in AI is that the best engineers can reason across multiple layers and work effectively with different teams, whether that means tracing a broken eval down to a data-mix decision or a flaky agent back up to a sampling parameter.

The projects below map onto each of these layers, building on key developments in the space from 2018 to 2026: distributed GPU pretraining of my own transformer implementation, instruction tuning with LoRA/PEFT, agentic RAG, and most recently building and training multi-modal agents complete with a custom environment harness, efficient inference serving, high-quality seed-data gathering, and synthetic data generation.

Projects

Browser use agents & harnessing

Feb 2026 – Current

Training browser-use agents to complete web tasks and solve CAPTCHAs

View details

Training browser-use agents to complete web tasks and solve CAPTCHAs, a setting where modern models have a near-0% success rate on difficult CAPTCHA tasks.

Technology: Qwen 3.5 (multi-modal/VLM), Playwright browser harness, SFT & reinforcement learning, Python

Llama 3.1 8B Instruction Tuning

Fine-tuning Jan 2026 – Feb 2026

LoRA fine-tuning with +52% IFEval improvement using ~$10 compute.

View details

LoRA fine-tuning of Llama 3.1 8B for instruction following. Achieved +52% IFEval improvement (200 → 305/834) with ~$10 of compute across 4-bit, 8-bit, and BF16 quantization experiments.

+52% IFEval gain
~$10 Compute
8B Base model
LoRA Unsloth Quantization IFEval W&B
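The core of the LoRA approach above is to freeze the pretrained weights and learn only a low-rank update. A minimal NumPy sketch of that idea, using hypothetical dimensions (not the actual Llama 3.1 8B shapes), looks like this:

```python
import numpy as np

# Hypothetical dimensions for illustration only.
d_out, d_in, r, alpha = 64, 64, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero-initialized: no change at step 0

def lora_forward(x):
    # Base path plus the low-rank update B @ A, scaled by alpha / r.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0, the adapted layer exactly matches the frozen layer.
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B are trained (r × (d_in + d_out) parameters per adapted matrix instead of d_in × d_out), which is what makes the ~$10 compute budget plausible.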

Large Language Model Pretraining

Pretraining Dec 2025 – Jan 2026

450M parameter transformer inspired by the GPT and Llama papers, pretrained on 10B tokens using 8x A100 GPUs.

View details

End-to-end pretraining of a custom 450M parameter transformer on FineWeb-Edu (10B tokens). The implementation was inspired by the original GPT and Llama papers. Trained on 8x A100 GPUs using Distributed Data Parallel with custom training loops and memory optimizations.

450M Parameters
10B Tokens
8x A100 GPUs
PyTorch, DDP Custom training loop Chinchilla scaling law RoPE, RMSNorm, SwiGLU, GQA/MHA, KV cache Flash attention
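The token budget above can be sanity-checked against the Chinchilla scaling rule of thumb of roughly 20 training tokens per parameter; the exact figures here are back-of-the-envelope arithmetic, not run logs:

```python
# Chinchilla rule of thumb: compute-optimal pretraining uses ~20 tokens per parameter.
params = 450e6                      # model size from the project above
tokens = 10e9                       # FineWeb-Edu tokens in the run
tokens_per_param = tokens / params  # actual ratio for this run
optimal_tokens = 20 * params        # Chinchilla-optimal budget

print(f"{tokens_per_param:.1f} tokens/param")      # 22.2 tokens/param
print(f"{optimal_tokens / 1e9:.0f}B optimal")      # 9B optimal
```

At ~22 tokens per parameter, the 10B-token run sits just past the compute-optimal point, which is a reasonable place to be when the trained model (not training compute) is the artifact of interest.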

Transformer from Scratch

Fundamentals Nov 2025 – Dec 2025

Clean PyTorch implementation with modern LLM improvements.

View details

Clean PyTorch implementation of "Attention Is All You Need," extended with modern LLM architectural improvements: RoPE, SwiGLU, RMSNorm.

PyTorch MHA SwiGLU/ReLU RMSNorm
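Of the modern components listed, RMSNorm is the simplest to show in isolation: unlike LayerNorm, it skips mean subtraction and rescales by the reciprocal root-mean-square alone. A short PyTorch sketch of the Llama-style variant (shapes and eps chosen for illustration):

```python
import torch

class RMSNorm(torch.nn.Module):
    """Llama-style RMSNorm: rescale by reciprocal RMS, no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize each feature vector to unit RMS, then apply a learned gain.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight

x = torch.randn(2, 4, 16)          # (batch, seq, dim), illustrative shapes
y = RMSNorm(16)(x)
# With the default unit gain, each output vector has RMS ~= 1.
```

Dropping the mean-centering step saves a reduction per layer with essentially no quality cost, which is why RMSNorm replaced LayerNorm in Llama-family models.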

Legal RAG System

RAG / Agents Jul 2025 – Oct 2025

Production-grade agentic RAG for legal documents with hybrid search (semantic + lexical).

View details

Production-grade agentic RAG system for legal documents: a full pipeline from data sourcing through retrieval, reranking, and agent orchestration. Hybrid search combines semantic embeddings (bge-m3) with lexical retrieval (BM25), plus web search. Strong evaluation performance across retrieval, agentic search, citations/trustworthiness, and more; evaluations can be found in the writeup below.

bge-m3 ChromaDB Elasticsearch bge-reranker Gemini-2.5 Docker GKE

Embeddings: bge-m3, gemini-embedding-001

Indexing: ChromaDB (HNSW), Elasticsearch (BM25)

Retrieval: Hybrid search with RRF, convex combination

Agents: Conversational + search agents, planning, self-triage

Training: 560M parameter model fine-tuning
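The RRF fusion step in the retrieval layer above is small enough to sketch in full. Reciprocal Rank Fusion scores each document by summing 1/(k + rank) over every ranked list it appears in; the example lists and the k = 60 default (a common choice from the RRF literature) are illustrative, not the project's actual configuration:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score each doc by sum of 1/(k + rank) per ranker."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]   # e.g. dense (bge-m3) retrieval order
lexical  = ["d1", "d4", "d3"]   # e.g. BM25 retrieval order
fused = rrf_fuse([semantic, lexical])
# d1 wins: it ranks well in both lists, which RRF rewards over a single top spot.
```

Because RRF only uses ranks, it needs no score normalization between the dense and BM25 retrievers, which is what makes it a convenient default next to the convex-combination alternative.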

Writing

Technical Writeup

Building a Production-Grade Legal RAG System

End-to-end design notes covering hybrid retrieval architecture, embedding strategies, reranker integration, agentic orchestration, and lessons learned building a RAG pipeline for legal documents.

Open to Opportunities

Looking to join teams building interesting AI systems.