About

I have five years of experience building backend systems, and played a central role at Brownells in modernization, scalability, and product development serving millions of customers and many internal teams. In 2025 I left that role to pivot into AI/ML engineering, focusing on a strong understanding of modern AI built from first principles.

Full-stack competency

AI engineering is a full-stack problem: compute and distributed systems at the base, model architecture and pretraining above it, post-training and evaluation on top of that, retrieval and tools wired in, and agents acting in real environments at the edge. A common idea among frontier AI leadership is that the best engineers, the ones who ship, can reason across multiple layers: tracing a broken eval down to a data-mix decision, or a flaky agent back up to a sampling parameter, with the first-principles depth not just to fix the broken layer but to improve it.

The projects below are arranged to map cleanly onto that stack, retracing the field's key developments from 2018 to 2026: distributed GPU pretraining of my own transformer implementation, instruction tuning with LoRA/PEFT, agentic RAG, and most recently building and training multi-modal agents, complete with a custom environment harness, efficient inference serving, high-quality seed-data gathering, and synthetic data generation.

Projects

Browser use agents & harnessing

Feb 2026 – Current

Training browser-use agents to complete web tasks and solve CAPTCHAs

View details

Training browser-use agents to complete web tasks and solve CAPTCHAs, where modern models have a near-0% success rate on difficult CAPTCHA tasks.

Technology: Qwen 3.5 (multi-modal/VLM), Playwright browser harness, SFT & reinforcement learning, Python

Llama 3.1 8B Instruction Tuning

Fine-tuning Jan 2026 – Feb 2026

LoRA fine-tuning with +52% IFEval improvement using ~$10 compute.

View details

LoRA fine-tuning of Llama 3.1 8B for instruction following. Achieved +52% IFEval improvement (200 → 305/834) with ~$10 of compute across 4-bit, 8-bit, and BF16 quantization experiments.

+52% IFEval gain
~$10 Compute
8B Base model
LoRA Unsloth Quantization IFEval W&B
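For illustration, the core LoRA idea behind this project, a frozen weight matrix plus a trainable low-rank update, can be sketched in plain NumPy; the dimensions and rank below are illustrative, not the exact configuration used:

```python
import numpy as np

def lora_delta(B, A, alpha, r):
    # Low-rank update: delta_W = (alpha / r) * B @ A, added to the frozen weight
    return (alpha / r) * (B @ A)

d_out, d_in, r, alpha = 4096, 4096, 16, 32   # illustrative sizes, not the project's config
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))           # frozen base weight
A = rng.normal(0, 0.01, size=(r, d_in))      # trainable, small random init
B = np.zeros((d_out, r))                     # trainable, zero init so delta_W starts at 0

W_adapted = W + lora_delta(B, A, alpha, r)

full_params = d_out * d_in                   # what full fine-tuning would train
lora_params = r * (d_in + d_out)             # what LoRA trains instead
print(f"trainable fraction: {lora_params / full_params:.2%}")
```

The zero-initialized B means the adapted model starts identical to the base model, and only a fraction of a percent of the layer's parameters are trained, which is what makes ~$10 fine-tuning runs feasible.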

Large Language Model Pretraining

Pretraining Dec 2025 – Jan 2026

450M-parameter transformer inspired by the GPT and Llama papers, pretrained on 10B tokens using 8xA100 GPUs.

View details

End-to-end pretraining of a self-built 450M-parameter transformer on FineWeb-Edu (10B tokens), with an implementation inspired by the original GPT and Llama papers. Trained on 8xA100 GPUs using Distributed Data Parallel with custom training loops and memory optimizations.

450M Parameters
10B Tokens
8x A100 GPUs
PyTorch, DDP, custom training loop, Chinchilla scaling law, RoPE, RMSNorm, SwiGLU, GQA/MHA, KV cache, Flash Attention
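As a quick sanity check, the 10B-token budget sits close to the Chinchilla heuristic of roughly 20 tokens per parameter (numbers from the project description):

```python
params = 450e6                 # 450M-parameter model
tokens_per_param = 20          # Chinchilla (Hoffmann et al.) rule-of-thumb ratio
optimal_tokens = params * tokens_per_param

# ~9B compute-optimal tokens, versus the 10B actually trained on
print(f"Chinchilla-optimal tokens: {optimal_tokens / 1e9:.0f}B")
```

Training slightly past the compute-optimal point is a common choice when the extra data is cheap relative to scaling the model up.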

Transformer from Scratch

Fundamentals Nov 2025 – Dec 2025

Clean PyTorch implementation with modern LLM improvements.

View details

Clean PyTorch implementation of "Attention Is All You Need," extended with modern LLM architectural improvements: RoPE, SwiGLU, RMSNorm.

PyTorch MHA SwiGLU/ReLU RMSNorm
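As an example of one of those modern components, RMSNorm fits in a few lines; this is a NumPy sketch with toy inputs, not the project's PyTorch code:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: rescale by the root-mean-square of the activations.
    # Unlike LayerNorm there is no mean subtraction and no bias term.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * gain

d = 8
x = np.arange(1.0, d + 1)   # toy activation vector
g = np.ones(d)              # learned per-feature gain, initialized to 1
y = rms_norm(x, g)
print(np.sqrt(np.mean(y ** 2)))   # ~1.0: unit RMS after normalization
```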

Legal RAG System

RAG / Agents Jul 2025 – Oct 2025

Production-grade agentic RAG for legal documents with hybrid search (semantic + lexical).

View details

Production-grade agentic RAG system for legal documents. Full pipeline from data sourcing through retrieval, reranking, and agent orchestration. Hybrid search combining semantic embeddings (bge-m3) with lexical retrieval (BM25).

bge-m3 ChromaDB Elasticsearch bge-reranker Gemini-2.5 Docker GKE

Embeddings: bge-m3, gemini-embedding-001

Indexing: ChromaDB (HNSW), Elasticsearch (BM25)

Retrieval: Hybrid search with RRF, convex combination

Agents: Conversational + search agents, planning, self-triage

Training: 560M parameter model fine-tuning
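A minimal sketch of the RRF fusion step used in the hybrid retrieval above, with hypothetical document IDs standing in for real bge-m3 and BM25 results:

```python
def rrf_fuse(rankings, k=60):
    # Reciprocal Rank Fusion: each ranker contributes 1 / (k + rank) per document,
    # so ranked lists can be combined without comparable raw scores.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc_a", "doc_b", "doc_c"]   # e.g. embedding/HNSW results (illustrative IDs)
lexical  = ["doc_b", "doc_d", "doc_a"]   # e.g. BM25 results (illustrative IDs)

fused = rrf_fuse([semantic, lexical])
print(fused)   # doc_b first: it ranks highly in both lists
```

The constant k=60 is the commonly used default from the original RRF formulation; it damps the influence of top-ranked outliers from any single retriever.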

Writing

Technical Writeup

Building a Production-Grade Legal RAG System

End-to-end design notes covering hybrid retrieval architecture, embedding strategies, reranker integration, agentic orchestration, and lessons learned building a RAG pipeline for legal documents.

Open to Opportunities

Looking to join teams building interesting AI systems.