Modern AI engineering is multifaceted:
- Compute and distributed systems
- Model architecture, pretraining, and post-training
- Tooling & environment harnesses for building usable agents
- Agent reliability in real environments
- Evaluations & optimizations at every step
Today, the best engineers are expected to reason across multiple layers and work efficiently with different teams, whether that means tracing a broken eval down to a data-mix decision or a flaky agent back up to a sampling parameter.
The projects below map onto each of these layers with this idea in mind, building on key developments in the space from 2018 to 2026: distributed GPU pretraining on my own transformer implementation, instruction tuning with LoRA/PEFT, agentic RAG, and most recently building and training multimodal agents complete with a custom environment harness, efficient inference serving, high-quality seed-data gathering, and synthetic data generation.