Engineering
Systems and infrastructure I've designed, built, and shipped.
Public and open-source contributions only. Lantern Pharma work and private repos aren't reflected here.
Most of my career has been spent building systems that move biological data from raw to useful. Cloud infrastructure, production pipelines, databases, agentic AI. Across cancer drug discovery, genomics, and edtech.
Good infrastructure disappears. When it’s working, nobody thinks about it. The database responds, the pipeline runs, the agents coordinate, the deployment holds under load. The craft is in building systems that stay invisible, that absorb complexity so the people using them never have to. Most of what I’ve built will never be seen by anyone other than the engineers who maintain it after me. That’s the point.
The research grows out of this work. When you spend a decade building systems that process biological data, the questions about what that data is trying to tell you become impossible to ignore.
"The purpose of computing is insight, not numbers."
— Richard Hamming
withZeta.ai
Active
Lantern Pharma's platform for rare cancer drug discovery. Recursive agentic intelligence across clinical trial registries, molecular databases, biomarker literature, and live research. I joined as Senior Data Engineer and built the data infrastructure from the ground up, then grew into Lead Platform Architect as the agentic layer took shape on top. ZETA is live, onboarding clinical researchers, and actively informing drug synthesis for cancers traditional pharma won't fund.
The Attune Lab
Active
Self-hosted research infrastructure running from an 8U desk rack: a Threadripper workstation with dual GPUs for local inference, a Raspberry Pi cluster as agent worker nodes, and a Beelink mini PC for services. Everything orchestrated with k3s, networked over Tailscale, and backed by Postgres. Built to own my own compute, data, and reasoning rather than rent it. The full build is documented in the Building the Lab series.
Open-source pipeline behind "What Lives? A meta-analysis of diverse opinions on the definition of life" (arXiv, 2025). Uses multi-model LLM consensus (Claude, GPT-4o, Llama 3.3) to map the semantic landscape of 68 expert definitions of life through pairwise correlation, clustering, and dimensionality reduction. Designed to be reusable: swap in any set of competing definitions or contested concepts and the same methodology applies. With Michael Levin, Blaise Agüera y Arcas, and Karina Kofman.
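The analysis stage of a pipeline like this can be sketched in a few lines. This is an illustrative reconstruction, not the published code: the function names, embedding dimensions, and use of PCA are assumptions, standing in for whatever correlation and dimensionality-reduction choices the paper actually made.

```python
import numpy as np

def pairwise_correlation(embeddings: np.ndarray) -> np.ndarray:
    """Pearson correlation between every pair of definition embeddings."""
    return np.corrcoef(embeddings)

def project_2d(embeddings: np.ndarray) -> np.ndarray:
    """Project embeddings to 2D via SVD-based PCA (a simple stand-in
    for the paper's dimensionality-reduction step)."""
    centered = embeddings - embeddings.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# 68 definitions, one (hypothetical) 128-dim embedding each
rng = np.random.default_rng(0)
emb = rng.normal(size=(68, 128))

corr = pairwise_correlation(emb)  # (68, 68) symmetric similarity matrix
coords = project_2d(emb)          # (68, 2) coordinates for the semantic map
```

The point of the design is reusability: nothing here is specific to definitions of life, so any set of competing texts with embeddings can be dropped into the same correlate-cluster-project flow.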
Pria
2023
Built the Generative Intelligence Engine powering Pria, a generative AI teaching assistant at Praxis AI. Won two American Business Awards: GOLD for Achievement in Online Training, BRONZE for AI in EdTech. The Foundations of AI course I built and taught took GOLD for K-12 Learning Management.
Open-source tool for processing massive RNAseq datasets across heterogeneous computational infrastructure: Kubernetes, HPC clusters, and cloud. Built to handle the scale of cancer genomics research that single-machine pipelines can't. Published in BMC Bioinformatics.