New Frontier Red Team blog: Phase 2 of Project Fetch, where we test how well Claude can program a robodog.
Opus 4.7, on its own, was ~20x faster than last year's best human team aided by Opus 4.1. (The robodog, alas, still failed to fetch a beach ball.)
