An ambiguity reduction engine for AI-driven development. Spec Machine transforms human intent (high entropy) into formalized specs (low entropy), so that a code agent needs as few iterations as possible before the tests pass, ideally just one.
Every extra iteration between "what I want" and "tests pass" is unresolved entropy. Spec Machine measures and reduces that entropy by detecting ambiguity in specifications before code generation begins.
Arcasidian → infrastructure to run specialized SLMs
Spec Machine → ambiguity reduction engine (this project)
Arca School → educational platform teaching the methodology
Spec Machine classifies ambiguity into four categories:
| Type | Description |
|---|---|
| `underspecified` | Missing information needed to implement |
| `contradictory` | Two parts of the spec contradict each other |
| `missing_context` | Depends on knowledge not present in the spec |
| `vague_constraint` | Requirement exists but is not measurable/testable |
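
The four categories map naturally onto a closed enum. The following is a minimal sketch using only the standard library; it is not the actual `AmbiguityLabel` definition in `crates/schema`, which uses serde and may be shaped differently. The `ALL` constant, the `as_str` method, and the error message are assumptions made for illustration.

```rust
use std::fmt;
use std::str::FromStr;

/// The four ambiguity categories from the table above.
/// Hypothetical sketch: the real type in `crates/schema` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum AmbiguityLabel {
    Underspecified,
    Contradictory,
    MissingContext,
    VagueConstraint,
}

impl AmbiguityLabel {
    /// Every variant, in table order (assumed helper for lookups).
    pub const ALL: [AmbiguityLabel; 4] = [
        AmbiguityLabel::Underspecified,
        AmbiguityLabel::Contradictory,
        AmbiguityLabel::MissingContext,
        AmbiguityLabel::VagueConstraint,
    ];

    /// The snake_case name used in the table.
    pub fn as_str(self) -> &'static str {
        match self {
            AmbiguityLabel::Underspecified => "underspecified",
            AmbiguityLabel::Contradictory => "contradictory",
            AmbiguityLabel::MissingContext => "missing_context",
            AmbiguityLabel::VagueConstraint => "vague_constraint",
        }
    }
}

impl fmt::Display for AmbiguityLabel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for AmbiguityLabel {
    type Err = String;

    /// Parses the snake_case name; any other string is rejected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Self::ALL
            .iter()
            .copied()
            .find(|label| label.as_str() == s)
            .ok_or_else(|| format!("unknown ambiguity label: {s}"))
    }
}
```

With this sketch, `"vague_constraint".parse::<AmbiguityLabel>()` yields `Ok(AmbiguityLabel::VagueConstraint)`, and an unrecognized string such as `"unclear"` yields an error instead of silently falling into a default category.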
spec_quality = 1 / iterations_until_tests_pass
- Perfect spec: score = 1 (tests pass on the first iteration)
- Each extra iteration represents entropy that was not removed
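
The metric above can be sketched as a small function. This is an illustration under stated assumptions, not the project's implementation: the function name mirrors the formula, and returning `None` for zero iterations is a choice made here because the formula is undefined in that case.

```rust
/// spec_quality = 1 / iterations_until_tests_pass
///
/// Hypothetical helper: returns `None` when the iteration count is zero,
/// since the formula is undefined there (an assumption of this sketch).
pub fn spec_quality(iterations_until_tests_pass: u32) -> Option<f64> {
    if iterations_until_tests_pass == 0 {
        None
    } else {
        Some(1.0 / f64::from(iterations_until_tests_pass))
    }
}
```

For example, `spec_quality(1)` is `Some(1.0)` (a perfect spec), while a spec that needed four iterations scores `Some(0.25)`.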
- Language: Rust (edition 2021)
- CLI: clap
- Schema: serde + serde_json
- Storage: JSONL (one JSON per line)
- Baseline evaluation: Python 3.11+
# Build
cargo build --workspace
# Run tests
cargo test --workspace
# Lint
cargo clippy --workspace -- -D warnings
# Format
cargo fmt --all
# Annotate a spec
cargo run -p spec-machine-cli -- annotate <spec-file>
# Validate dataset
cargo run -p spec-machine-cli -- validate data/dataset/
# Export statistics
cargo run -p spec-machine-cli -- stats data/dataset/
spec-machine/
├── crates/
│ ├── schema/ # Rust types: SpecEntry, AmbiguityLabel, AnnotationSchema
│ └── cli/ # CLI for annotation, validation, and export
├── data/
│ ├── examples/ # Annotated spec examples
│ └── dataset/ # Real dataset (versioned via releases)
├── scripts/
│ ├── baseline_eval.py # Few-shot evaluation with GPT-4/Claude
│ └── stats.py # Dataset statistics
└── docs/ # Architecture Decision Records
Phase 1.0 — Dogfood: Defining the annotation schema, building the CLI, and manually annotating 20-30 real specs from Arcasidian.
MIT