fix: bump idna + starlette to patched versions by constk · Pull Request #103 · constk/harness-python-react

constk · 2026-05-25T13:46:34Z

What & why

pip-audit on develop flags two transitive-dep CVEs (surfaced via fastapi / httpx):

| Package | Current | Fixed in | Advisory |
| --- | --- | --- | --- |
| `idna` | 3.13 | 3.15+ | CVE-2026-45409 |
| `starlette` | 1.0.0 | 1.0.1+ | PYSEC-2026-161 |

Bumped via:

uv lock --upgrade-package idna --upgrade-package starlette

Resolves to idna 3.16 (3.15 was the listed fix; 3.16 is a further patch on the same line) and starlette 1.1.0 (minor bump). The currently pinned fastapi 0.136.1 accepts Starlette 1.1.x — confirmed locally by running the unit suite. All 192 unit tests pass on the upgraded lock; pip-audit then returns "No known vulnerabilities found".
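As a rough illustration of the "resolved version is at or above the fixed version" check described above, here is a minimal Python sketch. The helper names (`parse_release`, `is_patched`, `installed_status`) are hypothetical and not part of this repository, and the parser only handles plain dotted releases such as `3.16`; pip-audit remains the authoritative check.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Minimum patched versions, copied from the advisory table in this PR.
MIN_FIXED = {"idna": "3.15", "starlette": "1.0.1"}


def parse_release(text: str) -> Tuple[int, ...]:
    """Parse a plain dotted release such as '3.16' into (3, 16).

    Simplification: pre-release and local version suffixes are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def is_patched(installed: str, minimum: str) -> bool:
    """Return True when the installed release is at or above the minimum."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.1' compares the same as '1.1.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_status() -> Dict[str, Optional[bool]]:
    """Check the current environment; None means the package is not installed."""
    status: Dict[str, Optional[bool]] = {}
    for name, minimum in MIN_FIXED.items():
        try:
            status[name] = is_patched(version(name), minimum)
        except PackageNotFoundError:
            status[name] = None
    return status
```

Calling `installed_status()` inside the synced environment would report each package as patched, unpatched, or absent.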

Also bumps the project self-version 0.2.10 → 0.2.11 per docs/DEVELOPMENT.md.

Why this is its own PR. Code review on #99 (the README opener PR) recommended landing the CVE bumps separately so the dep-bump risk doesn't pile onto a docs-only change. PRs #99, #100, #101, #102/#105 all currently inherit these CVEs from develop and cannot pass pip-audit until this PR lands — so this is the unblocking commit for the whole release-blockers stack.

Test plan

uv lock --upgrade-package idna --upgrade-package starlette regenerates the lock cleanly
uv sync --frozen --extra dev succeeds with the new lock
uv run pytest tests/ -q → 192 passed (fastapi 0.136.1 + starlette 1.1.0 compatible)
uv run pip-audit → "No known vulnerabilities found"
CI pip-audit gate passes
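The local steps of the test plan above could be chained so that the first failure stops the run. This is only a sketch: `run_plan` and its injectable `runner` parameter are hypothetical helpers, not something this repository ships, and the commands are copied from the list above.

```python
import subprocess
from typing import Callable, List, Sequence

# Local steps from the test plan, in the order they are listed.
STEPS: List[List[str]] = [
    ["uv", "lock", "--upgrade-package", "idna", "--upgrade-package", "starlette"],
    ["uv", "sync", "--frozen", "--extra", "dev"],
    ["uv", "run", "pytest", "tests/", "-q"],
    ["uv", "run", "pip-audit"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_plan(
    steps: Sequence[Sequence[str]] = STEPS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> List[str]:
    """Run each step in order and stop at the first non-zero exit code.

    Returns the commands that completed successfully.
    """
    completed: List[str] = []
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            raise RuntimeError(f"step failed with exit code {code}: {' '.join(cmd)}")
        completed.append(" ".join(cmd))
    return completed
```

Calling `run_plan()` with the default runner executes the real commands; passing a fake `runner` keeps the function testable without `uv` installed.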

Invariants affected

None.

New deps / actions / external surface

No new direct deps; two transitive deps bumped. starlette jumps minor (1.0.0 → 1.1.0) — verified compatible with the pinned fastapi 0.136.1.

Linked issue

None — surfaced by the #99 code review.

pip-audit on develop is flagging two transitive-dep CVEs:

- idna 3.13 CVE-2026-45409 (fix in 3.15+)
- starlette 1.0.0 PYSEC-2026-161 (fix in 1.0.1+)

Both are surfaced via fastapi/httpx. Bumps via:

uv lock --upgrade-package idna --upgrade-package starlette

Resolves to idna 3.16 (3.15 was the listed fix; 3.16 is a further patch with the same fix) and starlette 1.1.0 (minor bump; FastAPI is compatible with it). All 192 unit tests pass on the upgraded lock.

Bumps the project self-version 0.2.10 -> 0.2.11 per docs/DEVELOPMENT.md.

Unblocks the pip-audit CI gate on #99, #100, #101, #102 (and any other PRs currently sitting on develop), all of which inherit the flagged transitive CVEs from develop and cannot pass that gate until this lands.

* feat: eval pattern examples calling Azure OpenAI (#94)

The eval slice previously shipped one toy case (echo-hello) and a disabled-by-default nightly. A reader expecting an LLM-eval story found the infrastructure without conviction.

Adds four worked-pattern cases that exercise the existing three tolerance modes against a real Azure OpenAI deployment. These are not benchmarks; they demonstrate what an eval case *looks like* for the four LLM-eval patterns you most often need to write:

- factual-http-200               exact_match        format-constrained recall
- numeric-seconds-per-day        numeric_close      numeric reasoning + tolerance
- definitional-fastapi-depends   semantic_similar   free-form judge-scored prose
- structured-json-status         exact_match        structured-output adherence

When the template is forked for a real project, replace these four with cases that exercise the project's own prompts; the patterns transfer regardless of what product is bolted on.

Provider choice (Azure OpenAI via the openai SDK with AzureOpenAI client) is intentionally distinct from the rest of the harness (which uses Claude via Claude Code). Demonstrates that the LLMClient Protocol in src/eval/judge.py does its job: the eval core never imports openai, vendor lock-in lives only in the adapter.

Changes:

- src/eval/adapters/azure_openai.py: implements LLMClient via the openai.AzureOpenAI SDK. Reads endpoint/key/deployment/api-version from env. Lazy-imports the SDK so the module is importable without the optional extra installed; the adapter raises a clear AzureOpenAIConfigError if the env or SDK is missing.
- eval/golden_patterns.json: the four cases with notes explaining which pattern each demonstrates.
- eval/test_golden_patterns.py: separate test file gated on the Azure env vars via pytestmark. Skipped on a stock checkout, so `uv run pytest eval/` always exits 0. The toy test_golden_qa.py keeps running as before.
- pyproject.toml: new optional [project.optional-dependencies] eval extra (just `openai>=1.40.0`), mypy override for openai.* matching the existing opentelemetry.* pattern, and a 0.2.10 -> 0.2.11 self-version bump.
- .github/workflows/eval-nightly.yml: env vars renamed from the placeholder LLM_* set to AZURE_OPENAI_*. Header comment updated with the Azure setup recipe. uv sync now passes --extra eval.
- docs/EVAL_HARNESS.md: new "Worked patterns" section with the table mapping case -> tolerance -> pattern, the local setup recipe, and a "Swapping providers" note documenting the Protocol-based extension path.

Local gates: mypy --strict clean on 42 source files (was 31), ruff clean, ruff format clean, import-linter both contracts kept, 192 unit tests pass, eval/ runs 1 passed + 4 skipped without LLM env.

Closes #94

* test: add adapter unit tests + adapters README (#94 review fixes)

Addresses two gate failures on #104 surfaced by code review:

1. "Tests required" gate: the feat: prefix declared a behaviour change but tests/ had no test for the new adapter (the eval/-side test only runs with live Azure credentials). Adds tests/test_eval_azure_openai_adapter.py: 13 fully-offline cases covering _resolve_config (defaults, override, empty-string fallback, missing-env error listing), the constructor (env wiring, explicit API version, missing-env, missing-SDK), and the two SDK call paths (complete_json structured-output mode, complete user-message dispatch, null-content returns "" / "{}"). The SDK is mocked at sys.modules level so the test never hits the network and never requires the openai extra to be installed.

2. "src/ README audit" gate: every src/ package needs a README.md per CLAUDE.md. Adds src/eval/adapters/README.md documenting the layer's purpose, the current adapter, a 7-step "adding a new adapter" recipe, and why the layer lives at the top of the import order.

Also applies the reviewer's non-blocking sentinel-string suggestion: the magic "azure-deployment" string passed as judge_model in eval/test_golden_patterns.py is now the named constant _AZURE_DEPLOYMENT_SENTINEL with a comment explaining why the runner threads it through but the Azure adapter discards it.

Local gates: 205 unit tests pass (was 192, +13 new), mypy clean on 43 source files, ruff/format/import-linter all green.

Refs #94

* docs: add Key interfaces section to adapters README (#94 review)

The src/ README audit gate looks for a `## Key interfaces` (or `## Public surface`) anchor. The existing README had purpose / table / extension recipe / layering rationale, but no exported-names section.

Adds a `## Key interfaces` section listing the two exported names:

- AzureOpenAIClient: the LLMClient implementation with notes on complete() vs complete_json() and the discarded `model` arg (Azure dispatches by deployment, not model).
- AzureOpenAIConfigError: the construction-time error type, noting that it batches every missing env var into a single message instead of failing-and-retrying.

Both already documented in the adapter docstrings; this section hoists them to the README anchor the audit gate enforces.

Refs #94

* chore: bump version to 0.2.12 (rebase onto develop after #103)
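The commit message above says the adapter reads its settings from the environment and that AzureOpenAIConfigError batches every missing variable into one message. A minimal sketch of that idea follows. It is not the repository's `_resolve_config`: the exact variable names (only the AZURE_OPENAI_* prefix is stated), the default API version, the `AzureConfig` dataclass, and the choice to treat empty strings as unset are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed names: the source only states the AZURE_OPENAI_* prefix.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT",
)
API_VERSION_VAR = "AZURE_OPENAI_API_VERSION"
DEFAULT_API_VERSION = "2024-06-01"  # illustrative placeholder, not the repo's default


class ConfigError(RuntimeError):
    """Hypothetical stand-in for AzureOpenAIConfigError."""


@dataclass(frozen=True)
class AzureConfig:
    endpoint: str
    api_key: str
    deployment: str
    api_version: str


def resolve_config(
    env: Optional[Mapping[str, str]] = None,
    api_version: Optional[str] = None,
) -> AzureConfig:
    """Collect settings from env, reporting all missing variables at once."""
    env = os.environ if env is None else env
    # Assumption: an empty string counts as unset.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # One error listing every gap, so the user fixes them in one pass.
        raise ConfigError("missing environment variables: " + ", ".join(missing))
    return AzureConfig(
        endpoint=env["AZURE_OPENAI_ENDPOINT"],
        api_key=env["AZURE_OPENAI_API_KEY"],
        deployment=env["AZURE_OPENAI_DEPLOYMENT"],
        # Explicit argument wins, then the env value, then the default.
        api_version=api_version or env.get(API_VERSION_VAR) or DEFAULT_API_VERSION,
    )
```

Passing the environment as a mapping keeps the function testable offline, in the same spirit as the fully-offline adapter tests described above.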

…sed post-#103/#104)

main moved ahead of develop on 2026-05-25 when PR #86 was merged directly to main rather than via develop -> release flow. The divergence is one squash commit (eff5b1c) carrying:

- docs/BEADS.md (optional Beads issue-queue guidance)
- .github/pull_request_template.md (Beads PR-template block)
- .github/scripts/check_aspirational_tickets.py (PEP 758 reformat)
- .github/scripts/check_pin_freshness.py / check_tests_present.py / check_version_bump.py (touch-ups)
- .gitattributes / .gitignore (.beads/ ignore, Windows renormalise)
- CONTRIBUTING.md (line-ending normalisation)
- tests/test_scripts_compile.py (new CI-script compile gate)
- docs/DEVELOPMENT.md / docs/HARNESS.md / docs/HARNESS_PRIMER.md cross-refs
- pyproject.toml + uv.lock self-version 0.2.10 -> 0.2.11

This PR was rebased after #103 (CVE fix, develop -> 0.2.11) and #104 (eval pattern examples, develop -> 0.2.12) merged. The version on main (0.2.11) is now behind develop (0.2.12); the conflict is resolved by bumping develop -> 0.2.13.

After this lands, develop is at 0.2.13 and contains everything main has. Remaining in-flight PRs (#99, #100, #101, #105) need to rebase to bump 0.2.13 -> 0.2.14 (and onward sequentially as they merge).

No behaviour change beyond what #86 already added to main.

# Conflicts:
#   pyproject.toml
#   uv.lock
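The sequential bump rule described above (each in-flight PR takes the next patch version as it merges) can be sketched as follows. `bump_patch` and `assign_versions` are hypothetical helpers, and the merge order used in the usage note is an assumption, since the source does not state the order in which the remaining PRs land.

```python
from typing import Dict, Sequence


def bump_patch(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def assign_versions(base: str, merge_order: Sequence[str]) -> Dict[str, str]:
    """Give each PR the next patch version, in the order the PRs merge."""
    assigned: Dict[str, str] = {}
    current = base
    for pr in merge_order:
        current = bump_patch(current)
        assigned[pr] = current
    return assigned
```

For example, `assign_versions("0.2.13", ["#99", "#100", "#101", "#105"])` gives the first PR to merge 0.2.14 and the fourth 0.2.17, under the assumed order.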

constk mentioned this pull request May 25, 2026

chore: align develop with main — backport #86 content + version #106

Merged

5 tasks

constk merged commit d256e32 into develop May 26, 2026
22 checks passed

constk added a commit that referenced this pull request May 26, 2026

chore: bump version to 0.2.12 (rebase onto develop after #103)

1a32080

This was referenced May 26, 2026

release: bring main up to develop (0.2.17 — release-readiness docs + eval pattern examples + transitive CVE patches) #107

Closed

release: bring main up to develop (0.2.17 — release-readiness docs + eval pattern examples + transitive CVE patches) #108

Merged

constk merged 1 commit into develop from fix/cve-bumps-idna-starlette
