
fix: bump idna + starlette to patched versions#103

Merged: constk merged 1 commit into develop from fix/cve-bumps-idna-starlette on May 26, 2026

constk (Owner) commented May 25, 2026

What & why

pip-audit on develop flags two transitive-dep CVEs (surfaced via fastapi / httpx):

Package    Current  Fixed in  Advisory
idna       3.13     3.15+     CVE-2026-45409
starlette  1.0.0    1.0.1+    PYSEC-2026-161

Bumped via:

uv lock --upgrade-package idna --upgrade-package starlette

Resolves to idna 3.16 (3.15 was the listed fix; 3.16 is a further patch on the same line) and starlette 1.1.0 (minor bump). The currently pinned fastapi 0.136.1 accepts Starlette 1.1.x — confirmed locally by running the unit suite. All 192 unit tests pass on the upgraded lock; pip-audit then returns "No known vulnerabilities found".
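
One way to double-check the resolution is to compare each resolved version against its advisory floor. The sketch below is illustrative only: `parse_release` and `is_patched` are hypothetical helper names, and the parser handles plain dotted release numbers rather than the full PEP 440 grammar (a real check would more likely rely on `packaging.version`).

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as "1.0.1" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_patched(resolved: str, fixed_in: str) -> bool:
    """True when the resolved version is at or above the first fixed release."""
    a, b = parse_release(resolved), parse_release(fixed_in)
    # Pad the shorter tuple with zeros so "3.15" compares equal to "3.15.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# (package, version before the bump, version after the bump, first fixed release)
BUMPS = [
    ("idna", "3.13", "3.16", "3.15"),
    ("starlette", "1.0.0", "1.1.0", "1.0.1"),
]

for name, before, after, fixed_in in BUMPS:
    print(f"{name}: {before} patched={is_patched(before, fixed_in)}, "
          f"{after} patched={is_patched(after, fixed_in)}")
```

This only confirms the version arithmetic; pip-audit remains the authority on whether an advisory still applies.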

Also bumps the project self-version 0.2.10 → 0.2.11 per docs/DEVELOPMENT.md.

Why this is its own PR. Code review on #99 (the README opener PR) recommended landing the CVE bumps separately so the dep-bump risk doesn't pile onto a docs-only change. PRs #99, #100, #101, #102/#105 all currently inherit these CVEs from develop and cannot pass pip-audit until this PR lands — so this is the unblocking commit for the whole release-blockers stack.

Test plan

  • uv lock --upgrade-package idna --upgrade-package starlette regenerates the lock cleanly
  • uv sync --frozen --extra dev succeeds with the new lock
  • uv run pytest tests/ -q → 192 passed (fastapi 0.136.1 + starlette 1.1.0 compatible)
  • uv run pip-audit → "No known vulnerabilities found"
  • CI pip-audit gate passes

Invariants affected

None.

New deps / actions / external surface

No new direct deps; two transitive deps bumped. starlette jumps minor (1.0.0 → 1.1.0) — verified compatible with the pinned fastapi 0.136.1.

Linked issue

None — surfaced by the #99 code review.

pip-audit on develop is flagging two transitive-dep CVEs:

- idna 3.13            CVE-2026-45409   (fix in 3.15+)
- starlette 1.0.0      PYSEC-2026-161   (fix in 1.0.1+)

Both are surfaced via fastapi/httpx. Bumps via:

    uv lock --upgrade-package idna --upgrade-package starlette

Resolves to idna 3.16 (3.15 was the listed fix; 3.16 is a further
patch with the same fix) and starlette 1.1.0 (minor bump; FastAPI is
compatible with it). All 192 unit tests pass on the upgraded lock.

Bumps the project self-version 0.2.10 -> 0.2.11 per
docs/DEVELOPMENT.md.

Unblocks the pip-audit CI gate on #99, #100, #101, #102 (and any
other PRs currently sitting on develop), all of which inherit the
flagged transitive CVEs from develop and cannot pass that gate until
this lands.
constk merged commit d256e32 into develop on May 26, 2026
22 checks passed
constk added a commit that referenced this pull request May 26, 2026
* feat: eval pattern examples calling Azure OpenAI (#94)

The eval slice previously shipped one toy case (echo-hello) and a
disabled-by-default nightly. A reader expecting an LLM-eval story
found the infrastructure but no convincing example of it in use.

Adds four worked-pattern cases that exercise the existing three
tolerance modes against a real Azure OpenAI deployment. These are
not benchmarks — they demonstrate what an eval case *looks like* for
the four LLM-eval patterns you most often need to write:

  - factual-http-200             exact_match       format-constrained recall
  - numeric-seconds-per-day      numeric_close     numeric reasoning + tolerance
  - definitional-fastapi-depends semantic_similar  free-form judge-scored prose
  - structured-json-status       exact_match       structured-output adherence

When the template is forked for a real project, replace these four
with cases that exercise the project's own prompts; the patterns
transfer regardless of what product is bolted on.
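
To make the three tolerance modes concrete, here is a minimal sketch of how a checker might dispatch on them. The function names, the `rel_tol` default, the similarity threshold, and the `scorer` callable are all assumptions for illustration; the actual implementation is not shown in this PR and may differ.

```python
import math
from typing import Callable, Optional

Scorer = Callable[[str, str], float]


def exact_match(expected: str, actual: str) -> bool:
    # Format-constrained answers: compare after trimming surrounding whitespace.
    return expected.strip() == actual.strip()


def numeric_close(expected: str, actual: str, rel_tol: float = 1e-6) -> bool:
    # Numeric answers: parse both sides and allow a small relative error.
    try:
        return math.isclose(float(expected), float(actual), rel_tol=rel_tol)
    except ValueError:
        return False  # one side is not a parseable number


def semantic_similar(expected: str, actual: str, scorer: Scorer,
                     threshold: float = 0.8) -> bool:
    # Free-form prose: delegate to a judge that returns a score in [0, 1].
    return scorer(expected, actual) >= threshold


def check(mode: str, expected: str, actual: str,
          scorer: Optional[Scorer] = None) -> bool:
    """Dispatch on the tolerance mode named in an eval case."""
    if mode == "exact_match":
        return exact_match(expected, actual)
    if mode == "numeric_close":
        return numeric_close(expected, actual)
    if mode == "semantic_similar":
        if scorer is None:
            raise ValueError("semantic_similar needs a scorer")
        return semantic_similar(expected, actual, scorer)
    raise ValueError(f"unknown tolerance mode: {mode}")
```

For example, `check("numeric_close", "86400", "86400.0")` accepts a seconds-per-day answer that differs only in formatting, which `exact_match` would reject.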

Provider choice — Azure OpenAI via the openai SDK with AzureOpenAI
client — is intentionally distinct from the rest of the harness
(which uses Claude via Claude Code). Demonstrates that the LLMClient
Protocol in src/eval/judge.py does its job: the eval core never
imports openai, vendor lock-in lives only in the adapter.
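
The decoupling described here relies on structural typing. As a rough sketch, assuming the Protocol exposes the `complete` and `complete_json` methods named later in this thread (the exact signatures, the `judge` function, and `FakeClient` below are hypothetical):

```python
import json
from typing import Protocol


class LLMClient(Protocol):
    """Assumed shape of the Protocol; the real one lives in src/eval/judge.py."""

    def complete(self, prompt: str, model: str) -> str: ...

    def complete_json(self, prompt: str, model: str) -> str: ...


def judge(client: LLMClient, question: str, answer: str, model: str) -> float:
    """Hypothetical judge: asks any LLMClient for a JSON score in [0, 1]."""
    raw = client.complete_json(
        f"Score this answer from 0 to 1.\nQ: {question}\nA: {answer}", model
    )
    return float(json.loads(raw).get("score", 0.0))


class FakeClient:
    """Satisfies LLMClient structurally: no inheritance, no vendor SDK import."""

    def complete(self, prompt: str, model: str) -> str:
        return "ok"

    def complete_json(self, prompt: str, model: str) -> str:
        return '{"score": 0.75}'


print(judge(FakeClient(), "What does HTTP 200 mean?", "OK", model="unused"))
```

Because `judge` depends only on the Protocol, an adapter for another provider can be swapped in without touching the eval core.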

Changes:

  - src/eval/adapters/azure_openai.py — implements LLMClient via the
    openai.AzureOpenAI SDK. Reads endpoint/key/deployment/api-version
    from env. Lazy-imports the SDK so the module is importable without
    the optional extra installed; the adapter raises a clear
    AzureOpenAIConfigError if the env or SDK is missing.

  - eval/golden_patterns.json — the four cases with notes explaining
    which pattern each demonstrates.

  - eval/test_golden_patterns.py — separate test file gated on the
    Azure env vars via pytestmark. Skipped on a stock checkout, so
    `uv run pytest eval/` always exits 0. The toy test_golden_qa.py
    keeps running as before.

  - pyproject.toml — new optional [project.optional-dependencies] eval
    extra (just `openai>=1.40.0`), mypy override for openai.* matching
    the existing opentelemetry.* pattern, and a 0.2.10 -> 0.2.11
    self-version bump.

  - .github/workflows/eval-nightly.yml — env vars renamed from the
    placeholder LLM_* set to AZURE_OPENAI_*. Header comment updated
    with the Azure setup recipe. uv sync now passes --extra eval.

  - docs/EVAL_HARNESS.md — new "Worked patterns" section with the
    table mapping case -> tolerance -> pattern, the local setup
    recipe, and a "Swapping providers" note documenting the
    Protocol-based extension path.

Local gates: mypy --strict clean on 42 source files (was 31), ruff
clean, ruff format clean, import-linter both contracts kept, 192
unit tests pass, eval/ runs 1 passed + 4 skipped without LLM env.

Closes #94

* test: add adapter unit tests + adapters README (#94 review fixes)

Addresses two gate failures on #104 surfaced by code review:

1. "Tests required" gate — feat: prefix declared a behaviour change
   but tests/ had no test for the new adapter (the eval/-side test
   only runs with live Azure credentials). Adds
   tests/test_eval_azure_openai_adapter.py: 13 fully-offline cases
   covering _resolve_config (defaults, override, empty-string
   fallback, missing-env error listing), the constructor (env
   wiring, explicit API version, missing-env, missing-SDK), and the
   two SDK call paths (complete_json structured-output mode,
   complete user-message dispatch, null-content returns "" / "{}").

   The SDK is mocked at sys.modules level so the test never hits the
   network and never requires the openai extra to be installed.

2. "src/ README audit" gate — every src/ package needs a README.md
   per CLAUDE.md. Adds src/eval/adapters/README.md documenting the
   layer's purpose, the current adapter, a 7-step "adding a new
   adapter" recipe, and why the layer lives at the top of the import
   order.
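
Mocking an SDK at `sys.modules` level, as item 1 above describes, can be sketched like this. `build_client` is a hypothetical stand-in for an adapter constructor that lazy-imports its SDK, not the code from this PR; the two tests show the fake-SDK path and the missing-SDK path.

```python
import sys
import types
from unittest import mock


def build_client():
    """Stand-in for an adapter constructor that lazy-imports its SDK."""
    try:
        from openai import AzureOpenAI  # resolved via sys.modules at call time
    except ImportError as exc:
        raise RuntimeError("the openai extra is not installed") from exc
    return AzureOpenAI()


def test_uses_fake_sdk():
    fake_sdk = types.ModuleType("openai")
    fake_sdk.AzureOpenAI = mock.MagicMock(name="AzureOpenAI")
    # Installing the fake under the real module name intercepts the import.
    with mock.patch.dict(sys.modules, {"openai": fake_sdk}):
        client = build_client()
    fake_sdk.AzureOpenAI.assert_called_once_with()
    assert client is fake_sdk.AzureOpenAI.return_value


def test_missing_sdk():
    # A None entry in sys.modules makes the import raise ImportError.
    with mock.patch.dict(sys.modules, {"openai": None}):
        try:
            build_client()
        except RuntimeError as exc:
            assert "not installed" in str(exc)
        else:
            raise AssertionError("expected RuntimeError")
```

`mock.patch.dict` restores `sys.modules` on exit, so neither test leaks the fake into the rest of the suite.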

Also applies the reviewer's non-blocking sentinel-string suggestion:
the magic "azure-deployment" string passed as judge_model in
eval/test_golden_patterns.py is now the named constant
_AZURE_DEPLOYMENT_SENTINEL with a comment explaining why the runner
threads it through but the Azure adapter discards it.

Local gates: 205 unit tests pass (was 192, +13 new), mypy clean on
43 source files, ruff/format/import-linter all green.

Refs #94

* docs: add Key interfaces section to adapters README (#94 review)

src/ README audit gate looks for a `## Key interfaces` (or `## Public
surface`) anchor — the existing README had purpose / table /
extension recipe / layering rationale, but no exported-names section.

Adds a `## Key interfaces` section listing the two exported names:

  - AzureOpenAIClient — the LLMClient implementation with notes on
    complete() vs complete_json() and the discarded `model` arg
    (Azure dispatches by deployment, not model).
  - AzureOpenAIConfigError — the construction-time error type,
    noting that it batches every missing env var into a single
    message instead of failing on the first one and forcing a
    fix-and-retry loop.

Both already documented in the adapter docstrings; this section
hoists them to the README anchor the audit gate enforces.

Refs #94

* chore: bump version to 0.2.12 (rebase onto develop after #103)
constk added a commit that referenced this pull request May 26, 2026
…sed post-#103/#104)

main moved ahead of develop on 2026-05-25 when PR #86 was merged
directly to main rather than via the develop -> release flow. The
divergence is one squash commit (eff5b1c) carrying:

  - docs/BEADS.md (optional Beads issue-queue guidance)
  - .github/pull_request_template.md (Beads PR-template block)
  - .github/scripts/check_aspirational_tickets.py (PEP 758 reformat)
  - .github/scripts/check_pin_freshness.py / check_tests_present.py /
    check_version_bump.py (touch-ups)
  - .gitattributes / .gitignore (.beads/ ignore, Windows renormalise)
  - CONTRIBUTING.md (line-ending normalisation)
  - tests/test_scripts_compile.py (new CI-script compile gate)
  - docs/DEVELOPMENT.md / docs/HARNESS.md / docs/HARNESS_PRIMER.md
    cross-refs
  - pyproject.toml + uv.lock self-version 0.2.10 -> 0.2.11

This PR was rebased after #103 (CVE fix, develop -> 0.2.11) and
#104 (eval pattern examples, develop -> 0.2.12) merged. The version
on main (0.2.11) is now behind develop (0.2.12); the conflict is
resolved by bumping develop -> 0.2.13.

After this lands, develop is at 0.2.13 and contains everything main
has. Remaining in-flight PRs (#99, #100, #101, #105) need to rebase
to bump 0.2.13 -> 0.2.14 (and onward sequentially as they merge).
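
The sequential bumping described above amounts to incrementing the patch component once per merge. A small sketch, with `bump_patch` and `plan_bumps` as hypothetical helpers and the merge order assumed purely for illustration:

```python
def bump_patch(version: str) -> str:
    """Increment the last component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def plan_bumps(current: str, prs: list[str]) -> dict[str, str]:
    """Assign each PR the next patch version, in the order it is expected to merge."""
    plan = {}
    for pr in prs:
        current = bump_patch(current)
        plan[pr] = current
    return plan


# Merge order is an assumption; whichever PR lands first takes 0.2.14.
for pr, version in plan_bumps("0.2.13", ["#99", "#100", "#101", "#105"]).items():
    print(pr, "->", version)
```

If the PRs merge in a different order, each one simply rebases and takes the next free patch number.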

No behaviour change beyond what #86 already added to main.

# Conflicts:
#	pyproject.toml
#	uv.lock
