Skip to content

Webhook Middleware Phase 3: Mutating webhook middleware with JSONPatch#4372

Open
Sanskarzz wants to merge 2 commits intostacklok:mainfrom
Sanskarzz:dynamicwebhook3
Open

Webhook Middleware Phase 3: Mutating webhook middleware with JSONPatch#4372
Sanskarzz wants to merge 2 commits intostacklok:mainfrom
Sanskarzz:dynamicwebhook3

Conversation

@Sanskarzz
Copy link
Copy Markdown
Contributor

@Sanskarzz Sanskarzz commented Mar 26, 2026

Summary

Implements Phase 3 of the Dynamic Webhook Middleware system: Mutating Webhook Middleware, as outlined in RFC THV-0017.

This PR adds support for mutating webhooks that allow external HTTP services to transform incoming MCP requests dynamically using RFC 6902 JSON Patch operations. Mutating webhooks are executed after the MCP parser but before validating webhooks, allowing organizations to cleanly intercept and rewrite requests (e.g., adding default parameters, applying transformations, or injecting audit trails) before validation or authorization logic runs.

Fixes #3398

Changes

1. pkg/webhook/mutating/ Package

Introduces the core logic for mutating webhooks:

  • config.go: Defines configuration parameter types (MiddlewareParams, FactoryMiddlewareParams) mirroring the validating package.
  • patch.go: Implements RFC 6902 JSONPatch parsing and application using github.com/evanphx/json-patch/v5.
    • Includes ValidatePatch() to ensure only supported operations are allowed.
    • Implements Scope Validation (IsPatchScopedToMCPRequest()) to ensure webhooks can only modify the mcp_request container inside the webhook payload, preventing unauthorized modification of the request principal, context, or envelope metadata.
  • middleware.go: The HTTP middleware factory and handler.
    • Skips non-MCP requests gracefully.
    • Executes chained mutating webhooks in configuration order, passing the output of one as the input to the next.
    • Extracts the raw MCP request, evaluates it against the webhook endpoints, and applies the returned JSONPatches.
    • Supports RFC-defined failure policies (fail vs ignore) for connection errors or malformed patches.

2. Runner Wiring (pkg/runner/)

  • config.go: Added MutatingWebhooks []webhook.Config to RunConfig to allow configuring mutating webhooks natively via state persistence or configuration YAML.
  • middleware.go: Registered the mutating webhook middleware factory and inserted it into the middleware execution chain. It executes proactively before the validating webhook middleware, matching the RFC specifications.

3. Dependencies

  • Added github.com/evanphx/json-patch/v5 for RFC 6902-compliant patch application.

4. Addressed Phase 2 leftover review comments

  • Inlined the variables in sendErrorResponse in the validating webhook middleware.
  • Renamed the misleading test in pkg/webhook/validating/middleware_test.go.
  • Extracted duplicate convertToJSONRPC2ID implementations from all 3 middlewares (authz, validating, mutating) to a central location (pkg/mcp/utils.go).

5. Full Middleware Chain Integration Test

  • Added TestWebhookMiddlewareChainIntegration to pkg/runner/webhook_integration_test.go to test the execution flow of both mutating and validating webhooks as initialized by the runner configuration parsing.

6. Test Coverage Improvements:

  • Added new tests checking failure states of patch extraction and application to increase pkg/webhook/mutating/patch.go and pkg/webhook/mutating/middleware.go coverage.
  • Addressed low coverage for PopulateMiddlewareConfigs in pkg/runner/middleware.go by adding TestPopulateMiddlewareConfigs_FullCoverage which spans across all configuration branch setups.

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below)
@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Mar 26, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 26, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 84.65116% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.62%. Comparing base (e3c605f) to head (8694b4f).

Files with missing lines Patch % Lines
pkg/webhook/mutating/middleware.go 82.87% 19 Missing and 6 partials ⚠️
pkg/runner/middleware.go 76.47% 2 Missing and 2 partials ⚠️
pkg/webhook/mutating/patch.go 85.71% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4372      +/-   ##
==========================================
+ Coverage   69.48%   69.62%   +0.14%     
==========================================
  Files         486      490       +4     
  Lines       50017    50199     +182     
==========================================
+ Hits        34753    34953     +200     
+ Misses      12578    12548      -30     
- Partials     2686     2698      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 26, 2026
@Sanskarzz Sanskarzz marked this pull request as ready for review March 26, 2026 21:31
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 26, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 26, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 26, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 27, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 27, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 27, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 27, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 27, 2026
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 28, 2026
@Sanskarzz
Copy link
Copy Markdown
Contributor Author

@JAORMX The PR is Ready for review.

@JAORMX
Copy link
Copy Markdown
Collaborator

JAORMX commented Mar 28, 2026

Amazing! I'll check it out on Monday. Had a day off on Friday, that's why I hadn't given feedback.

Copy link
Copy Markdown
Collaborator

@JAORMX JAORMX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Sanskarzz! Nice work on Phase 3. The overall structure is clean, the test coverage is solid, and the RFC 6902 patch logic looks correct. The middleware chain ordering matches the RFC, the scope validation is a good security measure, and the refactoring of ConvertToJSONRPC2ID into shared utils is a welcome cleanup.

I have a couple of things I think we should address before merging (the 403 vs 500 on denied requests, and the patch envelope containing sensitive data unnecessarily), plus some questions and suggestions for your consideration.

Generated with Claude Code


if !resp.Allowed {
slog.Info("Mutating webhook denied request", "webhook", whName, "reason", resp.Reason)
sendErrorResponse(w, http.StatusInternalServerError, "Request mutation denied by webhook", msgID)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocker: So... when a webhook explicitly says allowed: false, that's a policy decision, not a server error. Right now this returns HTTP 500, but the validating middleware uses 403 for the same scenario (and rightly so). Using 500 here would confuse anyone triaging alerts... "is the webhook broken, or did it deliberately deny the request?"\n\nThe RFC failure modes table doesn't have a specific row for allowed: false on mutating webhooks, but semantically 403 is the right call here.\n\nsuggestion\n\t\tsendErrorResponse(w, http.StatusForbidden, \"Request denied by webhook policy\", msgID)\n\n\nNote that if we change this, the //nolint:unparam on sendErrorResponse becomes justified since statusCode will no longer always be 500.

return nil, fmt.Errorf("patch scope violation")
}

envelopeJSON, err := json.Marshal(whReq)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: This marshals the entire webhook request (including principal with email/subject/groups and context with source IP) into the envelope just so the patch paths like /mcp_request/... resolve correctly.\n\nWhile IsPatchScopedToMCPRequest prevents patches from targeting those fields, the sensitive data is still sitting in the document being patched. If the json-patch library ever surfaces document content in error messages, or if there's a bug in scope validation, the principal data could leak or be modified.\n\nA safer approach... construct a minimal envelope with only what the patches need:\n\ngo\nenvelope := map[string]json.RawMessage{\"mcp_request\": json.RawMessage(currentBody)}\nenvelopeJSON, err := json.Marshal(envelope)\n\n\nThis way, even if scope validation has a bug, there's nothing sensitive to modify. Defense in depth!


resp, err := exec.client.CallMutating(ctx, whReq)
if err != nil {
if exec.config.FailurePolicy == webhook.FailurePolicyIgnore {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Per RFC THV-0017's failure modes table, HTTP 422 from a webhook should always deny regardless of failure policy (Deny (422) for both fail and ignore columns). The client already surfaces the status code in InvalidResponseError (there's even a comment about it in client.go:147), but the middleware treats all errors uniformly via the failure policy.\n\nWith ignore policy, a 422 would currently result in using the original body instead of denying... which the RFC says shouldn't happen.\n\nI know this is also a pre-existing gap in the validating middleware from Phase 2, so it doesn't have to be fixed in this PR. But could you file a follow-up issue tracking the 422 handling for both middleware types?

Comment on lines +182 to +183

if !resp.Allowed {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: One more thing here... when Allowed is false, we always deny regardless of failure policy. That seems intentional (an explicit deny is different from an operational error), and I think it's the right call. But it's worth adding a short comment explaining the design decision, since the RFC failure modes table doesn't address this case for mutating webhooks. Something like:\n\nsuggestion\n\t// Explicit denial from a mutating webhook is always honored, regardless of failure policy.\n\t// This differs from operational errors (network, timeout) where the failure policy applies.\n\tif !resp.Allowed {\n

jsonpatch "github.com/evanphx/json-patch/v5"
)

// patchTypJSONPatch is the patch_type value for RFC 6902 JSON Patch.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Tiny typo in the comment... patchTypJSONPatch is missing the 'e' in Type.\n\nsuggestion\n// patchTypeJSONPatch is the patch_type value for RFC 6902 JSON Patch.\n

// or maliciously modifying principal, context, or other immutable envelope fields.
func IsPatchScopedToMCPRequest(patch []JSONPatchOp) bool {
for _, op := range patch {
if !strings.HasPrefix(op.Path, mcpRequestPathPrefix) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: The prefix check requires /mcp_request/ (with trailing slash). So a patch targeting /mcp_request exactly (no trailing slash, e.g. a replace on the entire object) would be rejected.\n\nIs that intentional? It means a webhook can't do a full replacement of the MCP request body via a single replace op on /mcp_request. I think rejecting it is actually the safer default (forces granular patches), but wanted to confirm that's the intent and not an accidental off-by-one in the prefix.

const MiddlewareType = "mutating-webhook"

// Middleware wraps mutating webhook functionality for the factory pattern.
type Middleware struct {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Both the mutating and validating middlewares have nearly identical Middleware struct, clientExecutor type, CreateMiddleware factory, sendErrorResponse, and readSourceIP. I totally get keeping the packages independent (easier to reason about, no coupling), but I'm curious... was this a deliberate choice?\n\nIf more webhook types come in later phases, we might end up with N copies. Might be worth extracting the common scaffold to pkg/webhook/ at some point. Not asking for it in this PR though... just want to know if there was a design rationale here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

2 participants