Skip to content

fix: return get_files patches as raw text to avoid double-encoding#2666

Open
rodboev wants to merge 2 commits into
github:mainfrom
rodboev:pr/get-files-raw-patches
Open

fix: return get_files patches as raw text to avoid double-encoding#2666
rodboev wants to merge 2 commits into
github:mainfrom
rodboev:pr/get-files-raw-patches

Conversation

@rodboev

@rodboev rodboev commented Jun 10, 2026

Copy link
Copy Markdown

Summary

Fixes get_files, which double-encodes per-file patch content by marshaling the entire file array to JSON (escaping newlines) and then wrapping that JSON string in a TextContent field that the MCP transport JSON-encodes again, roughly doubling escape overhead on patch content and pushing moderate PRs past token limits.

Why

GetPullRequestFiles passes the []MinimalPRFile slice (including each file's Patch string) to MarshalledTextResult, which calls json.Marshal to produce a JSON array, then wraps it in NewToolResultText. On the MCP wire, TextContent.Text is JSON-encoded a second time, so every \n in every patch becomes \\n (4 bytes instead of 1). For a ~73 KB diff, this inflates the payload to 125 KB+, exceeding tool response token limits on PRs that are not unusually large.

get_diff is not affected: it already returns raw diff text via NewToolResultText(string(raw)), which gets only the single unavoidable MCP wire encoding.

Fixes #2242

What changed

  • pkg/github/pullrequests.go: GetPullRequestFiles now returns a multi-content CallToolResult. Content[0] is a JSON array of file metadata (filename, status, additions, deletions, changes, previous_filename) without patches. Content[1..N] are raw TextContent blocks, one per file with a non-empty patch, containing the filename as a header followed by the patch text. Patches are no longer embedded inside a JSON string value, eliminating the double-encoding.
  • pkg/github/pullrequests_test.go: Test_GetPullRequestFiles verifies the multi-content response structure: metadata JSON in Content[0], raw patch text blocks in Content[1..N], and no patch field in the metadata JSON.

MCP impact

  • No tool or API changes
  • Tool schema or behavior changed
    The tool's response format changes from a single JSON text block with embedded patch strings to multiple text content blocks: metadata JSON plus raw patch text. Tool parameters, name, description, and annotations are unchanged, so toolsnaps and generated docs are unaffected.
  • New tool added

Prompts tested (tool changes only)

Security / limits

  • No security or limits impact
  • Auth / permissions considered
    The tool continues to use the existing pull request file API path and repo permission requirements; no new scope or permission check is added.
  • Data exposure, filtering, or token/size limits considered
    This returns the same file metadata and patch content in separate content blocks, reducing encoding overhead without adding fields or API calls.

Tool renaming

  • I am renaming tools as part of this PR (e.g. a part of a consolidation effort)
    • I have added the new tool aliases in deprecated_tool_aliases.go
  • I am not renaming tools as part of this PR

Lint & tests

  • Linted locally with ./script/lint
    Not run locally.

  • Tested locally with ./script/test
    Not run locally. The full race suite was not run locally.

  • go test -race ./pkg/github -run Test_GetPullRequestFiles - covers: multi-content response structure, metadata-only JSON without patches, raw patch text blocks with filename headers, pagination, and error cases.

Docs

  • Not needed
    Tool parameters and schema are unchanged. The response format change is behavioral, not schema-level. Toolsnaps are unaffected.
@rodboev rodboev requested a review from a team as a code owner June 10, 2026 20:31
@rodboev rodboev force-pushed the pr/get-files-raw-patches branch from 439fd3a to 1503978 Compare June 11, 2026 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant