🤖 fix: harden empty-output and task report recovery #2986

Merged
ammario merged 5 commits into main from fix/empty-output-recovery
Mar 17, 2026

@ammar-agent (Collaborator) commented Mar 16, 2026

Summary

  • Harden empty-output stream recovery so chats retry seamlessly when the provider drops a stream before any assistant-visible output arrives.
  • Keep plan-mode subagents on explicit propose_plan, but let ordinary subagents finalize from a clean final assistant response when the provider reports a natural stop finish reason.
  • Keep awaiting_report / report-only recovery for incomplete child turns, and interrupt that recovery cleanly on non-retryable errors.

Background

  • We reproduced two related failure modes: provider/model runs that ended without any assistant-visible output, and subagent runs that did substantial work but timed out before calling agent_report.
  • The old task recovery flow could convert those incomplete subagent runs into misleading fallback reports, making it look like the task had completed even though no real completion signal had arrived.
  • We also found that explicit agent_report prompts were adding avoidable ceremony for ordinary successful subagents whose final assistant turn already contained the full report text.

Implementation

  • StreamManager
    • Retry one empty-output completion inline before surfacing an error.
    • Preserve accumulated usage/accounting when that inline retry is taken.
    • Capture the provider/model finish reason on stream-end and classify no-output failures as a dedicated empty_output stream error.
  • HistoryService
    • Delete empty assistant placeholders when an errored partial contains no commit-worthy content, so retries do not feed back into hidden [CONTINUE] loops.
  • TaskService
    • Keep propose_plan explicit for plan-like subagents.
    • For ordinary subagents, treat a clean final assistant response as an implicit report only when the stream ends with finishReason === "stop", then route it through the same report artifact / parent-delivery / cleanup pipeline used by explicit agent_report.
    • Preserve report-only recovery for turns that still end without usable final text, and interrupt that recovery instead of looping forever on non-retryable stream errors.
    • Continue using the shared promptTaskForRequiredCompletionTool() helper for startup recovery, waiter nudges, stream-end recovery, and retryable stream-error retries.
  • UI
    • Render empty_output as a clearer “No assistant output” stream error.
    • Show awaiting_report tasks with an “awaiting report” label to reflect that they are still live and unfinished.
  • Test robustness
    • Keep the Windows background-bash migration race test fix so CI no longer assumes the first status snapshot is already exited.
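
To make the StreamManager bullets above concrete, here is a minimal sketch of a single inline retry that keeps usage from the dropped attempt and classifies a persistent no-output result as `empty_output`. Every type and function name here (`AttemptResult`, `runWithEmptyOutputRetry`, the `FinishReason` values, and so on) is an assumption made for illustration, not the actual StreamManager API. Attempts are modeled as a synchronous callback for brevity, and treating a tool call as visible output is also an assumption.

```typescript
// All names below are hypothetical; the real StreamManager types are not shown in this PR.
type FinishReason = "stop" | "length" | "tool-calls" | "error" | "unknown";

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface AttemptResult {
  text: string; // assistant-visible text collected from the stream
  toolCallCount: number; // assumption: a tool call also counts as visible output
  finishReason: FinishReason;
  usage: Usage;
}

type StreamOutcome =
  | { kind: "ok"; text: string; finishReason: FinishReason; usage: Usage }
  | { kind: "error"; errorType: "empty_output"; finishReason: FinishReason; usage: Usage };

function addUsage(a: Usage, b: Usage): Usage {
  return {
    inputTokens: a.inputTokens + b.inputTokens,
    outputTokens: a.outputTokens + b.outputTokens,
  };
}

function hasVisibleOutput(result: AttemptResult): boolean {
  return result.text.trim().length > 0 || result.toolCallCount > 0;
}

// Attempts are modeled as a synchronous callback to keep the sketch short;
// a real provider stream would be asynchronous.
function runWithEmptyOutputRetry(
  attempt: () => AttemptResult,
  maxInlineRetries = 1,
): StreamOutcome {
  let last = attempt();
  let total = last.usage;
  let retriesLeft = maxInlineRetries;

  // Retry inline only while the attempt produced nothing visible.
  while (!hasVisibleOutput(last) && retriesLeft > 0) {
    retriesLeft -= 1;
    last = attempt();
    total = addUsage(total, last.usage); // keep accounting from the dropped attempt
  }

  if (hasVisibleOutput(last)) {
    return { kind: "ok", text: last.text, finishReason: last.finishReason, usage: total };
  }
  // Still empty after the inline retry: surface a dedicated error type.
  return { kind: "error", errorType: "empty_output", finishReason: last.finishReason, usage: total };
}
```

In this sketch the caller sees either one successful outcome with combined usage or one `empty_output` error that carries the last captured finish reason, which is the information the UI bullet needs to render a “No assistant output” message.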

Validation

  • make typecheck
  • make static-check
  • bun test src/node/services/partialService.test.ts
  • bun test src/node/services/streamManager.test.ts
  • bun test src/node/services/taskService.test.ts
  • bun test src/node/services/tools/task.test.ts src/node/services/tools/task_await.test.ts

Risks

  • Retryable awaiting_report recovery still intentionally prefers “keep trying to finish the child honestly” over fabricating a synthetic report, so persistent provider issues may keep tasks active longer and consume more tokens.
  • Implicit report finalization is intentionally limited to ordinary subagents on a clean natural stop; recovery turns, length-truncated responses, and plan-like tasks still require explicit completion tools to avoid ambiguous state transitions.
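
The finalization rule from the TaskService bullets, with the limits restated in the risk above, can be sketched as a pure decision function. This is an illustration under stated assumptions: `ChildStreamEnd`, `decideStreamEndAction`, `decideStreamErrorAction`, and the action names are hypothetical. Only `agent_report`, `propose_plan`, and the `finishReason === "stop"` condition come from the description.

```typescript
// Hypothetical shapes; only agent_report, propose_plan, and the
// finishReason === "stop" condition are taken from the PR text.
type FinishReason = "stop" | "length" | "tool-calls" | "error" | "unknown";

interface ChildStreamEnd {
  isPlanLike: boolean; // plan-mode subagents must call propose_plan
  isRecoveryTurn: boolean; // this turn was itself a report-only recovery prompt
  calledCompletionTool: boolean; // agent_report or propose_plan was invoked
  finishReason: FinishReason;
  finalText: string;
}

type StreamEndAction =
  | "finalize_explicit"
  | "finalize_implicit_report"
  | "prompt_for_completion_tool";

function decideStreamEndAction(end: ChildStreamEnd): StreamEndAction {
  if (end.calledCompletionTool) {
    return "finalize_explicit";
  }
  // Implicit reports are limited to ordinary subagents on a clean natural stop.
  const cleanNaturalStop =
    !end.isPlanLike &&
    !end.isRecoveryTurn &&
    end.finishReason === "stop" &&
    end.finalText.trim().length > 0;
  return cleanNaturalStop ? "finalize_implicit_report" : "prompt_for_completion_tool";
}

type StreamErrorAction = "retry_report_recovery" | "interrupt_recovery";

// Non-retryable errors stop the recovery loop instead of re-prompting forever.
function decideStreamErrorAction(isRetryable: boolean): StreamErrorAction {
  return isRetryable ? "retry_report_recovery" : "interrupt_recovery";
}
```

Keeping the decision pure makes each excluded case (length-truncated output, recovery turns, plan-like tasks) a one-line unit test rather than a full stream simulation.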

Generated with mux • Model: openai:gpt-5.4 • Thinking: xhigh • Cost: $122.19

@ammar-agent
Collaborator Author

@codex review

@ammar-agent force-pushed the fix/empty-output-recovery branch from 2d66df9 to 5a7875a on March 16, 2026 16:12

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d66df9e68

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Collaborator Author

@codex review

@ammar-agent force-pushed the fix/empty-output-recovery branch from 5a7875a to f36eacf on March 16, 2026 16:16
@ammar-agent
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f36eacfb0d

@ammar-agent force-pushed the fix/empty-output-recovery branch from f36eacf to 511f8d6 on March 16, 2026 16:32
@ammar-agent
Collaborator Author

@codex review

Please take another look.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 511f8d6468

@ammar-agent force-pushed the fix/empty-output-recovery branch from 511f8d6 to 9f21b1f on March 16, 2026 16:44
@ammar-agent
Collaborator Author

@codex review

Please take another look.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9f21b1f37f

Improve silent empty-output handling so chats retry seamlessly when possible and surface clearer fallback reports when recovery keeps failing.

---

_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$58.55`_

<!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=58.55 -->
@ammar-agent force-pushed the fix/empty-output-recovery branch from 9f21b1f to e84c7c1 on March 16, 2026 16:56
@ammar-agent
Collaborator Author

@codex review

Please take another look.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!

@ammar-agent ammar-agent changed the title 🤖 fix: harden empty-output task recovery Mar 16, 2026
@ammar-agent
Collaborator Author

@codex review

Please take another look.

@ammar-agent
Collaborator Author

@codex review

Please take another look at the latest commit.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61f8d47068

@ammar-agent
Collaborator Author

@codex review

Please take another look.

@ammar-agent
Collaborator Author

@codex review

Please take another look.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ede6db8f1

@ammar-agent
Collaborator Author

@codex review

Please take another look.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep it up!

@ammario ammario merged commit 3b4c077 into main Mar 17, 2026
40 of 42 checks passed
@ammario ammario deleted the fix/empty-output-recovery branch March 17, 2026 01:39
ammario pushed a commit that referenced this pull request Mar 17, 2026
## Summary
This follow-up to #2986 keeps the existing `awaiting_report` recovery
behavior but trims one special case: waiter-triggered recovery now
reuses the standard completion reminder instead of carrying a separate
waiter-only prompt variant.

## Background
PR #2986 introduced a few paths that can re-prompt an `awaiting_report`
task. The waiter-specific reminder text and enum branch were the odd
ones out, because they duplicated the same completion-tool guidance with
different copy. Reusing the normal reminder keeps the recovery path
while simplifying the state machine.

## Implementation
- waiter-triggered recovery still nudges `awaiting_report` tasks, but
now reuses the default completion reminder path
- dropped the now-unused `"waiter"` completion-recovery reason and
prompt string
- added a regression test that proves `waitForAgentReport()` no longer
emits distinct waiter-only reminder copy
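
A rough sketch of the simplification described above, assuming a reason union and a single reminder builder. The reason names, the reminder copy, and the helpers `completionReminder` and `waiterNudgeReminder` are all hypothetical. The only facts taken from this commit are that the `"waiter"` reason and its prompt string were dropped and that waiter-triggered recovery now uses the default reminder.

```typescript
// Hypothetical reason names and reminder copy; this commit only states that
// the "waiter" reason and its dedicated prompt string were removed.
type CompletionTool = "agent_report" | "propose_plan";
type CompletionRecoveryReason = "startup" | "stream_end" | "stream_error";

function defaultCompletionReminder(tool: CompletionTool): string {
  return `Your task is not complete until you call ${tool}. Call it now with your final result.`;
}

// Reasons may adjust the copy (the stream_error prefix is an invented example);
// with no reason, the default reminder is returned unchanged.
function completionReminder(tool: CompletionTool, reason?: CompletionRecoveryReason): string {
  const base = defaultCompletionReminder(tool);
  return reason === "stream_error" ? `The previous attempt failed. ${base}` : base;
}

// Waiter-triggered recovery passes no reason, so it cannot produce distinct copy.
function waiterNudgeReminder(tool: CompletionTool): string {
  return completionReminder(tool);
}
```

With the `"waiter"` member gone from the union, a stale call site that still passes it fails to type-check, which complements the regression test mentioned above.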

## Validation
- `bun test src/node/services/taskService.test.ts`
- `make static-check`

## Risks
Low. This keeps the waiter-side recovery hook intact; it only removes
the separate waiter-only reminder variant.

---

_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` •
Cost: `$146.35`_

<!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=146.35
-->
2 participants