feat(server): fork session by user-message ordinal#3250
Conversation
POST /api/sessions/:id/fork now takes a 0-based ordinal counting only user-role messages instead of an index into the flat all-roles list. Clients only have to know which of the user's turns they clicked on; the server translates that into the underlying item index. This decouples the contract from the precise layout of tool responses and assistant turns, which a streaming client cannot fully observe until persistence completes.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
The fork-by-ordinal logic is correct — the userMessageOrdinalToItemIndex translation is sound and the seen > ordinal invariant holds throughout. The two gaps are both test-coverage issues on newly introduced code paths, not runtime bugs.
Summary of findings:
- MEDIUM (CONFIRMED): The sub-session branch in
userMessageOrdinalToItemIndex(the newcountUserMessageshelper andErrForkInSubSessiontrigger) has no test coverage in the diff. - LOW (LIKELY):
ordinal=0on a session with zero user messages returnsErrForkOutOfRange(correct behaviour) but is not explicitly tested.
No correctness bugs were found in the new ordinal translation logic.
| // TestUserMessageOrdinalToItemIndex covers the ordinal translation | ||
| // helper: only user-role messages count; system and assistant items | ||
| // are skipped; out-of-range ordinals are rejected. | ||
| func TestUserMessageOrdinalToItemIndex(t *testing.T) { |
There was a problem hiding this comment.
[MEDIUM] No test coverage for sub-session paths in userMessageOrdinalToItemIndex
The new countUserMessages helper and the sub-session boundary check (ordinal-seen < subCount) introduced in this PR are entirely untested. TestUserMessageOrdinalToItemIndex only exercises flat message lists (user, system, assistant items) — no IsSubSession() items appear anywhere in the test.
Three new behaviours have no coverage:
countUserMessagescorrectness — counting user-role messages inside a flattened sub-session.ErrForkInSubSessiontrigger —ordinal-seen < subCountfiring when the ordinal lands inside a sub-session.- Empty sub-session passthrough — when
subCount == 0, the guard is skipped andseenis unchanged; the ordinal should resolve to a later user message in the parent.
Consider adding a sub-table test case to TestUserMessageOrdinalToItemIndex that includes an IsSubSession() item and asserts all three paths.
| subCount := len(item.SubSession.GetAllMessages()) | ||
| if flatIdx-seen < subCount { | ||
| return 0, fmt.Errorf("%w at index %d", ErrForkInSubSession, flatIdx) | ||
| subCount := countUserMessages(item.SubSession.GetAllMessages()) |
There was a problem hiding this comment.
[LOW] countUserMessages silently skips sub-sessions of sub-sessions
countUserMessages is called with item.SubSession.GetAllMessages(), which recursively flattens nested sub-sessions. That means a sub-session that itself contains a sub-sub-session will have its user messages counted transitively. The sub-session boundary check then applies the combined count, which is the correct semantics.
This is a minor point, but worth confirming via a test: an ordinal targeting the first user message after a nested sub-session should not trigger ErrForkInSubSession. The current test suite has no coverage for nested sub-sessions at all.
Why
The fork endpoint currently takes a flat-list index into the session's mixed-role message stream (user + assistant + tool, system filtered, sub-sessions flattened). To target "the Nth user message the user clicked on", a client has to translate that intent into a flat index by mirroring the rules the server applies — skip system messages, count tool responses, recurse into sub-sessions.
A streaming client can't reliably do that translation: at the moment a user message lands in the UI, the tool calls and responses that will follow it on the server haven't happened yet, so the client has no way to know what flat index the server will eventually assign. The result is that fork is unavailable on just-sent messages until the client refetches the session.
What changes
The request body changes from
message_index(flat-list index) touser_message_index(0-based ordinal counting only user-role messages in the same visible list). The targeted user message is still excluded so the client can prefill it into chat input.Fork at ordinal 1 = fork before the second user message — same semantics as the previous "fork at flat index 3".
Backwards-incompatible wire change. The previous shape returned 400 for an assistant-targeted index; the new shape can't express that ambiguity, so
ErrForkInvalidMessageis dropped.