fix: eliminate 3x SQLite message duplication in gateway sessions#873

Merged
teknium1 merged 1 commit into main from hermes/hermes-281ff8aa
Mar 10, 2026

Conversation

@teknium1
Contributor

Summary

Fixes #860 — SQLite session transcripts accumulated duplicate messages (3-4x token inflation).

Root Cause

Three separate code paths all wrote to the same state.db with no deduplication:

  1. _log_msg_to_db() — wrote each message individually right after messages.append()
  2. _flush_messages_to_session_db() — re-wrote every message at each _persist_session() call (~18 exit points), with no tracking of what had already been written
  3. Gateway append_to_transcript() — wrote everything a third time after the agent returned

Since load_transcript() prefers SQLite over JSONL, the inflated data was loaded on every session resume, causing proportional token waste.
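The three write paths reduce to a minimal sketch (the table schema and message shape below are illustrative, not the project's actual code): three independent writers each append the same logical message to one SQLite table, leaving three rows.

```python
import sqlite3

# Hypothetical minimal reproduction of the pre-fix behavior: three
# independent code paths all insert the same message into one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, role TEXT, content TEXT)")

def write(msg):
    conn.execute(
        "INSERT INTO transcript VALUES (?, ?, ?)",
        ("sess-1", msg["role"], msg["content"]),
    )

msg = {"role": "user", "content": "hello"}
write(msg)  # 1. per-message logger (_log_msg_to_db analogue)
write(msg)  # 2. bulk flush on persist (_flush_messages_to_session_db analogue)
write(msg)  # 3. gateway append_to_transcript analogue

rows = conn.execute("SELECT COUNT(*) FROM transcript").fetchone()[0]
print(rows)  # 3 rows for a single logical message
```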

Fix

run_agent.py:

  • Remove _log_msg_to_db() method and all 16 call sites (redundant with the flush mechanism)
  • Add _last_flushed_db_idx tracking in _flush_messages_to_session_db() so repeated _persist_session() calls only write truly new messages
  • Reset flush cursor on compression (new session ID)
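The flush-cursor change can be sketched as follows. `AgentSketch`, the schema, and the method bodies are hypothetical stand-ins; only the attribute and method names mirror the PR.

```python
import sqlite3

class AgentSketch:
    """Illustrative stand-in for the agent's persistence path; the real
    run_agent.py implementation may differ."""

    def __init__(self, conn, session_id):
        self.conn = conn
        self.session_id = session_id
        self.messages = []
        # Cursor: every message before this index is already in SQLite.
        self._last_flushed_db_idx = 0

    def _flush_messages_to_session_db(self):
        new = self.messages[self._last_flushed_db_idx:]
        self.conn.executemany(
            "INSERT INTO transcript VALUES (?, ?, ?)",
            [(self.session_id, m["role"], m["content"]) for m in new],
        )
        self._last_flushed_db_idx = len(self.messages)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, role TEXT, content TEXT)")
agent = AgentSketch(conn, "sess-1")

agent.messages.append({"role": "user", "content": "hi"})
agent._flush_messages_to_session_db()
agent._flush_messages_to_session_db()  # repeated persist: writes nothing new
agent.messages.append({"role": "assistant", "content": "hello"})
agent._flush_messages_to_session_db()  # incremental: only the new message

rows = conn.execute("SELECT COUNT(*) FROM transcript").fetchone()[0]
print(rows)  # 2
```

Because the cursor only ever moves forward within a session, any number of `_persist_session()` exit points can call the flush safely.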

gateway/session.py:

  • Add skip_db parameter to SessionStore.append_to_transcript() — when True, writes JSONL only

gateway/run.py:

  • Pass skip_db=True when the agent already persisted messages to SQLite
  • JSONL backup writes still happen (backward compat)
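The gateway-side behavior can be sketched like this; `SessionStoreSketch`, the schema, and the file layout are hypothetical, and the real `SessionStore.append_to_transcript()` signature may differ beyond the `skip_db` flag named in the PR.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

class SessionStoreSketch:
    """Illustrative stand-in for gateway/session.py's SessionStore."""

    def __init__(self, conn, jsonl_path):
        self.conn = conn
        self.jsonl_path = Path(jsonl_path)

    def append_to_transcript(self, session_id, msg, skip_db=False):
        # JSONL backup is always written (backward compat).
        with self.jsonl_path.open("a") as f:
            f.write(json.dumps(msg) + "\n")
        if not skip_db:
            self.conn.execute(
                "INSERT INTO transcript VALUES (?, ?, ?)",
                (session_id, msg["role"], msg["content"]),
            )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, role TEXT, content TEXT)")
with tempfile.TemporaryDirectory() as d:
    store = SessionStoreSketch(conn, f"{d}/transcript.jsonl")
    msg = {"role": "user", "content": "hi"}
    store.append_to_transcript("sess-1", msg, skip_db=True)  # agent already persisted
    store.append_to_transcript("sess-1", msg)                # default: both stores
    jsonl_lines = Path(f"{d}/transcript.jsonl").read_text().count("\n")
    db_rows = conn.execute("SELECT COUNT(*) FROM transcript").fetchone()[0]

print(jsonl_lines, db_rows)  # 2 1
```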

Verification

Live-tested with hermes chat: a 12-message session with 10 tool calls produces exactly 12 SQLite rows with zero duplicates (previously would have been 36-48).

Tests

  • 9 new tests in tests/test_860_dedup.py covering:
    • Flush deduplication (repeated calls write nothing new)
    • Incremental flush (only new messages written)
    • Multiple _persist_session calls (no duplication)
    • Compression reset (flush cursor resets for new session)
    • skip_db=True prevents SQLite writes
    • skip_db=False (default) writes to both stores
    • _last_flushed_db_idx initialization
  • Updated test_interrupt.py to remove reference to deleted _log_msg_to_db
  • Full suite: 2869 passed, 0 failed
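The compression-reset case above can be sketched as follows; the schema and `flush` helper are hypothetical, not the structure of tests/test_860_dedup.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, content TEXT)")

def flush(session_id, messages, cursor):
    """Write only messages past the cursor; return the new cursor."""
    conn.executemany(
        "INSERT INTO transcript VALUES (?, ?)",
        [(session_id, m) for m in messages[cursor:]],
    )
    return len(messages)

messages = ["m1", "m2", "m3"]
cursor = flush("sess-old", messages, 0)

# Compression: messages collapse to a summary, the session gets a new
# ID, and the flush cursor resets to 0 so the compressed list is
# written in full under the new session.
messages = ["<summary of m1..m3>"]
cursor = 0
cursor = flush("sess-new", messages, cursor)

old_rows = conn.execute(
    "SELECT COUNT(*) FROM transcript WHERE session_id = ?", ("sess-old",)
).fetchone()[0]
new_rows = conn.execute(
    "SELECT COUNT(*) FROM transcript WHERE session_id = ?", ("sess-new",)
).fetchone()[0]
print(old_rows, new_rows)  # 3 1
```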