fix: eliminate 3x SQLite message duplication in gateway sessions#873

Merged
teknium1 merged 1 commit into main from hermes/hermes-281ff8aa
Mar 10, 2026

Conversation

@teknium1
Contributor

Summary

Fixes #860 — SQLite session transcripts accumulated duplicate messages (3-4x token inflation).

Root Cause

Three separate code paths all wrote to the same state.db with no deduplication:

  1. _log_msg_to_db() — wrote each message individually right after messages.append()
  2. _flush_messages_to_session_db() — re-wrote every message at each _persist_session() call (~18 exit points), with no tracking of what had already been written
  3. Gateway append_to_transcript() — wrote everything a third time after the agent returned

Since load_transcript() prefers SQLite over JSONL, the inflated data was loaded on every session resume, causing proportional token waste.
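The three write paths reduce to a minimal sketch (the table schema and message shape below are illustrative, not the project's actual code): three independent writers each append the same logical message to one SQLite table, leaving three rows.

```python
import sqlite3

# Hypothetical minimal reproduction of the pre-fix behavior: three
# independent code paths all insert the same message into one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, role TEXT, content TEXT)")

def write(msg):
    conn.execute(
        "INSERT INTO transcript VALUES (?, ?, ?)",
        ("sess-1", msg["role"], msg["content"]),
    )

msg = {"role": "user", "content": "hello"}
write(msg)  # 1. per-message logger (_log_msg_to_db analogue)
write(msg)  # 2. bulk flush on persist (_flush_messages_to_session_db analogue)
write(msg)  # 3. gateway append_to_transcript analogue

rows = conn.execute("SELECT COUNT(*) FROM transcript").fetchone()[0]
print(rows)  # 3 rows for a single logical message
```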

Fix

run_agent.py:

  • Remove _log_msg_to_db() method and all 16 call sites (redundant with the flush mechanism)
  • Add _last_flushed_db_idx tracking in _flush_messages_to_session_db() so repeated _persist_session() calls only write truly new messages
  • Reset flush cursor on compression (new session ID)
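The flush-cursor change can be sketched as follows. `AgentSketch`, the schema, and the method bodies are hypothetical stand-ins; only the attribute and method names mirror the PR.

```python
import sqlite3

class AgentSketch:
    """Illustrative stand-in for the agent's persistence path; the real
    run_agent.py implementation may differ."""

    def __init__(self, conn, session_id):
        self.conn = conn
        self.session_id = session_id
        self.messages = []
        # Cursor: every message before this index is already in SQLite.
        self._last_flushed_db_idx = 0

    def _flush_messages_to_session_db(self):
        new = self.messages[self._last_flushed_db_idx:]
        self.conn.executemany(
            "INSERT INTO transcript VALUES (?, ?, ?)",
            [(self.session_id, m["role"], m["content"]) for m in new],
        )
        self._last_flushed_db_idx = len(self.messages)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, role TEXT, content TEXT)")
agent = AgentSketch(conn, "sess-1")

agent.messages.append({"role": "user", "content": "hi"})
agent._flush_messages_to_session_db()
agent._flush_messages_to_session_db()  # repeated persist: writes nothing new
agent.messages.append({"role": "assistant", "content": "hello"})
agent._flush_messages_to_session_db()  # incremental: only the new message

rows = conn.execute("SELECT COUNT(*) FROM transcript").fetchone()[0]
print(rows)  # 2
```

Because the cursor only ever moves forward within a session, any number of `_persist_session()` exit points can call the flush safely.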

gateway/session.py:

  • Add skip_db parameter to SessionStore.append_to_transcript() — when True, writes JSONL only

gateway/run.py:

  • Pass skip_db=True when the agent already persisted messages to SQLite
  • JSONL backup writes still happen (backward compat)
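The gateway-side behavior can be sketched like this; `SessionStoreSketch`, the schema, and the file layout are hypothetical, and the real `SessionStore.append_to_transcript()` signature may differ beyond the `skip_db` flag named in the PR.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

class SessionStoreSketch:
    """Illustrative stand-in for gateway/session.py's SessionStore."""

    def __init__(self, conn, jsonl_path):
        self.conn = conn
        self.jsonl_path = Path(jsonl_path)

    def append_to_transcript(self, session_id, msg, skip_db=False):
        # JSONL backup is always written (backward compat).
        with self.jsonl_path.open("a") as f:
            f.write(json.dumps(msg) + "\n")
        if not skip_db:
            self.conn.execute(
                "INSERT INTO transcript VALUES (?, ?, ?)",
                (session_id, msg["role"], msg["content"]),
            )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, role TEXT, content TEXT)")
with tempfile.TemporaryDirectory() as d:
    store = SessionStoreSketch(conn, f"{d}/transcript.jsonl")
    msg = {"role": "user", "content": "hi"}
    store.append_to_transcript("sess-1", msg, skip_db=True)  # agent already persisted
    store.append_to_transcript("sess-1", msg)                # default: both stores
    jsonl_lines = Path(f"{d}/transcript.jsonl").read_text().count("\n")
    db_rows = conn.execute("SELECT COUNT(*) FROM transcript").fetchone()[0]

print(jsonl_lines, db_rows)  # 2 1
```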

Verification

Live-tested with hermes chat: a 12-message session with 10 tool calls produces exactly 12 SQLite rows with zero duplicates (previously would have been 36-48).

Tests

  • 9 new tests in tests/test_860_dedup.py covering:
    • Flush deduplication (repeated calls write nothing new)
    • Incremental flush (only new messages written)
    • Multiple _persist_session calls (no duplication)
    • Compression reset (flush cursor resets for new session)
    • skip_db=True prevents SQLite writes
    • skip_db=False (default) writes to both stores
    • _last_flushed_db_idx initialization
  • Updated test_interrupt.py to remove reference to deleted _log_msg_to_db
  • Full suite: 2869 passed, 0 failed
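The compression-reset case above can be sketched as follows; the schema and `flush` helper are hypothetical, not the structure of tests/test_860_dedup.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (session_id TEXT, content TEXT)")

def flush(session_id, messages, cursor):
    """Write only messages past the cursor; return the new cursor."""
    conn.executemany(
        "INSERT INTO transcript VALUES (?, ?)",
        [(session_id, m) for m in messages[cursor:]],
    )
    return len(messages)

messages = ["m1", "m2", "m3"]
cursor = flush("sess-old", messages, 0)

# Compression: messages collapse to a summary, the session gets a new
# ID, and the flush cursor resets to 0 so the compressed list is
# written in full under the new session.
messages = ["<summary of m1..m3>"]
cursor = 0
cursor = flush("sess-new", messages, cursor)

old_rows = conn.execute(
    "SELECT COUNT(*) FROM transcript WHERE session_id = ?", ("sess-old",)
).fetchone()[0]
new_rows = conn.execute(
    "SELECT COUNT(*) FROM transcript WHERE session_id = ?", ("sess-new",)
).fetchone()[0]
print(old_rows, new_rows)  # 3 1
```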