Release Date: March 23, 2026
The platform expansion release β OpenAI-compatible API server, 6 new messaging adapters, 4 new inference providers, MCP server management with OAuth 2.1, @ context references, gateway prompt caching, streaming enabled by default, and a sweeping reliability pass with 200+ bug fixes.
-
OpenAI-compatible API server β Expose Hermes as an
/v1/chat/completionsendpoint with a new/api/jobsREST API for cron job management, hardened with input limits, field whitelists, SQLite-backed response persistence, and CORS origin protection (#1756, #2450, #2456, #2451, #2472) -
6 new messaging platform adapters β Signal, DingTalk, SMS (Twilio), Mattermost, Matrix, and Webhook adapters join Telegram, Discord, and WhatsApp. Gateway auto-reconnects failed platforms with exponential backoff (#2206, #1685, #1688, #1683, #2166, #2584)
-
@ context references β Claude Code-style
@fileand@urlcontext injection with tab completions in the CLI (#2343, #2482) -
4 new inference providers β GitHub Copilot (OAuth + token validation), Alibaba Cloud / DashScope, Kilo Code, and OpenCode Zen/Go (#1924, #1879 by @mchzimm, #1673, #1666, #1650)
-
MCP server management CLI β
hermes mcpcommands for installing, configuring, and authenticating MCP servers with full OAuth 2.1 PKCE flow (#2465) -
Gateway prompt caching β Cache AIAgent instances per session, preserving Anthropic prompt cache across turns for dramatic cost reduction on long conversations (#2282, #2284, #2361)
-
Context compression overhaul β Structured summaries with iterative updates, token-budget tail protection, configurable summary endpoint, and fallback model support (#2323, #1727, #2224)
-
Streaming enabled by default β CLI streaming on by default with proper spinner/tool progress display during streaming mode, plus extensive linebreak and concatenation fixes (#2340, #2161, #2258)
- @ context completions β Tab-completable
@file/@urlreferences that inject file content or web pages into the conversation (#2482, #2343) /statusbarβ Toggle a persistent config bar showing model + provider info in the prompt (#2240, #1917)/queueβ Queue prompts for the agent without interrupting the current run (#2191, #2469)/permissionβ Switch approval mode dynamically during a session (#2207)/browserβ Interactive browser sessions from the CLI (#2273, #1814)/costβ Live pricing and usage tracking in gateway mode (#2180)/approveand/denyβ Replaced bare text approval in gateway with explicit commands (#2002)
- Streaming enabled by default in CLI (#2340)
- Show spinners and tool progress during streaming mode (#2161)
- Show reasoning/thinking blocks when
show_reasoningenabled (#2118) - Context pressure warnings for CLI and gateway (#2159)
- Fix: streaming chunks concatenated without whitespace (#2258)
- Fix: iteration boundary linebreak prevents stream concatenation (#2413)
- Fix: defer streaming linebreak to prevent blank line stacking (#2473)
- Fix: suppress spinner animation in non-TTY environments (#2216)
- Fix: display provider and endpoint in API error messages (#2266)
- Fix: resolve garbled ANSI escape codes in status printouts (#2448)
- Fix: update gold ANSI color to true-color format (#2246)
- Fix: normalize toolset labels and use skin colors in banner (#1912)
- Fix: prevent 'Press ENTER to continue...' on exit (#2555)
- Fix: flush stdout during agent loop to prevent macOS display freeze (#1654)
- Fix: show human-readable error when
hermes setuphits permissions error (#2196) - Fix:
/stopcommand crash + UnboundLocalError in streaming media delivery (#2463) - Fix: allow custom/local endpoints without API key (#2556)
- Fix: Kitty keyboard protocol Shift+Enter for Ghostty/WezTerm (attempted + reverted due to prompt_toolkit crash) (#2345, #2349)
${ENV_VAR}substitution in config.yaml (#2684)- Real-time config reload β config.yaml changes apply without restart (#2210)
custom_models.yamlfor user-managed model additions (#2214)- Priority-based context file selection + CLAUDE.md support (#2301)
- Merge nested YAML sections instead of replacing on config update (#2213)
- Fix: config.yaml provider key overrides env var silently (#2272)
- Fix: log warning instead of silently swallowing config.yaml errors (#2683)
- Fix: disabled toolsets re-enable themselves after
hermes tools(#2268) - Fix: platform default toolsets silently override tool deselection (#2624)
- Fix: honor bare YAML
approvals.mode: off(#2620) - Fix:
hermes updateuse.[all]extras with fallback (#1728) - Fix:
hermes updateprompt before resetting working tree on stash conflicts (#2390) - Fix: use git pull --rebase in update/install to avoid divergent branch error (#2274)
- Fix: add zprofile fallback and create zshrc on fresh macOS installs (#2320)
- Fix: remove
ANTHROPIC_BASE_URLenv var to avoid collisions (#1675) - Fix: don't ask IMAP password if already in keyring or env (#2212)
- Fix: OpenCode Zen/Go show OpenRouter models instead of their own (#2277)
- GitHub Copilot β Full OAuth auth, API routing, token validation, and 400k context. (#1924, #1896, #1879 by @mchzimm, #2507)
- Alibaba Cloud / DashScope β Full integration with DashScope v1 runtime, model dot preservation, and 401 auth fixes (#1673, #2332, #2459)
- Kilo Code β First-class inference provider (#1666)
- OpenCode Zen and OpenCode Go β New provider backends (#1650, #2393 by @0xbyt4)
- NeuTTS β Local TTS provider backend with built-in setup flow, replacing the old optional skill (#1657, #1664)
- Eager fallback to backup model on rate-limit errors (#1730)
- Endpoint metadata for custom model context and pricing; query local servers for actual context window size (#1906, #2091 by @dusterbloom)
- Context length detection overhaul β models.dev integration, provider-aware resolution, fuzzy matching for custom endpoints,
/v1/propsfor llama.cpp (#2158, #2051, #2403) - Model catalog updates β gpt-5.4-mini, gpt-5.4-nano, healer-alpha, haiku-4.5, minimax-m2.7, claude 4.6 at 1M context (#1913, #1915, #1900, #2155, #2474)
- Custom endpoint improvements β
model.base_urlin config.yaml,api_modeoverride for responses API, allow endpoints without API key, fail fast on missing keys (#2330, #1651, #2556, #2445, #1994, #1998) - Inject model and provider into system prompt (#1929)
- Tie
api_modeto provider config instead of env var (#1656) - Fix: prevent Anthropic token leaking to third-party
anthropic_messagesproviders (#2389) - Fix: prevent Anthropic fallback from inheriting non-Anthropic
base_url(#2388) - Fix:
auxiliary_is_nousflag never resets β leaked Nous tags to other providers (#1713) - Fix: Anthropic
tool_choice 'none'still allowed tool calls (#1714) - Fix: Mistral parser nested JSON fallback extraction (#2335)
- Fix: MiniMax 401 auth resolved by defaulting to
anthropic_messages(#2103) - Fix: case-insensitive model family matching (#2350)
- Fix: ignore placeholder provider keys in activation checks (#2358)
- Fix: Preserve Ollama model:tag colons in context length detection (#2149)
- Fix: recognize Claude Code OAuth credentials in startup gate (#1663)
- Fix: detect Claude Code version dynamically for OAuth user-agent (#1670)
- Fix: OAuth flag stale after refresh/fallback (#1890)
- Fix: auxiliary client skips expired Codex JWT (#2397)
- Gateway prompt caching β Cache AIAgent per session, keep assistant turns, fix session restore (#2282, #2284, #2361)
- Context compression overhaul β Structured summaries, iterative updates, token-budget tail protection, configurable
summary_base_url(#2323, #1727, #2224) - Pre-call sanitization and post-call tool guardrails (#1732)
- Auto-recover from provider-rejected
tool_choiceby retrying without (#2174) - Background memory/skill review replaces inline nudges (#2235)
- SOUL.md as primary agent identity instead of hardcoded default (#1922)
- Fix: prevent silent tool result loss during context compression (#1993)
- Fix: handle empty/null function arguments in tool call recovery (#2163)
- Fix: handle API refusal responses gracefully instead of crashing (#2156)
- Fix: prevent stuck agent loop on malformed tool calls (#2114)
- Fix: return JSON parse error to model instead of dispatching with empty args (#2342)
- Fix: consecutive assistant message merge drops content on mixed types (#1703)
- Fix: message role alternation violations in JSON recovery and error handler (#1722)
- Fix:
compression_attemptsresets each iteration β allowed unlimited compressions (#1723) - Fix:
length_continue_retriesnever resets β later truncations got fewer retries (#1717) - Fix: compressor summary role violated consecutive-role constraint (#1720, #1743)
- Fix: remove hardcoded
gemini-3-flash-previewas default summary model (#2464) - Fix: correctly handle empty tool results (#2201)
- Fix: crash on None entry in
tool_callslist (#2209 by @0xbyt4, #2316) - Fix: per-thread persistent event loops in worker threads (#2214 by @jquesnelle)
- Fix: prevent 'event loop already running' when async tools run in parallel (#2207)
- Fix: strip ANSI at the source β clean terminal output before it reaches the model (#2115)
- Fix: skip top-level
cache_controlon role:tool for OpenRouter (#2391) - Fix: delegate tool β save parent tool names before child construction mutates global (#2083 by @ygd58, #1894)
- Fix: only strip last assistant message if empty string (#2326)
- Session search and management slash commands (#2198)
- Auto session titles and
.hermes.mdproject config (#1712) - Fix: concurrent memory writes silently drop entries β added file locking (#1726)
- Fix: search all sources by default in
session_search(#1892) - Fix: handle hyphenated FTS5 queries and preserve quoted literals (#1776)
- Fix: skip corrupt lines in
load_transcriptinstead of crashing (#1744) - Fix: normalize session keys to prevent case-sensitive duplicates (#2157)
- Fix: prevent
session_searchcrash when no sessions exist (#2194) - Fix: reset token counters on new session for accurate usage display (#2101 by @InB4DevOps)
- Fix: prevent stale memory overwrites by flush agent (#2687)
- Fix: remove synthetic error message injection, fix session resume after repeated failures (#2303)
- Fix: quiet mode with
--resumenow passes conversation_history (#2357) - Fix: unify resume logic in batch mode (#2331)
- Honcho config fixes and @ context reference integration (#2343)
- Self-hosted / Docker configuration documentation (#2475)
- Signal Messenger β Full adapter with attachment handling, group message filtering, and Note to Self echo-back protection (#2206, #2400, #2297, #2156)
- DingTalk β Adapter with gateway wiring and setup docs (#1685, #1690, #1692)
- SMS (Twilio) (#1688)
- Mattermost β With @-mention-only channel filter (#1683, #2443)
- Matrix β With vision support and image caching (#1683, #2520)
- Webhook β Platform adapter for external event triggers (#2166)
- OpenAI-compatible API server β
/v1/chat/completionsendpoint with/api/jobscron management (#1756, #2450, #2456)
- MarkdownV2 support β strikethrough, spoiler, blockquotes, escape parentheses/braces/backslashes/backticks (#2199, #2200 by @llbn, #2386)
- Auto-detect HTML tags and use
parse_mode=HTML(#1709) - Telegram group vision support + thread-based sessions (#2153)
- Auto-reconnect polling after network interruption (#2517)
- Aggregate split text messages before dispatching (#1674)
- Fix: streaming config bridge, not-modified, flood control (#1782, #1783)
- Fix: edited_message event crashes (#2074)
- Fix: retry 409 polling conflicts before giving up (#2312)
- Fix: topic delivery via
platform:chat_id:thread_idformat (#2455)
- Document caching and text-file injection (#2503)
- Persistent typing indicator for DMs (#2468)
- Discord DM vision β inline images + attachment analysis (#2186)
- Persist thread participation across gateway restarts (#1661)
- Fix: gateway crash on non-ASCII guild names (#2302)
- Fix: thread permission errors (#2073)
- Fix: slash event routing in threads (#2460)
- Fix: remove bugged followup messages +
/askcommand (#1836) - Fix: graceful WebSocket reconnection (#2127)
- Fix: voice channel TTS when streaming enabled (#2322)
- WhatsApp: outbound
send_messagerouting (#1769 by @sai-samarth), LID format self-chat (#1667),reply_prefixconfig fix (#1923), restart on bridge child exit (#2334), image/bridge improvements (#2181) - Matrix: correct
reply_to_message_idparameter (#1895), bare media types fix (#1736) - Mattermost: MIME types for media attachments (#2329)
- Auto-reconnect failed platforms with exponential backoff (#2584)
- Notify users when session auto-resets (#2519)
- Reply-to message context for out-of-session replies (#1662)
- Ignore unauthorized DMs config option (#1919)
- Fix:
/resetin thread-mode resets global session instead of thread (#2254) - Fix: deliver MEDIA: files after streaming responses (#2382)
- Fix: cap interrupt recursion depth to prevent resource exhaustion (#1659)
- Fix: detect stopped processes and release stale locks on
--replace(#2406, #1908) - Fix: PID-based wait with force-kill for gateway restart (#1902)
- Fix: prevent
--replacemode from killing the caller process (#2185) - Fix:
/modelshows active fallback model instead of config default (#1660) - Fix:
/titlecommand fails when session doesn't exist in SQLite yet (#2379 by @ten-jampa) - Fix: process
/queue'd messages after agent completion (#2469) - Fix: strip orphaned
tool_results+ let/resetbypass running agent (#2180) - Fix: prevent agents from starting gateway outside systemd management (#2617)
- Fix: prevent systemd restart storm on gateway connection failure (#2327)
- Fix: include resolved node path in systemd unit (#1767 by @sai-samarth)
- Fix: send error details to user in gateway outer exception handler (#1966)
- Fix: improve error handling for 429 usage limits and 500 context overflow (#1839)
- Fix: add all missing platform allowlist env vars to startup warning check (#2628)
- Fix: media delivery fails for file paths containing spaces (#2621)
- Fix: duplicate session-key collision in multi-platform gateway (#2171)
- Fix: Matrix and Mattermost never report as connected (#1711)
- Fix: PII redaction config never read β missing yaml import (#1701)
- Fix: NameError on skill slash commands (#1697)
- Fix: persist watcher metadata in checkpoint for crash recovery (#1706)
- Fix: pass
message_thread_idin send_image_file, send_document, send_video (#2339) - Fix: media-group aggregation on rapid successive photo messages (#2160)
- MCP server management CLI + OAuth 2.1 PKCE auth (#2465)
- Expose MCP servers as standalone toolsets (#1907)
- Interactive MCP tool configuration in
hermes tools(#1694) - Fix: MCP-OAuth port mismatch, path traversal, and shared handler state (#2552)
- Fix: preserve MCP tool registrations across session resets (#2124)
- Fix: concurrent file access crash + duplicate MCP registration (#2154)
- Fix: normalise MCP schemas + expand session list columns (#2102)
- Fix:
tool_choicemcp_prefix handling (#1775)
- Tavily as web search/extract/crawl backend (#1731)
- Parallel as alternative web search/extract backend (#1696)
- Configurable web backend β Firecrawl/BeautifulSoup/Playwright selection (#2256)
- Fix: whitespace-only env vars bypass web backend detection (#2341)
- IMAP email reading and sending (#2173)
- STT (speech-to-text) tool using Whisper API (#2072)
- Route-aware pricing estimates (#1695)
- TTS:
base_urlsupport for OpenAI TTS provider (#2064 by @hanai) - Vision: configurable timeout, tilde expansion in file paths, DM vision with multi-image and base64 fallback (#2480, #2585, #2211)
- Browser: race condition fix in session creation (#1721), TypeError on unexpected LLM params (#1735)
- File tools: strip ANSI escape codes from write_file and patch content (#2532), include pagination args in repeated search key (#1824 by @cutepawss), improve fuzzy matching accuracy + position calculation refactor (#2096, #1681)
- Code execution: resource leak and double socket close fix (#2381)
- Delegate: thread safety for concurrent subagent delegation (#1672), preserve parent agent's tool list after delegation (#1778)
- Fix: make concurrent tool batching path-aware for file mutations (#1914)
- Fix: chunk long messages in
send_message_toolbefore platform dispatch (#1646) - Fix: add missing 'messaging' toolset (#1718)
- Fix: prevent unavailable tool names from leaking into model schemas (#2072)
- Fix: pass visited set by reference to prevent diamond dependency duplication (#2311)
- Fix: Daytona sandbox lookup migrated from
find_onetoget/list(#2063 by @rovle)
- Agent-created skills β Caution-level findings allowed, dangerous skills ask instead of block (#1840, #2446)
--yesflag to bypass confirmation in/skills installand uninstall (#1647)- Disabled skills respected across banner, system prompt, and slash commands (#1897)
- Fix: skills custom_tools import crash + sandbox file_tools integration (#2239)
- Fix: agent-created skills with pip requirements crash on install (#2145)
- Fix: race condition in
Skills.__init__whenhub.yamlmissing (#2242) - Fix: validate skill metadata before install and block duplicates (#2241)
- Fix: skills hub inspect/resolve β 4 bugs in inspect, redirects, discovery, tap list (#2447)
- Fix: agent-created skills keep working after session reset (#2121)
- OCR-and-documents β PDF/DOCX/XLS/PPTX/image OCR with optional GPU (#2236, #2461)
- Huggingface-hub bundled skill (#1921)
- Sherlock OSINT username search (#1671)
- Meme-generation β Image generator with Pillow (#2344)
- Bioinformatics gateway skill β index to 400+ bio skills (#2387)
- Inference.sh skill (terminal-based) (#1686)
- Base blockchain optional skill (#1643)
- 3D-model-viewer optional skill (#2226)
- FastMCP optional skill (#2113)
- Hermes-agent-setup skill (#1905)
- TUI extension hooks β Build custom CLIs on top of Hermes (#2333)
hermes plugins install/remove/listcommands (#2337)- Slash command registration for plugins (#2359)
session:endlifecycle event hook (#1725)- Fix: require opt-in for project plugin discovery (#2215)
- SSRF protection for vision_tools and web_tools (#2679)
- Shell injection prevention in
_expand_pathvia~userpath suffix (#2685) - Block untrusted browser-origin API server access (#2451)
- Block sandbox backend creds from subprocess env (#1658)
- Block @ references from reading secrets outside workspace (#2601 by @Gutslabs)
- Malicious code pattern pre-exec scanner for terminal_tool (#2245)
- Harden terminal safety and sandbox file writes (#1653)
- PKCE verifier leak fix + OAuth refresh Content-Type (#1775)
- Eliminate SQL string formatting in
execute()calls (#2061 by @dusterbloom) - Harden jobs API β input limits, field whitelist, startup check (#2456)
- Thread locks on 4 SessionDB methods (#1704)
- File locking for concurrent memory writes (#1726)
- Handle OpenRouter errors gracefully (#2112)
- Guard print() calls against OSError (#1668)
- Safely handle non-string inputs in redacting formatter (#2392, #1700)
- ACP: preserve session provider on model switch, persist sessions to disk (#2380, #2071)
- API server: persist ResponseStore to SQLite across restarts (#2472)
- Fix:
fetch_nous_modelsalways TypeError from positional args (#1699) - Fix: resolve merge conflict markers in cli.py breaking startup (#2347)
- Fix:
minisweagent_path.pymissing from wheel (#2098 by @JiwaniZakir)
[SILENT]response β cron agents can suppress delivery (#1833)- Scale missed-job grace window with schedule frequency (#2449)
- Recover recent one-shot jobs (#1918)
- Fix: normalize
repeat<=0to None β jobs deleted after first run when LLM passes -1 (#2612 by @Mibayy) - Fix: Matrix added to scheduler delivery platform_map (#2167 by @buntingszn)
- Fix: naive ISO timestamps without timezone β jobs fire at wrong time (#1729)
- Fix:
get_due_jobsreadsjobs.jsontwice β race condition (#1716) - Fix: silent jobs return empty response for delivery skip (#2442)
- Fix: stop injecting cron outputs into gateway session history (#2313)
- Fix: close abandoned coroutine when
asyncio.run()raises RuntimeError (#2317)
- Resolve all consistently failing tests (#2488)
- Replace
FakePathwithmonkeypatchfor Python 3.12 compat (#2444) - Align Hermes setup and full-suite expectations (#1710)
- Comprehensive docs update for recent features (#1693, #2183)
- Alibaba Cloud and DingTalk setup guides (#1687, #1692)
- Detailed skills documentation (#2244)
- Honcho self-hosted / Docker configuration (#2475)
- Context length detection FAQ and quickstart references (#2179)
- Fix docs inconsistencies across reference and user guides (#1995)
- Fix MCP install commands β use uv, not bare pip (#1909)
- Replace ASCII diagrams with Mermaid/lists (#2402)
- Gemini OAuth provider implementation plan (#2467)
- Discord Server Members Intent marked as required (#2330)
- Fix MDX build error in api-server.md (#1787)
- Align venv path to match installer (#2114)
- New skills added to hub index (#2281)
- @teknium1 (Teknium) β 280 PRs
- @mchzimm (to_the_max) β GitHub Copilot provider integration (#1879)
- @jquesnelle (Jeffrey Quesnelle) β Per-thread persistent event loops fix (#2214)
- @llbn (lbn) β Telegram MarkdownV2 strikethrough, spoiler, blockquotes, and escape fixes (#2199, #2200)
- @dusterbloom β SQL injection prevention + local server context window querying (#2061, #2091)
- @0xbyt4 β Anthropic tool_calls None guard + OpenCode-Go provider config fix (#2209, #2393)
- @sai-samarth (Saisamarth) β WhatsApp send_message routing + systemd node path (#1769, #1767)
- @Gutslabs (Guts) β Block @ references from reading secrets (#2601)
- @Mibayy (Mibay) β Cron job repeat normalization (#2612)
- @ten-jampa (Tenzin Jampa) β Gateway /title command fix (#2379)
- @cutepawss (lila) β File tools search pagination fix (#1824)
- @hanai (Hanai) β OpenAI TTS base_url support (#2064)
- @rovle (Lovre PeΕ‘ut) β Daytona sandbox API migration (#2063)
- @buntingszn (bunting szn) β Matrix cron delivery support (#2167)
- @InB4DevOps β Token counter reset on new session (#2101)
- @JiwaniZakir (Zakir Jiwani) β Missing file in wheel fix (#2098)
- @ygd58 (buray) β Delegate tool parent tool names fix (#2083)
Full Changelog: v2026.3.17...v2026.3.23