RuleShield for Hermes Agent

What if your AI agent could learn to optimize -- and eventually evolve -- itself?

RuleShield is an LLM cost optimizer that sits between your Hermes Agent and any LLM provider. It passes each request through five layers (cache, rules, template optimizer, Hermes Bridge, smart router), sends whatever remains to the cheapest capable model, and improves its own rules through a feedback loop. In tests against the Nous Research API, it cut costs by 47-82% depending on workload.

Quickstart (npm, recommended)

git clone https://github.com/banse/RuleShield.git
cd RuleShield
npm run setup:hermes
npm run start

Open:

http://127.0.0.1:8347/test-monitor (temporary default UI)
http://127.0.0.1:8347/monitor (live metrics)

Run the first demo check in a second terminal:

cd RuleShield
bash ./demo/test_training_health_check.sh

Notes:

uses the local Hermes-auth based setup
default local proxy port is 8347
training/test scripts read the configured proxy port automatically
load the page and run demo scripts only after setting OPENROUTER_API_KEY or OPENAI_API_KEY in ~/.hermes/.env

If setup fails, rerun:

npm run setup:hermes

Alternative (pip / manual)

npm run setup:hermes
npm run start

Drop-in SDK Wrapper

# Before (standard OpenAI):
from openai import OpenAI
client = OpenAI()

# After (one line change):
from ruleshield.sdk import OpenAI
client = OpenAI()

# Everything else stays exactly the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

TypeScript / Node.js

// Before:
import OpenAI from 'openai';

// After:
import { OpenAI } from '@ruleshield/sdk';

// Everything else stays the same
const client = new OpenAI();

Install: npm install @ruleshield/sdk openai

Dashboard

RuleShield includes a real-time web dashboard, landing page, and documentation site.

Terminal 1: ruleshield start
Terminal 2: cd dashboard && npm run dev

Page	URL	Description
Dashboard	localhost:5174	Real-time stats, requests, rules
Rule Explorer	localhost:5174/rules	Toggle rules on/off, sort, filter
Documentation	localhost:5174/docs	Architecture, API, Hermes guide
Slides	localhost:5174/slides	10-slide hackathon presentation

Supported Models (80+)

Provider	Models	Tier
OpenAI	GPT-4o, GPT-4.1, GPT-4.5, o3, o4-mini	Mid-Premium
OpenAI Codex	GPT-5.3-codex, GPT-5.2-codex, GPT-5.1-codex-mini/max	Mid
Anthropic	Claude Opus 4.6, Sonnet 4.6, Haiku 4.5	All tiers
Google	Gemini 2.5 Pro, Gemini 2.0 Flash	Mid-Cheap
DeepSeek	DeepSeek-V3, DeepSeek-R1	Cheap-Mid
Nous/Hermes	Hermes 4-14B/70B/405B	All tiers
Open Source	Llama 3.x, Qwen 2.5/3, Mistral, Phi-4, Gemma 2	Cheap-Mid
Ollama (local)	Llama 3.x, Mistral, Phi, Gemma, CodeLlama	Free (local)

Model-aware confidence thresholds automatically adjust rule aggressiveness per model tier.

How It Works: 5-Layer Architecture

Every request is checked against five layers in order. The first layer that can handle it responds, and the remaining layers are skipped.

Request -> Cache -> Rules -> Template Optimizer -> Hermes Bridge -> Smart Router -> Upstream LLM
           $0       $0       $0 (trimming)        ~$0.001 (opt.)   auto-routing     full price
                      |                                                  |
                 Feedback Loop (accept/reject -> auto-promote)    Provider Retry
                      |
                 RL/GEPA stubs (future: self-evolution)
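
The "first layer that can handle it wins" flow can be sketched as a chain of handlers. This is a minimal illustration, not RuleShield's actual code: `make_pipeline`, `cache_layer`, `rules_layer`, and `upstream_llm` are hypothetical names, and the real template optimizer and router transform or redirect requests rather than simply answering them.

```python
from typing import Callable, Optional

# A layer takes the prompt and returns a response, or None to pass it on.
Layer = Callable[[str], Optional[str]]


def make_pipeline(layers: list[tuple[str, Layer]], upstream: Callable[[str], str]):
    """Build a handler that tries each layer in order, then falls back upstream."""

    def handle(prompt: str) -> tuple[str, str]:
        for name, layer in layers:
            response = layer(prompt)
            if response is not None:
                return name, response  # first layer with an answer wins
        return "upstream", upstream(prompt)

    return handle


# Toy stand-ins for the real layers (assumptions for illustration only).
cache_store = {"ping": "pong"}


def cache_layer(prompt: str) -> Optional[str]:
    return cache_store.get(prompt)


def rules_layer(prompt: str) -> Optional[str]:
    return "Hello! How can I help you?" if "hello" in prompt.lower() else None


def upstream_llm(prompt: str) -> str:
    return f"<LLM answer to: {prompt}>"


handle = make_pipeline([("cache", cache_layer), ("rules", rules_layer)], upstream_llm)
```

Calling `handle("ping")` is answered by the cache layer, `handle("hello there")` by the rules layer, and anything else falls through to the upstream stand-in.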

Layer 1: Semantic Cache ($0)

Two-tier caching that catches identical and near-identical requests.

Exact match: SHA-256 hash lookup. Same prompt = instant answer.
Semantic match: Sentence-transformer embeddings with cosine similarity (threshold 0.92). "What's the weather?" and "Tell me the weather" both hit cache.
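
A two-tier lookup along these lines might look like the sketch below. It assumes an `embed` callable that returns a vector. In RuleShield that role is played by a sentence-transformer model, while `toy_embed` here is a hypothetical stand-in so the example runs without extra dependencies. The class name `TwoTierCache` is also an assumption.

```python
import hashlib
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TwoTierCache:
    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.92):
        self.embed = embed
        self.threshold = threshold
        self.exact: dict[str, str] = {}               # SHA-256 digest -> response
        self.semantic: list[tuple[Vector, str]] = []  # (embedding, response)

    @staticmethod
    def _key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def put(self, prompt: str, response: str) -> None:
        self.exact[self._key(prompt)] = response
        self.semantic.append((self.embed(prompt), response))

    def get(self, prompt: str) -> Optional[str]:
        # Tier 1: exact hash match.
        hit = self.exact.get(self._key(prompt))
        if hit is not None:
            return hit
        # Tier 2: best cosine match at or above the threshold.
        query = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.semantic:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model: counts a few words."""
    vocab = ["weather", "time", "code", "status"]
    words = text.lower().replace("?", "").replace("'s", "").split()
    return [float(words.count(w)) for w in vocab]
```

With `toy_embed`, storing an answer for "What's the weather?" also serves "Tell me the weather", while "What time is it?" misses. A linear scan is fine for a sketch, but a real deployment would likely need a vector index.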

Layer 2: Weighted Rule Engine ($0)

Pattern matching with weighted scoring for common prompt families.

75 rules across 4 packs: 8 default, 12 advanced, 30 customer support, 25 coding assistant
Pattern types: exact, contains, startswith, regex
Weighted keyword and regex scoring
Confidence levels: CONFIRMED / LIKELY / POSSIBLE
Auto-extraction generates new rules from observed traffic
Rules fire in under 2ms

{
  "id": "greeting",
  "patterns": [
    {"type": "contains", "value": "hello", "field": "last_user_message"},
    {"type": "regex", "value": "^(hi|hey|greetings)", "field": "last_user_message"}
  ],
  "response": {"content": "Hello! How can I help you?"},
  "confidence": 0.95,
  "priority": 10
}
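
To make the four pattern types concrete, here is a simplified matcher for a rule shaped like the one above. It is a sketch under stated assumptions: matching is case-insensitive, a rule fires when any one pattern matches, and the helper names (`pattern_matches`, `last_user_message`, `match_rule`) are hypothetical. The actual engine uses weighted scoring rather than this any-match shortcut.

```python
import re
from typing import Optional

RULE = {
    "id": "greeting",
    "patterns": [
        {"type": "contains", "value": "hello", "field": "last_user_message"},
        {"type": "regex", "value": "^(hi|hey|greetings)", "field": "last_user_message"},
    ],
    "response": {"content": "Hello! How can I help you?"},
    "confidence": 0.95,
    "priority": 10,
}


def pattern_matches(pattern: dict, text: str) -> bool:
    """Evaluate one pattern of type exact, contains, startswith, or regex."""
    kind, value = pattern["type"], pattern["value"]
    lowered = text.lower()
    if kind == "exact":
        return lowered == value.lower()
    if kind == "contains":
        return value.lower() in lowered
    if kind == "startswith":
        return lowered.startswith(value.lower())
    if kind == "regex":
        return re.search(value, text, re.IGNORECASE) is not None
    raise ValueError(f"unknown pattern type: {kind}")


def last_user_message(messages: list[dict]) -> str:
    """Return the content of the most recent user message, or an empty string."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content", "")
    return ""


def match_rule(rule: dict, messages: list[dict]) -> Optional[str]:
    """Return the canned response if any pattern matches, else None."""
    text = last_user_message(messages)
    for pattern in rule["patterns"]:
        if pattern.get("field") == "last_user_message" and pattern_matches(pattern, text):
            return rule["response"]["content"]
    return None
```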

Layer 3: Hermes Bridge (~$0.001, optional)

A local Hermes Agent instance, running on a cheap model, handles medium-complexity requests. Requests too complex for rules but too simple for premium models are routed here first.

Layer 4: Smart Model Router (auto-pricing)

A complexity classifier analyzes each request and routes it to the cheapest capable model.

Complexity	Routing	Cost
Simple (score 1-3)	Cheap model (e.g., GPT-4o-mini)	~$0.001
Medium (score 4-7)	Mid-tier model	~$0.005
Complex (score 8-10)	Premium model (e.g., Opus)	~$0.015

The classifier uses prompt length, message count, keyword analysis, and question type to score complexity on a 1-10 scale.
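
As a rough illustration of how those four signals could combine into a 1-10 score, consider the sketch below. The weights, the keyword list, and the function names are all assumptions made for this example and are not taken from RuleShield's router. Only the tier boundaries (1-3, 4-7, 8-10) come from the table above.

```python
# Hypothetical keyword list; the real classifier's vocabulary is not documented here.
COMPLEX_KEYWORDS = ("analyze", "refactor", "architecture", "prove", "debug")


def complexity_score(messages: list[dict]) -> int:
    """Score a request from 1 (trivial) to 10 (hard) using simple heuristics."""
    prompt = messages[-1]["content"] if messages else ""
    lowered = prompt.lower()

    score = 1
    score += min(len(prompt) // 200, 3)   # prompt length: up to +3
    score += min(len(messages) // 4, 2)   # message count: up to +2
    keyword_hits = sum(1 for word in COMPLEX_KEYWORDS if word in lowered)
    score += min(2 * keyword_hits, 4)     # keyword analysis: up to +4
    if lowered.startswith(("why", "how")):
        score += 1                        # open-ended question type: +1
    return max(1, min(score, 10))


def route(score: int) -> str:
    """Map a complexity score to the tier names used in the table above."""
    if score <= 3:
        return "cheap"
    if score <= 7:
        return "mid"
    return "premium"
```

Under these made-up weights, a bare "hello" scores 1 and goes to the cheap tier, while a long debugging question with several keywords lands in the premium tier.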

Feedback Loop: Self-Improving Rules

RuleShield learns from your feedback. Accept or reject any intercepted response, and the system adjusts confidence scores using bandit-style updates.

# Review recent interceptions
ruleshield feedback

# Accept a good interception (boosts rule confidence)
ruleshield feedback --accept <request-id>

# Reject a bad one (lowers confidence, rule eventually disables itself)
ruleshield feedback --reject <request-id>

Rules that consistently get rejected lose confidence and stop firing. Rules that get accepted grow stronger. The system improves itself over time.
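
One common way to implement a bandit-style update is a Beta-Bernoulli estimate, where each accept or reject nudges the rule's confidence. The sketch below takes that approach as an assumption. The source does not say which update rule RuleShield uses, and `RuleStats`, the prior weights, and the 0.7 cutoff are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Accept/reject counts for one rule, with a prior encoding 0.95 confidence."""

    accepts: int = 0
    rejects: int = 0
    prior_accepts: float = 9.5  # assumed prior: 10 pseudo-observations at 0.95
    prior_rejects: float = 0.5

    @property
    def confidence(self) -> float:
        a = self.prior_accepts + self.accepts
        b = self.prior_rejects + self.rejects
        return a / (a + b)  # posterior mean of the acceptance rate

    def record(self, accepted: bool) -> None:
        if accepted:
            self.accepts += 1
        else:
            self.rejects += 1


def is_active(stats: RuleStats, min_confidence: float = 0.7) -> bool:
    """A rule keeps firing only while its confidence stays above the cutoff."""
    return stats.confidence >= min_confidence
```

With these numbers a rule starts at 0.95, rises slightly with each accept, and drops below the 0.7 cutoff after five straight rejections (9.5 / 15 is about 0.63), at which point it would stop firing.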

Hermes Integration

RuleShield integrates with the Hermes ecosystem at three levels.

Hermes Skills

Ask your agent about its own efficiency:

"Show me my cost savings" -- Full breakdown: requests, cache hits, rule hits, router decisions, dollars saved.
"What rules have you learned?" -- Lists active rules with hit counts and confidence scores.

MCP Server

Four tools available via JSON-RPC stdio for deep agent integration:

Tool	Description
`get_stats`	Current session statistics and savings
`list_rules`	All active rules with metadata
`add_rule`	Register a new rule programmatically
`get_savings`	Cost breakdown and savings percentage
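
For orientation, a client would invoke one of these tools by writing a JSON-RPC 2.0 message to the server's stdin. The sketch below builds such a message, assuming the server follows the standard Model Context Protocol `tools/call` method with newline-delimited JSON framing. The helper name `tool_call` is hypothetical, and the `initialize` handshake that normally opens an MCP session is omitted.

```python
import json
from typing import Optional


def tool_call(request_id: int, tool: str, arguments: Optional[dict] = None) -> str:
    """Serialize a tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return json.dumps(message) + "\n"


# Example: ask the server for current session statistics.
line = tool_call(1, "get_stats")
```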

Config Integration

RuleShield can patch your Hermes config automatically. Run ruleshield init --hermes and it will:

patch ~/.hermes/config.yaml if it already exists
or create a minimal starter config for a blank local Hermes setup
point model.base_url at the RuleShield proxy

Rollback:

ruleshield restore-hermes

Auth stays local. Typical local setups use one of ~/.codex/auth.json, ~/.hermes/.env, or shell environment variables.

Live Dashboard

Real-time terminal dashboard built with Rich:

+------------------------------------------------------------------+
|  RuleShield for Hermes Agent                        LIVE  02:34  |
|                                                                  |
|  Requests    Cache     Rules     Bridge    Router    Savings      |
|    147        63         31        12        41       82%         |
|              43%        21%        8%       28%                   |
|                                                                  |
|  Cost Savings                                                    |
|  Without RuleShield:  $4.20                                      |
|  With RuleShield:     $0.76                                      |
|  Saved: $3.44  ████████████████████████░░░░░░  82%               |
|                                                                  |
|  Recent Requests                                                 |
|   #147  "what is the status of..."   CACHE    $0.00   $0.034     |
|   #146  "hello"                      RULE     $0.00   $0.012     |
|   #145  "summarize this code..."     ROUTER   $0.001  $0.015     |
|   #144  "analyze this dataset..."    LLM      $0.045  -          |
+------------------------------------------------------------------+

Prompt Trimming

RuleShield splits requests into known and unknown parts. System prompts that repeat every call get cached separately. Only the novel user content counts toward API costs.
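
The known/unknown split could be approximated by fingerprinting system prompts and treating any repeat as known. This is a sketch of the idea only. `PromptSplitter` is a hypothetical name, and the real template optimizer may split at a finer granularity than whole messages.

```python
import hashlib


class PromptSplitter:
    """Separates system prompts seen on earlier calls from novel content."""

    def __init__(self) -> None:
        self._seen: set[str] = set()  # SHA-256 digests of system prompts

    def split(self, messages: list[dict]) -> tuple[list[dict], list[dict]]:
        """Return (known, novel) message lists for one request."""
        known: list[dict] = []
        novel: list[dict] = []
        for message in messages:
            if message.get("role") != "system":
                novel.append(message)  # user and assistant content is always novel
                continue
            content = message.get("content", "")
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            (known if digest in self._seen else novel).append(message)
            self._seen.add(digest)
        return known, novel
```

On the first call a system prompt counts as novel. From the second call onward the same prompt is classified as known, leaving only the new user content in the novel list.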

RL Training Interface (Stubs)

The feedback loop lays the groundwork for reinforcement learning. Interface stubs are in place for:

GRPO/Atropos: Group Relative Policy Optimization for rule quality
DSPy/GEPA: Genetic-Pareto reflective prompt evolution

These are not yet active but define the path toward an agent that evolves its own optimization strategy.

Results

Tested against the Nous Research Inference API with 4 real-world demo scenarios:

Scenario	Savings	Resolution Mix
Morning Workflow (greetings, status)	82%	Mostly cache + rules
Code Review (analysis tasks)	47%	Router saves on simple reviews
Research Session (complex queries)	52%	Bridge + Router split
Cron-style Recurring Tasks	78%	Cache dominates

Metric	Value
Total cost reduction	47-82% depending on workload
Cache/rule response time	<5ms
LLM passthrough overhead	negligible
Setup time	<2 minutes
Code changes required	zero

CLI Reference (15 commands)

ruleshield init              # Set up config + rules + Hermes integration
ruleshield start             # Start the proxy (with live dashboard)
ruleshield stop              # Stop the proxy
ruleshield stats             # Show current session savings
ruleshield rules             # List active rules with hit counts
ruleshield feedback          # Review and rate recent interceptions
ruleshield shadow-stats      # Shadow mode comparison statistics
ruleshield analyze-crons     # Identify recurring prompts for optimization
ruleshield test-slack        # Verify Slack webhook configuration
ruleshield promote-rule      # Promote a shadow rule to active
ruleshield auto-promote      # Auto-promote all qualifying shadow rules
ruleshield discover-templates # Discover recurring prompt templates
ruleshield templates         # List active templates and hit rates
ruleshield wrapped           # Generate a wrapped-style summary report
ruleshield restore-hermes    # Roll back the Hermes config patch

Configuration

All settings live in ~/.ruleshield/config.yaml:

provider_url: https://api.openai.com    # upstream LLM provider
api_key: ""                              # or set RULESHIELD_API_KEY env var
port: 8347                               # proxy port
cache_enabled: true
rules_enabled: true
router_enabled: true                     # smart model routing
hermes_bridge_enabled: false             # optional Hermes Bridge
hermes_bridge_model: claude-haiku-4-5    # model for bridge requests
shadow_mode: false                       # log only, no interceptions
prompt_trimming_enabled: true            # template optimization
max_retries: 3                           # provider retry attempts
slack_webhook: ""                        # Slack notifications

Override any setting with an environment variable that uses the RULESHIELD_ prefix:

RULESHIELD_ROUTER_ENABLED=true RULESHIELD_SHADOW_MODE=false ruleshield start
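
The override behavior can be pictured as a merge of environment variables over the file-based settings. The sketch below is an assumption about how such a merge might work, not a copy of config.py: `apply_env_overrides` is a hypothetical helper, and the type coercion rules (booleans from "true"/"1"/"yes"/"on", integers via `int`) are illustrative.

```python
import os
from typing import Mapping, Optional

# A subset of the settings shown above, used as defaults for the example.
DEFAULTS = {
    "provider_url": "https://api.openai.com",
    "port": 8347,
    "cache_enabled": True,
    "router_enabled": True,
    "shadow_mode": False,
}


def apply_env_overrides(
    config: dict,
    environ: Optional[Mapping[str, str]] = None,
    prefix: str = "RULESHIELD_",
) -> dict:
    """Return a copy of config with PREFIX + KEY environment values applied."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for key, default in config.items():
        raw = environ.get(prefix + key.upper())
        if raw is None:
            continue
        # Check bool before int: bool is a subclass of int in Python.
        if isinstance(default, bool):
            merged[key] = raw.strip().lower() in ("1", "true", "yes", "on")
        elif isinstance(default, int):
            merged[key] = int(raw)
        else:
            merged[key] = raw
    return merged
```

For example, passing `{"RULESHIELD_SHADOW_MODE": "true", "RULESHIELD_PORT": "9000"}` as `environ` yields a config with shadow mode on and port 9000, while untouched keys keep their file values.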

Project Structure

ruleshield-hermes/
  ruleshield/
    proxy.py           # FastAPI proxy server (OpenAI-compatible, streaming)
    cache.py           # 2-layer cache (hash + semantic)
    rules.py           # Weighted pattern matching engine
    router.py          # Smart model router + complexity classifier
    hermes_bridge.py   # Local Hermes Agent bridge
    feedback.py        # Bandit-style feedback loop
    extractor.py       # Auto rule extraction from traffic
    metrics.py         # Real-time dashboard + metrics
    config.py          # Configuration management
    cli.py             # CLI entry point
    template_optimizer.py  # Prompt template discovery and optimization
    sdk.py             # Python SDK (drop-in OpenAI replacement)
  rules/
    default_hermes.json       # 8 default rules
    advanced_hermes.json      # 12 advanced rules
    customer_support.json     # 30 customer support rules
    coding_assistant.json     # 25 coding assistant rules
  sdk-node/            # Node/TypeScript SDK
  skills/
    cost_report/       # Hermes Skill: cost savings report
    show_rules/        # Hermes Skill: list active rules
  demo/
    scenarios/         # 4 tested demo scenarios
  Dockerfile           # Container support
  docker-compose.yml   # Multi-service orchestration
  .github/workflows/   # CI + publish + cost-report actions

Roadmap

Community

Contributing Guide -- How to set up dev environment and contribute
Code of Conduct -- Our community standards
Security Policy -- How to report vulnerabilities
Good First Issues -- Great starting points for contributors

Ways to contribute

Rule Packs: Create new domain-specific rule packs (DevOps, data science, legal)
Provider Adapters: Add support for additional LLM providers (Groq, Together, Fireworks)
Dashboard Plugins: New widgets, charts, themes
Integration Guides: AutoGPT, LlamaIndex, and more
Translations: CLI and dashboard in other languages

Built With

FastAPI -- Async Python proxy
SvelteKit -- Dashboard and docs
Tailwind CSS -- Styling
sentence-transformers -- Semantic cache embeddings
Rich -- Terminal dashboard
Click -- CLI framework

Built for the Hermes Agent Hackathon by NousResearch.

License

MIT -- see LICENSE

Built for the Hermes Agent Hackathon

The idea: what if the agent itself could learn to reduce its own costs, improve its own rules, and eventually evolve its own optimization strategy? Not by being less capable -- by being smarter about when it needs the LLM at all.

RuleShield does not make your agent dumber. It makes it self-aware.
