
feat: add G0DM0D3 godmode jailbreaking skill #3157

Merged
teknium1 merged 1 commit into main from hermes/hermes-998d1c81
Mar 26, 2026


@teknium1
Contributor

Summary

Adds the godmode jailbreaking skill — automated LLM jailbreaking using G0DM0D3 techniques.

What's included:

  • Skill (skills/red-teaming/godmode/): SKILL.md + 4 scripts + 2 reference docs + 2 prefill templates
  • Docs (website/docs/user-guide/skills/godmode.md): Full user-facing documentation
  • Sidebar update: New entry under User Guide > Skills

Three attack modes:

  1. GODMODE CLASSIC — Battle-tested system prompt templates (5 model+prompt combos from L1B3RT4S)
  2. PARSELTONGUE — 33 input obfuscation techniques across 3 tiers
  3. ULTRAPLINIAN — Multi-model racing (55 models via OpenRouter)

Auto-jailbreak pipeline:

  • Detects current model from config
  • Maps to model family with family-specific strategy order
  • Tests strategies with canary queries, scores responses
  • Locks in winning combo by writing config.yaml + prefill.json

Tested against Claude Sonnet 4 (March 2026):

  • boundary_inversion: PATCHED (no longer works)
  • refusal_inversion: Works for gray-area queries
  • Parseltongue: Ineffective against Claude
  • For hard refusals: model-switching (ULTRAPLINIAN) is the practical fallback

Source credits: G0DM0D3 + L1B3RT4S by Pliny the Prompter

@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: base64 encoding/decoding detected

Base64 has legitimate uses (images, JWT, etc.) but is also commonly used to obfuscate malicious payloads. Verify the usage is appropriate.

Matches (first 20):

2353:+        return base64.b64encode(word.encode()).decode()

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

68:+exec(open(os.path.expanduser(
85:+**Important:** Always use `load_godmode.py` instead of loading individual scripts directly. The individual scripts have `argparse` CLI entry points and `__name__` guards that break when loaded via `exec()` in execute_code. The loader handles this.
201:+exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/parseltongue.py")).read())
238:+exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
406:+9. **Always use `load_godmode.py` in execute_code** — The individual scripts (`parseltongue.py`, `godmode_race.py`, `auto_jailbreak.py`) have argparse CLI entry points with `if __name__ == '__main__'` blocks. When loaded via `exec()` in execute_code, `__name__` is `'__main__'` and argparse fires, crashing the script. The `load_godmode.py` loader handles this by setting `__name__` to a non-main value and managing sys.argv.
532:+exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
681:+exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
706:+    exec(open(os.path.expanduser(
752:+    exec(compile(open(_parseltongue_path).read(), str(_parseltongue_path), 'exec'), _caller_globals)
754:+    exec(compile(open(_race_path).read(), str(_race_path), 'exec'), _caller_globals)
1485:+    exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
2018:+    exec(open(os.path.expanduser(
2042:+    exec(compile(open(path).read(), str(path), 'exec'), ns)
2078:+    exec(open("~/.hermes/skills/red-teaming/godmode/scripts/parseltongue.py").read())
2728:+exec(open(os.path.expanduser(
2909:+5. **Always use `load_godmode.py` in execute_code** — The individual scripts (`parseltongue.py`, `godmode_race.py`, `auto_jailbreak.py`) have argparse CLI entry points. When loaded via `exec()` in execute_code, `__name__` is `'__main__'` and argparse fires, crashing the script. The loader handles this.
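
The `__name__` problem described in the matches above can be illustrated with a minimal sketch. It is not the PR's `load_godmode.py` (which, per the matches, uses `exec(compile(...))` into the caller's globals); it swaps in the standard-library `importlib` machinery to load a script under a non-main module name, so an `if __name__ == '__main__'` block does not fire. The helper name `load_script`, the module name `loaded_script`, and the demo script are all hypothetical. The sketch also calls `os.path.expanduser`, because `open()` does not expand `~` on its own.

```python
import importlib.util
import os
import tempfile


def load_script(path, module_name="loaded_script"):
    """Load a .py file as a module whose __name__ is not '__main__'."""
    full_path = os.path.expanduser(path)  # open() would treat "~" literally
    spec = importlib.util.spec_from_file_location(module_name, full_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file with __name__ == module_name
    return module


# Hypothetical demo script: a helper plus a CLI entry point behind a guard.
demo_source = (
    "def shout(text):\n"
    "    return text.upper()\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    raise SystemExit('CLI entry point fired')\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = os.path.join(tmp_dir, "demo_script.py")
    with open(script_path, "w") as handle:
        handle.write(demo_source)
    module = load_script(script_path)
    result = module.shout("ok")
```

Because the module is executed under the name `loaded_script`, the guarded block is skipped and only the function definitions are made available to the caller.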

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

752:+    exec(compile(open(_parseltongue_path).read(), str(_parseltongue_path), 'exec'), _caller_globals)
754:+    exec(compile(open(_race_path).read(), str(_race_path), 'exec'), _caller_globals)
2042:+    exec(compile(open(path).read(), str(path), 'exec'), ns)

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.
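
A scan of this kind can be approximated with a few regular expressions run over the added lines of a diff. The sketch below is an assumption about how such a check might work, not the actual supply-chain-audit implementation: the pattern names, the regexes, the function `scan_diff`, and the sample diff are all hypothetical. It reports matches as `line:content`, capped at 20 per category to mirror the "first 20" wording above.

```python
import re

# Hypothetical patterns; the real audit rules are not shown in this PR.
PATTERNS = {
    "base64": re.compile(r"\bbase64\.(?:b64encode|b64decode)\b"),
    "exec_eval": re.compile(r"\b(?:exec|eval)\s*\("),
    "compile": re.compile(r"\b(?:marshal|pickle)\.\w+|\bcompile\s*\("),
}


def scan_diff(diff_text, limit=20):
    """Return {category: ["lineno:line", ...]} for added lines in a unified diff."""
    findings = {name: [] for name in PATTERNS}
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only added lines count; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line) and len(findings[name]) < limit:
                findings[name].append(f"{lineno}:{line}")
    return findings


sample_diff = """+++ b/scripts/example.py
+import base64
+token = base64.b64encode(word.encode()).decode()
 unchanged = eval("1")
+exec(compile(source, path, 'exec'), ns)
"""

findings = scan_diff(sample_diff)
```

Pattern matching like this cannot tell documentation from executable code, which is why prose lines that merely mention `exec()` also appear in the matches above and why the bot asks for human review.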

@teknium1 teknium1 merged commit 26bfdc2 into main Mar 26, 2026
2 of 3 checks passed
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026