memex

OSS Memory Manager for AI Coding Assistants — filesystem-native, tool-agnostic, LLM-optional.

Current AI coding assistants (Claude Code, Cursor, Aider, Windsurf, Codex) all have memory systems, but they're all broken in the same ways: memory files bloat over time, stale entries waste context tokens, there's no deduplication, no decay, no quality scoring, and no cross-tool portability. memex fixes all of that.

Features

L1/L2/L3 cache hierarchy — always-loaded index, on-demand topic files, searchable archive
Jaccard dedup — 4-operation model (ADD/UPDATE/DELETE/NOOP) with similarity-based dedup
Configurable decay — half-life per category (preferences never expire, project context has a 14-day half-life)
Cross-tool export — Claude Code, Cursor, Aider, AGENTS.md from a single source of truth
Claude Code hooks — auto-extract memories on PreCompact, Stop, SessionEnd, SessionStart
MCP server mode — mid-conversation memory access via memex serve
Code scanning — ast-grep/semgrep detect conventions directly from source code
Memory validation — cross-reference memories against actual code, flag contradictions
LoCoMo benchmark — built-in harness to measure memory quality with F1 scoring
LLM-optional — core features (add, search, audit, prune, export) work without an API key

Quick Start

Install from npm

npm install -g memex-ai

Initialize in your project

cd your-project
memex init

This creates a .memex/ directory:

.memex/
├── config.json   — configuration (thresholds, decay rates)
├── index.md      — L1 memory index (always loaded, ~50-80 lines)
├── topics/       — L2 topic files (loaded on-demand, <100 lines each)
└── archive/      — L3 JSON entries (full metadata, searchable)

Add memories

memex add "Always use TypeScript strict mode" --category preference --tags "typescript,config"
memex add "API uses JWT in httpOnly cookies" --category decision --tags "auth,security" --confidence 0.85
memex add "Python 3.14 selected by uv despite >=3.12 constraint" --category gotcha --tags "python,uv"

Duplicate detection runs automatically — similar entries merge instead of duplicating.
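
As a rough illustration of how similarity-based dedup can work, the sketch below computes Jaccard similarity over word sets and maps the best match to ADD, UPDATE, or NOOP. The tokenizer, the function names, and both thresholds (0.6 and 0.95) are assumptions for illustration, not memex's actual values. DELETE is left out because overlap alone cannot tell whether a new entry contradicts an old one.

```typescript
// Hypothetical sketch: token-set Jaccard similarity driving ADD/UPDATE/NOOP.
// Thresholds and tokenization are assumptions, not memex's real settings.
type DedupOp = "ADD" | "UPDATE" | "NOOP";

function tokenize(text: string): Set<string> {
  // Lowercase, then split on anything that is not a letter or digit.
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

function jaccard(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared++;
  });
  // |A ∩ B| / |A ∪ B|
  return shared / (setA.size + setB.size - shared);
}

function decideOp(
  candidate: string,
  existing: string[],
  updateAt = 0.6,
  noopAt = 0.95
): { op: DedupOp; match?: string; similarity: number } {
  let best: { match?: string; similarity: number } = { similarity: 0 };
  for (const entry of existing) {
    const similarity = jaccard(candidate, entry);
    if (similarity > best.similarity) best = { match: entry, similarity };
  }
  if (best.similarity >= noopAt) return { op: "NOOP", ...best };
  if (best.similarity >= updateAt) return { op: "UPDATE", ...best };
  return { op: "ADD", similarity: best.similarity };
}
```

With these assumed thresholds, adding "Always use TypeScript strict mode in tsconfig" next to the first example entry shares 5 of 7 distinct words (similarity of about 0.71), so it would be treated as an UPDATE rather than a new entry.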

Search, audit, and prune

memex search "typescript"          # Fuzzy search across all tiers
memex status                       # Dashboard: counts, staleness, token budget
memex audit                        # Score entries, flag stale/duplicate
memex prune --dry-run              # Preview what would be removed
memex prune                        # Archive entries below score threshold

Export to your tools

memex export --claude              # → .claude/memory/MEMORY.md + CLAUDE.md
memex export --cursor              # → .cursor/rules/*.mdc
memex export --aider               # → CONVENTIONS.md
memex export --agents-md           # → AGENTS.md (universal format)
memex export --all                 # All of the above

Scan codebase for conventions

memex scan                         # Auto-detect languages, scan with ast-grep
memex scan --backend semgrep       # Use semgrep instead
memex scan ./src --lang typescript # Scan specific dir/language
memex scan --dry-run               # Preview without saving
memex scan --benchmark             # Compare ast-grep vs semgrep speed & accuracy

Detects patterns like: async/await vs promises, interface vs type alias, error handling style, import conventions, test patterns, and more — across TypeScript, JavaScript, Python, Go, and Rust.

Validate memories against code

memex validate                     # Check all memories against codebase
memex validate --backend semgrep   # Use semgrep for validation
memex validate --json              # Machine-readable output

Flags memories that are contradicted by actual code (e.g., memory says "uses interfaces" but code has mostly type aliases).
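
The check described above can be pictured as a comparison of match counts for two opposing patterns. The following is a minimal sketch under that assumption. The `checkClaim` function, the `minMatches` cutoff, and the simple majority rule are hypothetical. The pattern names are borrowed from the scanner benchmark tables in this README.

```typescript
// Hypothetical sketch: a memory is "contradicted" when the opposing
// pattern has more matches in the codebase than the claimed one.
type Verdict = "supported" | "contradicted" | "inconclusive";

interface PatternCounts {
  [pattern: string]: number;
}

function checkClaim(
  counts: PatternCounts,
  claimed: string,
  opposing: string,
  minMatches = 5 // assumed cutoff below which there is too little evidence
): Verdict {
  const forClaim = counts[claimed] ?? 0;
  const against = counts[opposing] ?? 0;
  if (forClaim + against < minMatches) return "inconclusive";
  return against > forClaim ? "contradicted" : "supported";
}

// Counts taken from the ast-grep column of the Vercel AI SDK benchmark.
const verdict = checkClaim(
  { "interface-over-type": 279, "type-aliases": 985 },
  "interface-over-type",
  "type-aliases"
);
```

Here a memory claiming "uses interfaces" would come back as contradicted, since type aliases outnumber interfaces 985 to 279.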

Local Development Setup

Prerequisites

Node.js >= 18.0.0
npm >= 9.0.0
Git

Clone and install

git clone https://github.com/rachittshah/memex.git
cd memex
npm install

Build

npm run build        # Compile TypeScript → dist/
npm run dev          # Watch mode (recompiles on changes)

Run locally

# Run directly from source
node dist/cli.js --help

# Or link globally for development
npm link
memex --help

Test

npm test             # Run all 88 tests
npm run test:watch   # Watch mode
npm run lint         # Type-check without emitting

Project structure

src/
├── cli.ts                  # CLI entry point (commander.js)
├── core/
│   ├── schema.ts           # MemoryEntry types, validation, factories
│   ├── store.ts            # CRUD for L3 archive (atomic JSON writes)
│   ├── tiers.ts            # L1/L2 tier management (markdown files)
│   └── index.ts            # Index builder (L1 generation from L2/L3)
├── algorithms/
│   ├── scoring.ts          # Effective score: confidence × access × decay
│   ├── dedup.ts            # Jaccard similarity + 4-op model
│   ├── decay.ts            # Half-life decay calculator
│   └── promote.ts          # Tier promotion/demotion logic
├── llm/
│   ├── extract.ts          # LLM-powered memory extraction from transcripts
│   └── consolidate.ts      # LLM-powered merge and dedup
├── exporters/
│   ├── claude.ts           # CLAUDE.md + MEMORY.md export
│   ├── cursor.ts           # .cursor/rules/*.mdc export
│   ├── aider.ts            # CONVENTIONS.md export
│   └── agents-md.ts        # AGENTS.md universal export
├── hooks/
│   ├── claude-code.ts      # Claude Code hook handlers
│   ├── mcp-server.ts       # MCP server mode (JSON-RPC over stdio)
│   └── generic.ts          # Generic transcript extraction
├── scanner/
│   ├── patterns.ts         # Built-in ast-grep/semgrep patterns (TS, Python, Go, Rust)
│   ├── scanner.ts          # Code scanning engine (ast-grep + semgrep backends)
│   └── validator.ts        # Memory validation against code
├── bench/
│   ├── locomo.ts           # LoCoMo dataset loader
│   ├── runner.ts           # Benchmark execution engine
│   ├── evaluator.ts        # F1 scoring + metrics
│   ├── baselines.ts        # Baseline implementations (+ scan baseline)
│   └── report.ts           # Results formatting
└── commands/
    ├── init.ts, status.ts, add.ts, search.ts
    ├── audit.ts, prune.ts, export.ts, scan.ts, validate.ts
    └── consolidate.ts, extract.ts, bench.ts, serve.ts
tests/
├── schema.test.ts      # 9 tests
├── store.test.ts       # 13 tests
├── scoring.test.ts     # 6 tests
├── dedup.test.ts       # 15 tests
├── decay.test.ts       # 11 tests
├── tiers.test.ts       # 19 tests
├── exporters.test.ts   # 9 tests
└── commands.test.ts    # 6 tests (CLI integration)
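
The tree lists store.ts as doing atomic JSON writes. One common way to get that property in Node.js is to write to a temporary file and rename it over the target. The sketch below shows that technique. The `writeJsonAtomic` helper and the temp-file naming are assumptions, not the actual store.ts code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: write JSON to a temp file in the same directory,
// then rename it over the target. On POSIX filesystems a rename within one
// filesystem replaces the target in a single step, so readers see either
// the old file or the new one, never a half-written file.
function writeJsonAtomic(filePath: string, data: unknown): void {
  const dir = path.dirname(filePath);
  const tmpPath = path.join(dir, `.${path.basename(filePath)}.${process.pid}.tmp`);
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2) + "\n", "utf8");
  fs.renameSync(tmpPath, filePath);
}
```

Keeping the temp file in the same directory as the target matters, because a rename across filesystems is not a single-step operation.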

Claude Code Integration

Auto-install hooks

memex init --claude

This installs hooks into .claude/settings.json that fire on Claude Code lifecycle events:

Hook	Event	What memex does
`PreCompact`	Before context compression	Extracts memories from conversation about to be compressed (highest-signal moment)
`Stop` (async)	After Claude responds	Background extraction from last assistant message
`SessionEnd`	Session terminates	Final extraction + consolidation check
`SessionStart`	Session begins	Injects L1 index + relevant L2 topics into context

memex writes to .claude/memory/MEMORY.md so it works alongside Claude Code's native auto-memory — it's additive, not a replacement.

MCP Server

For deeper integration, run memex as an MCP server:

memex serve

Add to your Claude Code MCP config:

{
  "mcpServers": {
    "memex": {
      "command": "memex",
      "args": ["serve"]
    }
  }
}

This gives Claude access to memory_search, memory_add, and memory_stats tools mid-conversation.
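
For orientation, this is roughly what an MCP client sends over stdio when it calls one of these tools. The message shape (JSON-RPC 2.0, method `tools/call`, `params.name`, `params.arguments`) follows the MCP specification. The `query` argument name is an assumption, so check the input schema the server reports via `tools/list`. A real client also completes the `initialize` handshake before calling tools.

```typescript
// Sketch of the JSON-RPC 2.0 request used to call an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The stdio transport sends each message as a single line of JSON.
// `query` is a hypothetical argument name for memory_search.
const line =
  JSON.stringify(buildToolCall(1, "memory_search", { query: "typescript" })) + "\n";
```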

LLM-Powered Features

These features require an Anthropic API key:

export ANTHROPIC_API_KEY=sk-ant-...
npm install @anthropic-ai/sdk     # Optional dependency

Extract memories from transcripts

memex extract --file transcript.jsonl
cat transcript.jsonl | memex extract --from-stdin --trigger session-end

Consolidate (merge + dedup)

memex consolidate              # LLM merges overlapping entries
memex consolidate --dry-run    # Preview without applying

LoCoMo Benchmark

Measure memory quality against the LoCoMo dataset (10 long-term conversations, 300 turns each):

memex bench                          # Full benchmark
memex bench --quick                  # Quick (2 conversations)
memex bench --baselines none,naive   # Compare against baselines
memex bench --export results.json    # Export results
memex bench --ci --threshold 0.65    # CI mode (exit 1 if F1 < 65%)

Baselines: none (no memory), naive (full text), l1 (index only), l2 (index + topics), full (all tiers).
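
The README does not spell out how F1 is computed. A common choice for question-answering benchmarks is token-level F1 between the predicted and reference answers, sketched below as an assumption. The `tokenF1` name and the tokenizer are hypothetical, and the real evaluator may normalize text differently.

```typescript
// Hypothetical token-level F1 between a predicted and a reference answer.
function tokens(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function tokenF1(prediction: string, reference: string): number {
  const pred = tokens(prediction);
  const ref = tokens(reference);
  if (pred.length === 0 || ref.length === 0) {
    return pred.length === ref.length ? 1 : 0;
  }
  // Count overlapping tokens, respecting how often each one occurs.
  const remaining = new Map<string, number>();
  for (const t of ref) remaining.set(t, (remaining.get(t) ?? 0) + 1);
  let overlap = 0;
  for (const t of pred) {
    const left = remaining.get(t) ?? 0;
    if (left > 0) {
      overlap++;
      remaining.set(t, left - 1);
    }
  }
  if (overlap === 0) return 0;
  const precision = overlap / pred.length;
  const recall = overlap / ref.length;
  return (2 * precision * recall) / (precision + recall);
}
```

Under this definition, a CI gate like `--ci --threshold 0.65` would amount to comparing the mean of these per-question scores against 0.65.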

Memory Schema

Each entry in the archive:

{
  "id": "uuid",
  "content": "Always use TypeScript strict mode",
  "category": "preference",
  "confidence": 0.95,
  "access_count": 3,
  "last_accessed": "2025-02-24T06:30:00.000Z",
  "created": "2025-02-20T10:00:00.000Z",
  "updated": "2025-02-24T06:30:00.000Z",
  "decay_days": Infinity,
  "source": "manual",
  "tags": ["typescript", "config"],
  "related_files": [],
  "status": "active"
}

Categories and decay

Category	Half-life	Use case
`preference`	Never	User preferences, permanent conventions
`decision`	90 days	Architectural/design choices
`pattern`	60 days	Recurring approaches, conventions
`tool`	45 days	Tool-specific knowledge
`gotcha`	30 days	Bugs, pitfalls, warnings
`project`	14 days	Project context (fast-moving)
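
A half-life means the decay factor halves each time that many days pass. The sketch below applies the standard exponential half-life formula to the categories in the table above. Which timestamp the age is measured from (creation, update, or last access) is not stated here, so `ageDays` is left as an input, and the `decayFactor` name is hypothetical.

```typescript
// Half-lives in days, copied from the table above. Infinity means no decay.
const HALF_LIFE_DAYS: Record<string, number> = {
  preference: Infinity,
  decision: 90,
  pattern: 60,
  tool: 45,
  gotcha: 30,
  project: 14,
};

// Exponential half-life decay: factor = 0.5 ^ (age / halfLife).
function decayFactor(category: string, ageDays: number): number {
  const halfLife = HALF_LIFE_DAYS[category];
  if (halfLife === undefined) throw new Error(`unknown category: ${category}`);
  if (!Number.isFinite(halfLife)) return 1; // "Never" in the table
  return Math.pow(0.5, Math.max(0, ageDays) / halfLife);
}
```

For example, a `project` entry is at half weight after 14 days and a quarter after 28, while a `preference` entry stays at full weight indefinitely.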

Scoring

effective_score = confidence × max(1, log2(access_count + 1)) × decay_factor

Entries scoring below 0.3 are flagged as stale
Entries scoring below 0.1 are flagged as critical and pruned
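
The formula and thresholds above translate directly into code. In this sketch the function names are hypothetical and the decay factor is passed in as a number between 0 and 1.

```typescript
type Health = "ok" | "stale" | "critical";

// effective_score = confidence × max(1, log2(access_count + 1)) × decay_factor
function effectiveScore(
  confidence: number,
  accessCount: number,
  decayFactor: number
): number {
  // Entries with 0 or 1 accesses get no boost; the boost grows logarithmically.
  const accessBoost = Math.max(1, Math.log2(accessCount + 1));
  return confidence * accessBoost * decayFactor;
}

function classify(score: number): Health {
  if (score < 0.1) return "critical"; // pruned
  if (score < 0.3) return "stale";
  return "ok";
}
```

As a worked example, an entry with the default confidence of 0.7 that has never been accessed scores about 0.175 once its decay factor falls to 0.25 (two half-lives), which flags it as stale. At a decay factor of 0.125 (three half-lives) it scores about 0.0875 and becomes critical. Note that the access boost can push scores above 1: confidence 0.95 with 3 accesses and no decay gives 0.95 × 2 = 1.9.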

Tier promotion

Transition	Condition
L3 → L2	`access_count > 3` and `confidence > 0.7`
L2 → L1	`access_count > 10` (cross-project pattern)
L1 → L2	Not accessed for 30 days
L2 → L3	Effective score < 0.3
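
The transition table can be read as a small state function. The sketch below is one possible encoding. The `TierInput` shape and the `nextTier` name are hypothetical, checking demotion before promotion for L2 entries is an assumption, and the "cross-project pattern" qualifier on L2 → L1 is not modeled.

```typescript
type Tier = "L1" | "L2" | "L3";

interface TierInput {
  tier: Tier;
  accessCount: number;
  confidence: number;
  daysSinceAccess: number;
  effectiveScore: number;
}

// Applies the transition table to one entry and returns its next tier.
function nextTier(e: TierInput): Tier {
  switch (e.tier) {
    case "L3":
      // L3 → L2: access_count > 3 and confidence > 0.7
      return e.accessCount > 3 && e.confidence > 0.7 ? "L2" : "L3";
    case "L2":
      // L2 → L3: effective score < 0.3 (checked first, by assumption)
      if (e.effectiveScore < 0.3) return "L3";
      // L2 → L1: access_count > 10
      if (e.accessCount > 10) return "L1";
      return "L2";
    case "L1":
      // L1 → L2: not accessed for 30 days
      return e.daysSinceAccess >= 30 ? "L2" : "L1";
  }
}
```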

CLI Reference

memex init [--claude] [--force]      Initialize .memex directory
memex status                         Dashboard with health metrics
memex add <text> [options]           Add a memory entry
  --category <cat>                     pattern|decision|gotcha|preference|project|tool
  --tags <tags>                        Comma-separated tags
  --confidence <n>                     0.0-1.0 (default: 0.7)
memex search <query> [options]       Fuzzy search across tiers
  --category <cat>                     Filter by category
  --limit <n>                          Max results (default: 10)
memex audit [--json]                 Score and flag entries
memex prune [options]                Remove low-scoring entries
  --threshold <n>                      Score threshold (default: 0.1)
  --dry-run                            Preview without removing
  --hard                               Permanently delete (vs archive)
memex export [options]               Export to tool formats
  --claude / --cursor / --aider / --agents-md / --all
memex consolidate [--dry-run]        LLM-powered merge (needs API key)
memex extract [options]              Extract from transcripts (needs API key)
  --from-stdin                         Read from stdin (hook mode)
  --file <path>                        Read from file
  --trigger <event>                    pre-compact|stop|session-end
memex bench [options]                Run LoCoMo benchmark
  --quick                              2 conversations only
  --baselines <list>                   Comma-separated baselines
  --ci --threshold <n>                 CI mode with F1 threshold
memex scan [dir] [options]           Scan codebase for conventions
  --backend <backend>                  ast-grep | semgrep | auto
  --lang <languages>                   Comma-separated languages
  --min-matches <n>                    Minimum matches to report
  --dry-run                            Preview without saving
  --benchmark                          Compare ast-grep vs semgrep
memex validate [dir] [options]       Validate memories against code
  --backend <backend>                  ast-grep | semgrep | auto
  --json                               Machine-readable output
memex serve                          Start MCP server (stdio)

Scanner Benchmark: ast-grep vs semgrep

memex supports two code scanning backends. Run memex scan --benchmark to compare them on your codebase. Here are results from three real-world projects:

Vercel AI SDK (3,490 TS/JS files)

Backend	Time	Patterns Checked	Detected	Total Matches
ast-grep	15.2s	21	14	8,235
semgrep	180.0s	26	5	6,855

ast-grep is 11.8x faster and detects nearly 3x more patterns.

Pattern-by-pattern comparison

Pattern	ast-grep	semgrep	Agreement
async-await	258	0	divergent
arrow-functions	328	0	divergent
try-catch-error-handling	248	0	divergent
named-exports	327	0	divergent
default-exports	354	0	divergent
describe-it-tests	1,828	0	divergent
console-error-logging	206	0	divergent
optional-chaining	2,196	3,992	divergent
nullish-coalescing	1,222	2,504	divergent
interface-over-type	279	359	divergent
type-aliases	985	0	divergent
enum-usage	4	0	divergent

memex (34 TS files)

Backend	Time	Patterns Checked	Detected	Total Matches
ast-grep	254ms	12	7	168
semgrep	37.4s	12	4	2,783

ast-grep is 147x faster. Semgrep's inflated match count comes from an overly broad enum pattern (2,693 false positives).

Pattern-by-pattern comparison

Pattern	ast-grep	semgrep	Agreement
async-await	0	0	match
arrow-functions	0	0	match
try-catch-error-handling	6	0	divergent
interface-over-type	32	32	match
type-aliases	9	0	divergent
enum-usage	0	2,693	divergent
named-exports	0	0	match
default-exports	0	0	match
describe-it-tests	23	0	divergent
console-error-logging	44	0	divergent
optional-chaining	13	13	match
nullish-coalescing	41	45	close

Go stdlib (11,022 Go files)

Backend	Time	Detected	Notes
ast-grep	222ms	3 (JS only)	Cannot scan Go
semgrep	87.5s	0	Go supported, but no pattern met the match threshold

Summary

Metric	ast-grep	semgrep
Speed	12-147x faster	Baseline
TS/JS accuracy	Higher	Lower (pattern syntax differences)
Language support	TypeScript, JavaScript	+ Python, Go, Rust
Install	`npm install @ast-grep/napi`	`pip install semgrep`
Best for	TS/JS projects (default)	Multi-language codebases

Recommendation: Use ast-grep (the default) for TypeScript/JavaScript projects. Use --backend semgrep when you need Python, Go, or Rust scanning.

Design Decisions

Filesystem-native — agents using simple file operations outperform complex memory solutions. No databases, no Docker.
4-op dedup model — ADD/UPDATE/DELETE/NOOP with Jaccard similarity is the sweet spot for memory consolidation
< 150 lines auto-loaded — modular rules reduce noise by 40%, theme-based organization beats chronological
Pattern detection at 2+ occurrences — recurring items auto-promote to higher tiers
Code-aware scanning — ast-grep/semgrep detect conventions directly from source code, not just conversations

License

MIT
