# Context Mode
**The other half of the context problem.**
## The Problem
Every MCP tool call dumps raw data into your context window. A Playwright snapshot costs 56 KB. Twenty GitHub issues cost 59 KB. One access log — 45 KB. After 30 minutes, 40% of your context is gone. And when the agent compacts the conversation to free space, it forgets which files it was editing, what tasks are in progress, and what you last asked for.
Context Mode is an MCP server that solves both halves of this problem:
1. **Context Saving** — Sandbox tools keep raw data out of the context window. 315 KB becomes 5.4 KB. 98% reduction.
2. **Session Continuity** — Every file edit, git operation, task, error, and user decision is tracked in SQLite. When the conversation compacts, context-mode doesn't dump this data back into context — it indexes events into FTS5 and retrieves only what's relevant via BM25 search. The model picks up exactly where you left off. If you don't \`--continue\`, previous session data is deleted immediately — a fresh session means a clean slate.
https://github.com/user-attachments/assets/07013dbf-07c0-4ef1-974a-33ea1207637b
## Install
## Tools
| Tool | What it does | Context saved |
|---|---|---|
| \`ctx_batch_execute\` | Run multiple commands + search multiple queries in ONE call. | 986 KB → 62 KB |
| \`ctx_execute\` | Run code in 11 languages. Only stdout enters context. | 56 KB → 299 B |
| \`ctx_execute_file\` | Process files in sandbox. Raw content never leaves. | 45 KB → 155 B |
| \`ctx_index\` | Chunk markdown into FTS5 with BM25 ranking. | 60 KB → 40 B |
| \`ctx_search\` | Query indexed content with multiple queries in one call. | On-demand retrieval |
| \`ctx_fetch_and_index\` | Fetch URL, detect content type (HTML/JSON/text), chunk and index. | 60 KB → 40 B |
| \`ctx_stats\` | Show context savings, call counts, and session statistics. | — |
| \`ctx_doctor\` | Diagnose installation: runtimes, hooks, FTS5, versions. | — |
| \`ctx_upgrade\` | Upgrade to latest version from GitHub, rebuild, reconfigure hooks. | — |
## How the Sandbox Works
Each \`ctx_execute\` call spawns an isolated subprocess with its own process boundary. Scripts can't access each other's memory or state. The subprocess runs your code, captures stdout, and only that stdout enters the conversation context. The raw data — log files, API responses, snapshots — never leaves the sandbox.
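As a rough illustration of that boundary, here is a minimal sketch in Python. It is not Context Mode's implementation: `run_sandboxed` is a hypothetical helper, and a plain subprocess gives process isolation only, not a security sandbox. The point is that the script can touch as much data as it likes while only what it prints comes back.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_sandboxed(code: str, timeout_s: float = 30.0) -> str:
    """Run Python source in a separate process and return only its stdout."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        completed = subprocess.run(
            [sys.executable, str(script)],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    # stderr and any data the script held in memory are dropped here;
    # only what the script chose to print is returned.
    return completed.stdout


# The script builds 5,000 fake log lines but prints a one-line summary.
snippet = """
lines = [f"GET /item/{i} {500 if i % 100 == 0 else 200}" for i in range(5000)]
errors = [line for line in lines if line.endswith(" 500")]
print(f"{len(lines)} requests, {len(errors)} errors")
"""
summary = run_sandboxed(snippet)
print(summary)
```

The 5,000 log lines never reach the caller; the single summary line is all that would enter the conversation.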
Eleven language runtimes are available: JavaScript, TypeScript, Python, Shell, Ruby, Go, Rust, PHP, Perl, R, and Elixir. Bun is auto-detected for 3-5x faster JS/TS execution.
Authenticated CLIs work through credential passthrough — \`gh\`, \`aws\`, \`gcloud\`, \`kubectl\`, \`docker\` inherit environment variables and config paths without exposing them to the conversation.
When output exceeds 5 KB and an \`intent\` is provided, Context Mode switches to intent-driven filtering: it indexes the full output into the knowledge base, searches for sections matching your intent, and returns only the relevant matches with a vocabulary of searchable terms for follow-up queries.
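The decision logic can be sketched as follows. This is an assumption-laden stand-in: `filter_by_intent` is a hypothetical name, sections are split on blank lines, and plain term counting replaces the FTS5 knowledge-base search that Context Mode actually uses. The returned vocabulary of follow-up terms is not modeled.

```python
from typing import Optional

INTENT_THRESHOLD_BYTES = 5 * 1024  # the 5 KB cutoff described above


def filter_by_intent(output: str, intent: Optional[str], max_sections: int = 3) -> str:
    """Return small output unchanged; otherwise keep only sections matching the intent."""
    if intent is None or len(output.encode("utf-8")) <= INTENT_THRESHOLD_BYTES:
        return output
    terms = {term for term in intent.lower().split() if len(term) > 2}
    sections = [s for s in output.split("\n\n") if s.strip()]
    scored = []
    for position, section in enumerate(sections):
        text = section.lower()
        score = sum(text.count(term) for term in terms)
        if score > 0:
            scored.append((score, position, section))
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return "\n\n".join(section for _, _, section in scored[:max_sections])


noise = "\n\n".join(f"request {i} ok" for i in range(1000))
big_output = noise + "\n\nconnection timeout on db-1"
filtered = filter_by_intent(big_output, "timeout errors")
```

Output under the threshold, or a call without an intent, passes through untouched.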
## How the Knowledge Base Works
The \`ctx_index\` tool chunks markdown content by headings while keeping code blocks intact, then stores them in a **SQLite FTS5** (Full-Text Search 5) virtual table. Search uses **BM25 ranking**, a probabilistic relevance algorithm that scores documents based on term frequency, inverse document frequency, and document length normalization. **Porter stemming** is applied at index time so "running", "runs", and "run" match the same stem.
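Heading-based chunking that respects code fences might look like the sketch below. `chunk_markdown` is a hypothetical helper written for illustration, not the project's chunker; it treats any `#` line outside a fence as a chunk boundary.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block
HEADING = re.compile(r"#{1,6}\s+(.*)")


def chunk_markdown(markdown: str) -> list[tuple[str, str]]:
    """Split markdown into (heading, body) chunks without splitting inside a code fence."""
    chunks: list[tuple[str, str]] = []
    title, body, in_fence = "", [], False

    def flush() -> None:
        text = "\n".join(body).strip()
        if title or text:
            chunks.append((title, text))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines inside a fence are never treated as headings.
        heading = None if in_fence else HEADING.match(line)
        if heading:
            flush()
            title, body = heading.group(1).strip(), []
        else:
            body.append(line)
    flush()
    return chunks


sample = "\n".join([
    "# Setup",
    "Install the package.",
    FENCE + "bash",
    "# this comment is not a heading",
    "npm install",
    FENCE,
    "## Usage",
    "Run the server.",
])
chunks = chunk_markdown(sample)
```

The shell comment stays inside the "Setup" chunk because it sits between fences.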
When you call \`ctx_search\`, it returns relevant content snippets focused around matching query terms — not full documents, not approximations, the actual indexed content with smart extraction around what you're looking for. \`ctx_fetch_and_index\` extends this to URLs: fetch, convert HTML to markdown, chunk, index. The raw page never enters context.
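To make the index-then-search flow concrete, here is a small sketch using Python's built-in `sqlite3` module. It assumes the local SQLite build includes FTS5. The table name, columns, and sample rows are invented for the example and say nothing about Context Mode's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The porter tokenizer stems terms at index time and at query time.
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(title, body, tokenize='porter')")
conn.executemany(
    "INSERT INTO chunks (title, body) VALUES (?, ?)",
    [
        ("Migrations", "The server is running database migrations on startup."),
        ("Caching", "Responses are cached for five minutes before they expire."),
        ("Auth", "Tokens expire after one hour and must be refreshed."),
    ],
)


def search(query: str, limit: int = 2) -> list[tuple[str, str]]:
    """Return (title, snippet) pairs ordered by BM25 relevance, best match first."""
    return conn.execute(
        "SELECT title, snippet(chunks, 1, '[', ']', '...', 8) "
        "FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    ).fetchall()


hits = search("runs")
```

The query "runs" finds the chunk containing "running" because both reduce to the same stem. FTS5's `bm25()` returns lower values for better matches, so ascending order puts the best hit first.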
### Fuzzy Search
Search uses a three-layer fallback to handle typos, partial terms, and substring matches:
- **Layer 1 — Porter stemming**: Standard FTS5 MATCH with porter tokenizer. "caching" matches "cached", "caches", "cach".
- **Layer 2 — Trigram substring**: FTS5 trigram tokenizer matches partial strings. "useEff" finds "useEffect", "authenticat" finds "authentication".
- **Layer 3 — Fuzzy correction**: Levenshtein distance corrects typos before re-searching. "kuberntes" → "kubernetes", "autentication" → "authentication".
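The last two layers can be approximated in a few lines. This sketch substitutes a plain substring test for the FTS5 trigram tokenizer and corrects against an in-memory vocabulary; `correct_term`, `find_terms`, and the distance cutoff of 2 are assumptions made for the example.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute each cost 1)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def correct_term(term: str, vocabulary: list[str], max_distance: int = 2) -> Optional[str]:
    """Return the closest known term, or None if nothing is within max_distance."""
    best = min(vocabulary, key=lambda word: levenshtein(term, word), default=None)
    if best is not None and levenshtein(term, best) <= max_distance:
        return best
    return None


def find_terms(query: str, vocabulary: list[str]) -> list[str]:
    """Try a substring match first, then fall back to typo correction."""
    partial = [word for word in vocabulary if query.lower() in word.lower()]
    if partial:
        return partial
    fixed = correct_term(query, vocabulary)
    return [fixed] if fixed else []


vocabulary = ["kubernetes", "authentication", "useEffect"]
```

With this vocabulary, `find_terms("useEff", vocabulary)` resolves through the substring layer, while `find_terms("kuberntes", vocabulary)` only succeeds once the typo is corrected.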
### Smart Snippets
Search results use intelligent extraction instead of truncation. Instead of returning the first N characters (which might miss the important part), Context Mode finds where your query terms appear in the content and returns windows around those matches.
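A minimal version of window extraction is sketched below. `extract_snippets` and the character-based `radius` are illustrative choices, not the project's extraction logic; overlapping windows are merged so the same text is not returned twice.

```python
import re


def extract_snippets(content: str, query: str, radius: int = 40) -> list[str]:
    """Return text windows of `radius` characters around each query-term match."""
    terms = [re.escape(term) for term in query.split() if term]
    if not terms:
        return []
    spans: list[tuple[int, int]] = []
    for match in re.finditer("|".join(terms), content, flags=re.IGNORECASE):
        start = max(0, match.start() - radius)
        end = min(len(content), match.end() + radius)
        if spans and start <= spans[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new one.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return [content[start:end] for start, end in spans]


content = "x" * 500 + " cleanup function " + "y" * 500
snippets = extract_snippets(content, "cleanup", radius=10)
```

Truncating `content` to its first 27 characters would return only filler; the window of the same size lands on the match instead.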
### Progressive Throttling
- **Calls 1-3:** Normal results (2 per query)
- **Calls 4-8:** Reduced results (1 per query) + warning
- **Calls 9+:** Blocked — redirects to \`ctx_batch_execute\`
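The tiers above amount to a small lookup on the call counter. In this sketch the function name and the returned dictionary shape are hypothetical, and the window over which calls are counted is not modeled because the text does not specify it.

```python
def results_for_call(call_number: int) -> dict:
    """Map the Nth search call (1-based) to its throttling tier."""
    if call_number < 1:
        raise ValueError("call_number starts at 1")
    if call_number <= 3:
        return {"allowed": True, "results_per_query": 2, "warning": None}
    if call_number <= 8:
        return {
            "allowed": True,
            "results_per_query": 1,
            "warning": "Many search calls; consider batching queries.",
        }
    return {
        "allowed": False,
        "results_per_query": 0,
        "warning": "Blocked: use ctx_batch_execute instead.",
    }


tiers = [results_for_call(n)["results_per_query"] for n in (1, 3, 4, 8, 9)]
```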
## Session Continuity
When the context window fills up, the agent compacts the conversation — dropping older messages to make room. Without session tracking, the model forgets which files it was editing, what tasks are in progress, what errors were resolved, and what you last asked for.
Context Mode captures every meaningful event during your session and persists them in a per-project SQLite database. When the conversation compacts (or you resume with \`--continue\`), your working state is rebuilt automatically — the model continues from your last prompt without asking you to repeat anything.
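One way to picture the event store is a single append-only table. The schema, event kinds, and helper names below are hypothetical and chosen only to illustrate the idea; the real database layout is not described in this document.

```python
import json
import sqlite3
import time
from typing import Optional


def open_session_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a session database; a real setup would use a per-project file."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            id INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,
            payload TEXT NOT NULL,
            created_at REAL NOT NULL
        )
        """
    )
    return conn


def record_event(conn: sqlite3.Connection, kind: str, payload: dict) -> None:
    """Append one event with a JSON payload."""
    conn.execute(
        "INSERT INTO events (kind, payload, created_at) VALUES (?, ?, ?)",
        (kind, json.dumps(payload), time.time()),
    )
    conn.commit()


def latest(conn: sqlite3.Connection, kind: str) -> Optional[dict]:
    """Return the most recent event of a given kind, or None."""
    row = conn.execute(
        "SELECT payload FROM events WHERE kind = ? ORDER BY id DESC LIMIT 1", (kind,)
    ).fetchone()
    return json.loads(row[0]) if row else None


db = open_session_db()
record_event(db, "file_edit", {"path": "src/routes.ts"})
record_event(db, "user_prompt", {"text": "add error handling"})
record_event(db, "user_prompt", {"text": "now add tests"})
last_prompt = latest(db, "user_prompt")
```

After compaction, a lookup like `latest(db, "user_prompt")` is the kind of query that lets the model resume from the last request.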
Session continuity requires 4 hooks working together:
| Hook | Role | Claude Code | Gemini CLI | VS Code Copilot | OpenCode | Codex CLI |
|---|---|:---:|:---:|:---:|:---:|:---:|
| **PostToolUse** | Captures events after each tool call | Yes | Yes | Yes | Plugin | -- |
| **UserPromptSubmit** | Captures user decisions and corrections | Yes | -- | -- | -- | -- |
| **PreCompact** | Builds snapshot before compaction | Yes | Yes | Yes | Plugin | -- |
| **SessionStart** | Restores state after compaction | Yes | Yes | Yes | -- | -- |
| | **Session completeness** | **Full** | **High** | **High** | **High** | **--** |
> **Note:** Session continuity (capture + snapshot + restore) works on **Claude Code**, **Gemini CLI**, **VS Code Copilot**, and **OpenCode**, but only Claude Code supports all four hooks. OpenCode uses the \`experimental.session.compacting\` plugin hook for compaction recovery. SessionStart is not yet available there ([#14808](https://github.com/sst/opencode/issues/14808)), so startup/resume is not supported. Codex CLI has no hook support, so session tracking is not available.
\`\`\`
directive into context
→ Model continues from last user prompt with full working state
\`\`\`
The snapshot is built in priority tiers — if the 2 KB budget is tight, lower-priority events (intent, MCP tool counts) are dropped first while critical state (active files, tasks, rules, decisions) is always preserved.
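The tiering can be sketched as a greedy pack under a byte budget. The numeric priorities, the `build_snapshot` name, and the rule that tier 1 is kept even when it alone exceeds the budget are assumptions made for illustration.

```python
BUDGET_BYTES = 2 * 1024  # the 2 KB snapshot budget


def build_snapshot(events: list[tuple[int, str]], budget: int = BUDGET_BYTES) -> str:
    """Pack (priority, text) events into a snapshot; priority 1 is critical."""
    kept: list[str] = []
    used = 0
    # sorted() is stable, so events keep their original order within a tier.
    for priority, text in sorted(events, key=lambda event: event[0]):
        size = len(text.encode("utf-8")) + 1  # +1 for the joining newline
        if priority == 1 or used + size <= budget:
            kept.append(text)
            used += size
    return "\n".join(kept)


events = [
    (3, "intent: implement"),
    (1, "file: src/app.ts"),
    (2, "x" * 3000),
    (1, "task: add tests"),
]
tight = build_snapshot(events, budget=40)
```

With a 40-byte budget only the two critical entries survive; the intent line and the oversized entry are dropped.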
After compaction, the model receives a **Session Guide** — a structured narrative with actionable sections:
- **Last Request** — user's last prompt, so the model continues without asking "what were we doing?"
- **Tasks** — checkbox format with completion status (\`[x]\` completed, \`[ ]\` pending)
- **Key Decisions** — user corrections and preferences ("use X instead", "don't do Y")
- **Files Modified** — all files touched during the session
- **Unresolved Errors** — errors that haven't been fixed
- **Git** — operations performed (checkout, commit, push, status)
- **Project Rules** — CLAUDE.md / GEMINI.md / AGENTS.md paths
- **MCP Tools Used** — tool names with call counts
- **Subagent Tasks** — delegated work summaries
- **Skills Used** — slash commands invoked
- **Environment** — working directory, env variables
- **Data References** — large data pasted during the session
- **Session Intent** — mode classification (implement, investigate, review, discuss)
- **User Role** — behavioral directives set during the session
Detailed event data is also indexed into FTS5 for on-demand retrieval via \`ctx_search\`.
## Platform Compatibility
| Feature | Claude Code | Gemini CLI (Beta) | VS Code Copilot (Beta) | OpenCode (Beta) | Codex CLI (Beta) |
|---|:---:|:---:|:---:|:---:|:---:|
| MCP Server | Yes | Yes | Yes | Yes | Yes |
| PreToolUse Hook | Yes | Yes | Yes | Plugin | -- |
| PostToolUse Hook | Yes | Yes | Yes | Plugin | -- |
| SessionStart Hook | Yes | Yes | Yes | -- | -- |
| Can Modify Args | Yes | Yes | Yes | Plugin | -- |
| Can Block Tools | Yes | Yes | Yes | Plugin | -- |
| Utility Commands (ctx) | Yes | Yes | Yes | Yes | Yes |
| Slash Commands | Yes | -- | -- | -- | -- |
| Plugin Marketplace | Yes | -- | -- | -- | -- |
> **OpenCode** uses a TypeScript plugin paradigm — hooks run as in-process functions via \`tool.execute.before\`, \`tool.execute.after\`, and \`experimental.session.compacting\`, providing the same routing enforcement and session continuity as shell-based hooks. SessionStart is not yet available ([#14808](https://github.com/sst/opencode/issues/14808)), but compaction recovery works via the plugin's compacting hook.
>
> **Codex CLI** does not support hooks. It relies solely on routing instruction files (\`AGENTS.md\`) for enforcement (~60% compliance).
### Routing Enforcement
Hooks intercept tool calls programmatically — they can block dangerous commands and redirect them to the sandbox before execution. Instruction files guide the model via prompt instructions but cannot block anything. **Always enable hooks where supported.**
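Conceptually, a pre-execution hook is a function that sees the tool call and returns a verdict. The sketch below is not any platform's real hook API: the `action`/`reason` shape, the `ROUTES` table, and the choice of `ctx_fetch_and_index` as the redirect target for `curl` and `wget` are all assumptions.

```python
# Hypothetical routing table: shell programs that should go through the sandbox.
ROUTES = {
    "curl": "ctx_fetch_and_index",
    "wget": "ctx_fetch_and_index",
}


def pre_tool_use(tool_name: str, tool_input: dict) -> dict:
    """Decide whether a tool call may run, or should be redirected to a sandbox tool."""
    if tool_name != "Bash":
        return {"action": "allow"}
    words = tool_input.get("command", "").split()
    program = words[0] if words else ""
    if program in ROUTES:
        return {
            "action": "block",
            "reason": f"Use {ROUTES[program]} so the response stays in the sandbox.",
        }
    return {"action": "allow"}


verdict = pre_tool_use("Bash", {"command": "curl https://example.com/api"})
```

An instruction file can only ask the model to make this choice itself, which is why enforcement without hooks is weaker.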
| Platform | Hooks | Instruction File | With Hooks | Without Hooks |
|---|:---:|---|:---:|:---:|
| Claude Code | Yes (auto) | [\`CLAUDE.md\`](configs/claude-code/CLAUDE.md) | **~98% saved** | ~60% saved |
| Gemini CLI | Yes | [\`GEMINI.md\`](configs/gemini-cli/GEMINI.md) | **~98% saved** | ~60% saved |
| VS Code Copilot | Yes | [\`copilot-instructions.md\`](configs/vscode-copilot/copilot-instructions.md) | **~98% saved** | ~60% saved |
| OpenCode | Plugin | [\`AGENTS.md\`](configs/opencode/AGENTS.md) | **~98% saved** | ~60% saved |
| Codex CLI | -- | [\`AGENTS.md\`](configs/codex/AGENTS.md) | -- | ~60% saved |
Without hooks, one unrouted \`curl\` or Playwright snapshot can dump 56 KB into context — wiping out an entire session's worth of savings.
See [\`docs/platform-support.md\`](docs/platform-support.md) for the full capability comparison.
## Utility Commands
**Inside any AI session** — just type the command. The LLM calls the MCP tool automatically:
\`\`\`
ctx stats → context savings, call counts, session report
ctx doctor → diagnose runtimes, hooks, FTS5, versions
ctx upgrade → update from GitHub, rebuild, reconfigure hooks
\`\`\`
**From your terminal** — run directly without an AI session:
\`\`\`bash
context-mode doctor
context-mode upgrade
\`\`\`
Works on **all platforms**. On Claude Code, slash commands (\`/ctx-stats\`, \`/ctx-doctor\`, \`/ctx-upgrade\`) are also available.
## Benchmarks
| Scenario | Raw | Context | Saved |
|---|---|---|---|
| Playwright snapshot | 56.2 KB | 299 B | 99% |
| GitHub Issues (20) | 58.9 KB | 1.1 KB | 98% |
| Access log (500 requests) | 45.1 KB | 155 B | >99% |
| Context7 React docs | 5.9 KB | 261 B | 96% |
| Analytics CSV (500 rows) | 85.5 KB | 222 B | >99% |
| Git log (153 commits) | 11.6 KB | 107 B | 99% |
| Test output (30 suites) | 6.0 KB | 337 B | 95% |
| Repo research (subagent) | 986 KB | 62 KB | 94% |
Over a full session: 315 KB of raw output becomes 5.4 KB. Session time extends from ~30 minutes to ~3 hours.
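The savings figures follow directly from the raw and in-context sizes. A quick check of the session total and the repo research row, using a helper name chosen for this example:

```python
def percent_saved(raw_kb: float, context_kb: float) -> float:
    """Share of raw output that never entered the context window, in percent."""
    return (1 - context_kb / raw_kb) * 100


session_saved = round(percent_saved(315, 5.4), 1)   # full session
research_saved = round(percent_saved(986, 62), 1)   # repo research row
print(session_saved, research_saved)
```

This gives 98.3 for the session and 93.7 for repo research, matching the rounded 98% and 94% quoted in this README.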
[Full benchmark data with 21 scenarios →](BENCHMARK.md)
## Try It
These prompts work out of the box. Run \`/context-mode:ctx-stats\` after each to see the savings.
**Deep repo research** — 5 calls, 62 KB context (raw: 986 KB, 94% saved)
\`\`\`
Research https://github.com/modelcontextprotocol/servers — architecture, tech stack,
top contributors, open issues, and recent activity. Then run /context-mode:ctx-stats.
\`\`\`
**Git history analysis** — 1 call, 5.6 KB context
\`\`\`
Clone https://github.com/facebook/react and analyze the last 500 commits:
top contributors, commit frequency by month, and most changed files.
Then run /context-mode:ctx-stats.
\`\`\`
**Web scraping** — 1 call, 3.2 KB context
\`\`\`
Fetch the Hacker News front page, extract all posts with titles, scores,
and domains. Group by domain. Then run /context-mode:ctx-stats.
\`\`\`
**Large JSON API** — 7.5 MB raw → 0.9 KB context (99% saved)
\`\`\`
Create a local server that returns a 7.5 MB JSON with 20,000 records and a secret
hidden at index 13000. Fetch the endpoint, find the hidden record, and show me
exactly what's in it. Then run /context-mode:ctx-stats.
\`\`\`
**Documentation search** — 2 calls, 1.8 KB context
\`\`\`
Fetch the React useEffect docs, index them, and find the cleanup pattern
with code examples. Then run /context-mode:ctx-stats.
\`\`\`
**Session continuity** — compaction recovery with full state
\`\`\`
Start a multi-step task: "Create a REST API with Express — add routes, tests,
and error handling." After 20+ tool calls, type: ctx stats to see the session
event count. When context compacts, the model continues from your last prompt
with tasks, files, and decisions intact — no re-prompting needed.
\`\`\`
## Security
Context Mode enforces the same permission rules you already use — but extends them to the MCP sandbox. If you block \`sudo\`, it's also blocked inside \`ctx_execute\`, \`ctx_execute_file\`, and \`ctx_batch_execute\`.
**Zero setup required.** If you haven't configured any permissions, nothing changes. This only activates when you add rules.
\`\`\`json
{
  "permissions": {
    "deny": [
      "Bash(sudo *)",
      "Bash(rm -rf /*)",
      "Read(.env)",
      "Read(**/.env*)"
    ],
    "allow": [
      "Bash(git:*)",
      "Bash(npm:*)"
    ]
  }
}
\`\`\`
Add this to your project's \`.claude/settings.json\` (or \`~/.claude/settings.json\` for global rules). All platforms read security policies from Claude Code's settings format — even on Gemini CLI, VS Code Copilot, and OpenCode. Codex CLI has no hook support, so security enforcement is not available.
The pattern is \`Tool(what to match)\` where \`*\` means "anything".
Commands chained with \`&&\`, \`;\`, or \`|\` are split — each part is checked separately. \`echo hello && sudo rm -rf /tmp\` is blocked because the \`sudo\` part matches the deny rule.
**deny** always wins over **allow**. More specific (project-level) rules override global ones.
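The splitting and precedence rules can be sketched as below. This is a simplified model, not the project's matcher: patterns are treated as plain globs on the command text, the \`git:*\` colon form is not modeled, quoting is ignored when splitting, and the "ask" fallback for unmatched commands is an assumption.

```python
import re
from fnmatch import fnmatchcase


def split_chain(command: str) -> list[str]:
    """Split a shell command on &&, ; and | (naive: ignores quoting)."""
    return [part.strip() for part in re.split(r"&&|;|\|", command) if part.strip()]


def check_bash(command: str, deny: list[str], allow: list[str]) -> str:
    """Return 'deny', 'allow', or 'ask' for a possibly chained command."""
    parts = split_chain(command)
    # Deny is checked first and wins as soon as any part matches.
    for part in parts:
        if any(fnmatchcase(part, pattern) for pattern in deny):
            return "deny"
    if parts and all(any(fnmatchcase(part, p) for p in allow) for part in parts):
        return "allow"
    return "ask"


deny_rules = ["sudo *", "rm -rf /*"]
allow_rules = ["git *", "npm *"]
result = check_bash("echo hello && sudo rm -rf /tmp", deny_rules, allow_rules)
```

Here `result` is "deny" because the second part of the chain matches `sudo *`, even though the first part is harmless.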
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for the development workflow and TDD guidelines.
\`\`\`bash
git clone https://github.com/mksglu/context-mode.git
cd context-mode && npm install && npm test
\`\`\`
## License
[Elastic License 2.0 (ELv2)](LICENSE) — free to use, modify, and share. You may not rebrand and redistribute this software as a competing plugin, product, or managed service.