GitHub Copilot CLI
Copilot CLI integrates via an MCP (Model Context Protocol) server that
Isartor registers as a stdio subprocess. Isartor also exposes the same MCP
tools over Streamable HTTP at http://localhost:8080/mcp/ for editors and
web agents that prefer HTTP/SSE transport. Both transports expose two tools:
isartor_chat— cache lookup only. Returns the cached answer on hit (L1a exact or L1b semantic), or an empty string on miss. On a miss, Copilot uses its own LLM to answer — Isartor never routes through its configured L3 provider for Copilot traffic.isartor_cache_store— stores a prompt/response pair in Isartor's cache so future identical or similar prompts are deflected locally.
This design means Copilot still owns the conversation loop, while Isartor acts as a transparent cache layer that reduces redundant cloud calls. On a cache hit, Isartor returns the cached text and does not call its own Layer 3 provider. Copilot CLI may still emit its normal final-answer event after the tool result, but that is a Copilot-side render step rather than an Isartor L3 forward.
Prerequisites
- Isartor installed (
curl -fsSL https://raw.githubusercontent.com/isartor-ai/Isartor/main/install.sh | sh) - GitHub Copilot CLI installed
Step-by-step setup
# 1. Start Isartor
isartor up --detach
# 2. Register the MCP server with Copilot CLI
isartor connect copilot
# 3. Start Copilot normally — plain chat prompts will use Isartor cache first
copilot
How it works
isartor connect copilotadds anisartorentry to~/.copilot/mcp-config.jsonisartor connect copilotalso installs a managed instruction block in~/.copilot/copilot-instructions.md- When Copilot CLI starts, it launches
isartor mcpas a stdio subprocess and loads the Isartor instruction block - The MCP server exposes
isartor_chat(cache lookup) andisartor_cache_store(cache write) - For plain conversational prompts, Copilot now prefers this flow:
- Call
isartor_chatwith the user's prompt - Cache hit: return the cached answer immediately, verbatim
- Cache miss: answer with Copilot's own model, then call
isartor_cache_store
- Call
- When Copilot calls
isartor_chat:- Cache hit (L1a exact or L1b semantic): returns the cached answer instantly
- Cache miss: returns empty → Copilot uses its own LLM
- After Copilot gets an answer from its LLM, it can call
isartor_cache_storeto populate the cache for future requests
HTTP/SSE MCP endpoint
Isartor now exposes the same MCP tool surface at /mcp/ using Streamable HTTP:
POST /mcp/— client → server JSON-RPCGET /mcp/— server → client SSE streamDELETE /mcp/— explicit session teardown
The HTTP transport uses the MCP Mcp-Session-Id header after initialize, and
supports both JSON responses and SSE responses for POST requests. A minimal
editor config looks like:
{"servers":{"isartor":{"type":"http","url":"http://localhost:8080/mcp/"}}}
Important note about "still going to L3"
If you inspect Copilot CLI JSON traces, you may still see a normal
final_answer event after isartor_chat returns a cache hit. That does not
mean Isartor forwarded the prompt to its own Layer 3 provider. The important
signal is Isartor's own log and headers:
Cache lookup: L1a exact hitorCache lookup: L1b semantic hit- no new
Layer 3: Forwarding to LLM via Rigentry for that prompt
In other words:
- Isartor L3 call = bad for a cache hit
- Copilot final-answer render after a tool hit = expected CLI behavior
Isartor now installs stricter Copilot instructions that tell Copilot to emit the cached tool result verbatim on cache hits, without paraphrasing or extra tool calls.
Cache endpoints (used by MCP internally)
The MCP server calls these HTTP endpoints on the Isartor gateway:
# Cache lookup — returns cached response or 204 No Content
curl -X POST http://localhost:8080/api/v1/cache/lookup \
-H "Content-Type: application/json" \
-d '{"prompt": "capital of France"}'
# Cache store — saves a prompt/response pair
curl -X POST http://localhost:8080/api/v1/cache/store \
-H "Content-Type: application/json" \
-d '{"prompt": "capital of France", "response": "The capital of France is Paris."}'
Custom gateway URL
# If Isartor runs on a non-default port
isartor connect copilot --gateway-url http://localhost:18080
Disconnecting
isartor connect copilot --disconnect
This removes the isartor entry from ~/.copilot/mcp-config.json.
It also removes the managed Isartor block from ~/.copilot/copilot-instructions.md.
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
Copilot has no isartor_chat tool | MCP server not registered | Run isartor connect copilot |
| Copilot works but bypasses cache | Isartor instructions not installed or custom instructions disabled | Run isartor connect copilot again and do not launch Copilot with --no-custom-instructions |
| Cache never hits for Copilot | Responses not stored after LLM answers | Ask Copilot to call isartor_cache_store after answering |