What is ra?
ra is a small, hackable agent. Nothing hidden behind abstractions you can't reach.
It doesn't ship with a system prompt. Every part of the loop is exposed via config and can be extended by writing scripts or plain TypeScript. Middleware hooks let you intercept every step — model calls, tool execution, streaming, all of it.
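As a rough illustration of what intercepting a loop step can look like, here is a minimal hook chain in TypeScript. The names (`LoopEvent`, `Middleware`, `runHooks`) are hypothetical and are not ra's actual middleware API. The sketch only shows the general pattern of hooks that can observe, modify, or stop a step.

```typescript
// Hypothetical types for illustration only; not ra's real middleware API.
type LoopEvent = {
  stage: "model_call" | "tool_execution" | "stream_chunk";
  payload: Record<string, unknown>;
};

// A hook returns a (possibly modified) event, or null to stop the loop.
type Middleware = (event: LoopEvent) => LoopEvent | null;

// Run hooks in order; the first hook that returns null halts the chain.
function runHooks(hooks: Middleware[], event: LoopEvent): LoopEvent | null {
  let current = event;
  for (const hook of hooks) {
    const next = hook(current);
    if (next === null) return null;
    current = next;
  }
  return current;
}

// Example hook: refuse shell commands that contain "rm -rf".
const blockDangerousShell: Middleware = (event) => {
  const command = String(event.payload.command ?? "");
  if (event.stage === "tool_execution" && command.includes("rm -rf")) {
    return null; // stop before the tool runs
  }
  return event;
};

// Example hook: tag every event with a timestamp for logging.
const addTimestamp: Middleware = (event) => ({
  ...event,
  payload: { ...event.payload, seenAt: Date.now() },
});
```

Chaining `[blockDangerousShell, addTimestamp]` passes a harmless command through with a `seenAt` field attached, and returns `null` for a command containing `rm -rf`.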
It talks to Anthropic, OpenAI, Google, Ollama, Bedrock, and Azure. Switch providers with a flag or a config change.
It comes with built-in tools for filesystem, shell, network, and user interaction. Connect to MCP servers for additional tools. Persistent sessions via JSONL. An FTS5 memory backed by SQLite.
It speaks MCP both ways — use external MCP servers, or expose ra itself as an MCP server so you can use it from Cursor, Claude Desktop, or anything else that speaks the protocol.
It gives you real control over context. Deterministic discovery for common formats (CLAUDE.md, AGENTS.md, README.md), pattern resolution, prompt caching, compaction, token tracking. A skill system that can pull skills from GitHub repos or npm packages.
Extended thinking for models that support it — watch the model reason in real time.
It runs as a CLI, REPL, HTTP server, or MCP server. No runtime dependencies.
Structured logs and traces per session, so you can actually see what your agent is doing.
All of this is configurable via a layered config system: config files (JSON, YAML, TOML), env vars, and CLI flags. Each layer overrides the one before it.
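A minimal sketch of how layered resolution can work, assuming the precedence this page lists for Configuration (defaults → file → env → CLI). The field names mirror the `--provider` and `--model` flags, but the env var names (`RA_PROVIDER`, `RA_MODEL`) and the default values are assumptions made for the example, not ra's documented settings.

```typescript
// Illustrative only: env var names and defaults below are assumptions.
type Config = { provider: string; model: string };
type Layer = { provider?: string | undefined; model?: string | undefined };

const defaults: Config = { provider: "anthropic", model: "example-model" };

// Keep only the keys a layer actually sets, so an unset value
// never overwrites something from a lower layer.
function defined(layer: Layer): Partial<Config> {
  const out: Partial<Config> = {};
  if (layer.provider !== undefined) out.provider = layer.provider;
  if (layer.model !== undefined) out.model = layer.model;
  return out;
}

// Later layers win: defaults -> file -> env -> CLI.
function resolveConfig(
  file: Layer,
  env: Record<string, string | undefined>,
  cli: Layer,
): Config {
  const fromEnv: Layer = { provider: env.RA_PROVIDER, model: env.RA_MODEL };
  return {
    ...defaults,
    ...defined(file),
    ...defined(fromEnv),
    ...defined(cli),
  };
}
```

With a file that sets both fields, an env layer that sets only the model, and a CLI layer that sets only the provider, the result takes the provider from the CLI and the model from the environment.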
ra "What is the capital of France?"
ra --provider openai --model gpt-4.1 "Explain this error"
ra --skill code-review --file diff.patch "Review this diff"
cat server.log | ra "Find the root cause of these errors"
ra # interactive REPL
The config is the agent
Drop a ra.config.yml in a repo and that directory becomes a project-specific assistant. Set env vars for a different persona. Pass --skill to inject a role at runtime. Run --mcp-stdio to expose it as a tool for Cursor or Claude Desktop. Same binary, different agent — every time.
What's in the box
| Feature | Description |
|---|---|
| The Agent Loop | Model → tools → repeat, with streaming, middleware hooks at every step, and configurable iteration limits |
| Context Control | Smart compaction, token tracking, prompt caching, extended thinking, context discovery, pattern resolution |
| CLI | One-shot prompts, piping, chaining, scriptable |
| REPL | Interactive sessions with history, slash commands, file attachments |
| HTTP API | Sync and streaming chat, session management |
| MCP | Client (pull tools from MCP servers) and server (expose ra as a tool) |
| Built-in Tools | Filesystem, shell, network, and user interaction tools |
| Skills | Reusable instruction bundles — roles, behaviors, scripts, and reference docs |
| Middleware | Hooks at every loop stage — intercept, modify, or stop the loop |
| Sessions | Persist conversations as JSONL, resume from any interface, auto-prune |
| File Attachments | Images, PDFs, text files — auto-detected and sent in the right format |
| Memory | Persistent SQLite memory with FTS — save, search, forget across conversations |
| Configuration | Layered: defaults → file → env → CLI. The config is the agent |
Use cases
CI caught a flaky test
ra --skill debugger --file test-output.log "Why is this test failing?"
Reads the logs, explains the root cause, and exits. Pipe the output to Slack or a PR comment.
You're building a feature
ra
› /attach src/auth.ts
› How should I add rate limiting to this endpoint?
Attach files, ask follow-ups, keep context. Resume the session tomorrow with /resume.
Your product needs AI
ra --http --http-port 3000
POST a message, get SSE chunks back. No framework, just Bun.serve() under the hood.
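On the client side, consuming the stream means splitting the response body into SSE events. The sketch below is a simplified parser for standard `text/event-stream` framing (events separated by a blank line, payload on `data:` lines). It assumes nothing about ra's endpoint paths or payload shape, which are not specified here, and `parseSseChunk` is a hypothetical helper name.

```typescript
// Parse buffered Server-Sent Events text into the data payload of each
// complete event. Also returns any trailing partial event, which the caller
// should prepend to the next network chunk before parsing again.
// Simplification: handles \n and \r\n line endings, not lone \r.
function parseSseChunk(buffer: string): { events: string[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop() ?? ""; // the last block may be incomplete
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")); // drop "data:" and one space
    // Blocks without data lines (comments, keep-alives) are skipped.
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return { events, rest };
}
```

Feed it decoded chunks from a `fetch` response body reader, carrying `rest` forward between reads so that events split across chunks are reassembled.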
Your editor needs a specialist
ra --mcp-stdio --skill code-review
Now Cursor or Claude Desktop has a dedicated code reviewer that uses your project's style guide, your skills, your system prompt.