RLM Agent (Recursive Language Model)
The RLM agent enables complex analytical tasks by writing and executing Python code in a sandboxed environment. It solves problems that traditional RAG struggles with:
- Aggregation: "How many documents mention security vulnerabilities?"
- Computation: "What's the average revenue across all quarterly reports?"
- Multi-document analysis: "Compare the key findings between Report A and Report B"
- Structured data extraction: "Extract all dollar amounts and compute totals"
How It Works
- The agent receives a question
- It writes Python code to explore the knowledge base
- Code executes in a sandboxed Python interpreter with access to knowledge base functions
- The agent iterates: run code, examine results, refine approach
- Final answer is synthesized from the gathered data
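The iteration described above can be sketched as a small loop. This is an illustration only, not haiku.rag's implementation: `run_loop`, `propose_code`, `run_in_sandbox`, `Session`, and `Step` are hypothetical names, and the two `fake_*` functions are toy stand-ins for the model and the sandbox so the sketch runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    code: str    # code the model wrote for this iteration
    output: str  # what the sandbox printed when running it


@dataclass
class Session:
    question: str
    steps: list[Step] = field(default_factory=list)


def run_loop(question, propose_code, run_in_sandbox, max_iterations=5):
    """Iterate: ask the model for code, run it, feed the output back."""
    session = Session(question)
    for _ in range(max_iterations):
        # The model sees the question plus every earlier (code, output) pair.
        code = propose_code(session)
        if code is None:  # hypothetical signal: the model is ready to answer
            break
        output = run_in_sandbox(code)
        session.steps.append(Step(code, output))
    return session


# Toy stand-ins (assumptions) for the model and the sandbox.
def fake_propose(session):
    scripted = ["print(len(list_documents()))", None]
    return scripted[len(session.steps)]


def fake_sandbox(code):
    return "3"


session = run_loop("How many documents are in the database?", fake_propose, fake_sandbox)
print(session.steps[0].output)
```

Capping the loop with `max_iterations` is a design choice of this sketch; it keeps a model that never signals completion from running forever.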
CLI Usage
# Basic usage
haiku-rag rlm "How many documents are in the database?"
# With document filter (restricts what the agent can access)
haiku-rag rlm "Summarize the key points" --filter "uri LIKE '%report%'"
# Pre-load specific documents
haiku-rag rlm "Compare these two reports" --document "Q1 Report" --document "Q2 Report"
Python Usage
from haiku.rag.client import HaikuRAG
async with HaikuRAG(path_to_db) as client:
    # Basic question
    result = await client.rlm("How many documents mention 'security'?")
    print(result.answer)   # The answer
    print(result.program)  # The final consolidated program
    # With filter (agent can only see filtered documents)
    result = await client.rlm(
        "What is the total revenue?",
        filter="title LIKE '%Financial%'"
    )
    # Pre-load specific documents
    result = await client.rlm(
        "Compare the conclusions",
        documents=["Report A", "Report B"]
    )
Sandbox Capabilities
The agent's code runs in a sandboxed Python interpreter (pydantic-monty) with access to these knowledge base functions:
| Function | Description |
|---|---|
| search(query, limit) | Hybrid search (vector + full-text) returning matching chunks with scores |
| list_documents(limit, offset) | List documents in the knowledge base |
| get_document(id_or_title) | Get full text content of a document |
| get_chunk(chunk_id) | Get a chunk with metadata (headings, page numbers, labels) for citations |
| get_docling_document(document_id) | Get the full DoclingDocument structure as a dict (texts, tables, pictures, pages) |
| llm(prompt) | Call an LLM for classification, summarization, or extraction |
| regex_findall(pattern, text), regex_sub(pattern, repl, text), regex_search(pattern, text), regex_split(pattern, text) | Regular expression matching via Python's re module |
When documents are pre-loaded via the documents parameter, they are injected as a documents variable accessible in the sandbox code.
Python Features
The interpreter supports a subset of Python: variables, arithmetic, strings, f-strings, lists, dicts, tuples, sets, loops, conditionals, comprehensions, functions, async/await, try/except, and the json module.
Not supported: imports (other than json), class definitions, generators/yield, match statements, decorators, with statements. For pattern matching, the agent can use the regex_* functions, string methods, or the llm() function.
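A short sketch of staying inside that subset: plain dicts and functions stand in for a class, and a list comprehension stands in for a generator. The findings data is made up for the example, and the snippet has been written against the feature list above rather than verified in the sandbox interpreter itself.

```python
import json


# A record that might otherwise be a class instance is a plain dict.
def make_finding(title, severity):
    return {"title": title, "severity": severity}


def count_by_severity(findings):
    counts = {}
    for f in findings:
        counts[f["severity"]] = counts.get(f["severity"], 0) + 1
    return counts


# Made-up sample data for illustration.
findings = [
    make_finding("SQL injection in login form", "high"),
    make_finding("Verbose error pages", "low"),
    make_finding("Outdated TLS configuration", "high"),
]

# A comprehension builds the full list where a generator might otherwise be used.
high_titles = [f["title"] for f in findings if f["severity"] == "high"]

try:
    summary = json.dumps({"counts": count_by_severity(findings), "high": high_titles})
except TypeError as exc:
    summary = f"could not serialize: {exc}"

print(summary)
```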
Security
Code executes in an isolated interpreter with:
- No filesystem access: Code cannot read or write files
- No network access: Code cannot make HTTP requests or open sockets
- No imports: Only the json module is available
- Execution timeout: Configurable limit (default 60s)
- Output truncation: Large outputs are truncated to prevent memory issues
Context Filter
The filter parameter restricts what documents the agent can access. Unlike tool parameters, the filter is applied automatically and cannot be bypassed by the LLM:
# Agent can only see documents with "confidential" in the URI
result = await client.rlm(
    "Summarize all findings",
    filter="uri LIKE '%confidential%'"
)
This is useful for scoping to specific document sets, enforcing access control, or limiting context for focused analysis.
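One way to picture a filter that code cannot bypass is to bind it into the search function before that function is handed to the sandbox. The sketch below is a conceptual illustration, not haiku.rag's implementation: `make_scoped_search` is a hypothetical helper, the filter is a Python predicate rather than a SQL-style string, and the documents and URIs are made up.

```python
def make_scoped_search(documents, allowed):
    """Return a search function that only ever sees documents passing `allowed`."""
    visible = [d for d in documents if allowed(d)]

    def search(query, limit=5):
        # The caller controls query and limit, but is given no parameter
        # that could widen the set of visible documents.
        hits = [d for d in visible if query.lower() in d["text"].lower()]
        return hits[:limit]

    return search


# Made-up documents for illustration.
documents = [
    {"uri": "s3://docs/confidential/audit.md", "text": "Audit findings: two issues."},
    {"uri": "s3://docs/public/faq.md", "text": "Public findings and FAQ."},
]

search = make_scoped_search(documents, lambda d: "confidential" in d["uri"])
results = search("findings")
print([d["uri"] for d in results])
```

Because the scope is fixed when `search` is created, nothing the generated code passes as an argument can reach the excluded documents.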
Configuration
RLM settings can be configured in haiku.rag.yaml: