# Agents
haiku.rag provides three agentic flows:
- Simple QA Agent — a focused question-answering agent
- Research Graph — a multi-step research workflow with question decomposition
- RLM Agent — complex analytical tasks via sandboxed Python code execution (see RLM Agent)
For multi-turn conversational RAG, haiku.rag provides skills built on haiku.skills. The skills bundle search, Q&A, analysis, and research tools with session state management.
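As a purely illustrative sketch of what a multi-turn session could look like — the names `Session` and `ask` below are hypothetical placeholders, not the actual haiku.skills API; see the skills documentation for real usage:

```python
# Hypothetical illustration only: Session and its ask() method are
# placeholder names, not the real haiku.skills API.
from haiku.rag.client import HaikuRAG

async with HaikuRAG(path_to_db) as client:
    session = Session(client=client)              # would hold conversation state
    print(await session.ask("What is climate change?"))
    print(await session.ask("What causes it?"))   # follow-up resolved from session state
```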
See QA and Research Configuration for configuring the model, iteration limits, concurrency, and other settings.
## Simple QA Agent
The simple QA agent answers a single question using the knowledge base. It retrieves relevant chunks, optionally expands context around them, and asks the model to answer strictly based on that context.
Key points:
- Uses a single `search_documents` tool to fetch relevant chunks
- Can be run with or without inline citations in the prompt
- Returns a plain string answer
CLI usage:
```bash
haiku-rag ask "What is climate change?"

# With citations
haiku-rag ask "What is climate change?" --cite
```
Python usage:
```python
from haiku.rag.client import HaikuRAG
from haiku.rag.config.models import ModelConfig
from haiku.rag.agents.qa.agent import QuestionAnswerAgent

async with HaikuRAG(path_to_db) as client:
    agent = QuestionAnswerAgent(
        client=client,
        model_config=ModelConfig(provider="openai", name="gpt-4o-mini"),
    )
    answer, citations = await agent.answer("What is climate change?")
    print(answer)
```
## Research Graph
The research workflow is implemented as a typed pydantic-graph. It uses an iterative feedback loop where the planner proposes one question at a time, sees the answer, then decides whether to continue or synthesize.
```mermaid
---
title: Research graph
---
stateDiagram-v2
    state plan_next_decision <<choice>>
    [*] --> plan_next
    plan_next --> plan_next_decision
    plan_next_decision --> search_one: Has next question
    plan_next_decision --> synthesize: Complete or max iterations
    search_one --> plan_next: Answer added to context
    synthesize --> [*]

    note right of plan_next
        Uses prior_answers from previous iterations.
        Uses a different prompt when prior answers exist.
    end note
```
The graph receives a `ResearchContext` containing:
- `original_question` — the user's question
- `qa_responses` — prior answers from previous iterations (injected as `<prior_answers>` XML)
When prior answers are provided, the planner uses a context-aware prompt that evaluates whether existing evidence is sufficient. If it is, the planner marks `is_complete=True` and the graph skips directly to synthesis without any searches.
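For example, a run can be seeded with evidence from earlier iterations so the planner can decide up front whether more searching is needed. A minimal sketch, assuming `qa_responses` accepts plain strings (the real element type may be a structured QA-response model):

```python
from haiku.rag.agents.research.dependencies import ResearchContext

# Seed the planner with prior evidence. Assumption: elements of
# qa_responses are plain strings; the actual model may use a structured
# QA-response type instead.
context = ResearchContext(
    original_question="What are the main features?",
    qa_responses=[
        "Q: How are documents stored? A: Documents are split into chunks ...",
    ],
)
```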
Key nodes:
- `plan_next`: Evaluates gathered evidence and either proposes the next question to investigate or marks research as complete. Uses a context-aware prompt when prior answers exist, allowing it to skip research entirely.
- `search_one`: Answers a single question using the knowledge base (up to 3 search calls per question). Each answer is added to `ResearchContext.qa_responses` for the next planning iteration.
- `synthesize`: Generates the final output from all gathered evidence.
Iterative flow (see the schematic sketch below):
- Each iteration, the planner evaluates the context → proposes one question → search answers it → loop back
- The planner can decompose complex questions (e.g., "benefits and drawbacks" → start with "benefits")
- Prior answers let the planner skip redundant searches
- The loop terminates when the planner marks `is_complete=True` or `max_iterations` is reached
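The control flow, sketched as plain Python for illustration. Here `plan_next`, `search_one`, and `synthesize` stand in for the graph nodes (they are not callables with these signatures), and `next_question` is an assumed attribute name:

```python
# Schematic of the loop, not the actual pydantic-graph implementation.
iterations = 0
while iterations < max_iterations:
    plan = plan_next(context)                 # evaluate gathered evidence
    if plan.is_complete:                      # evidence is sufficient
        break
    answer = search_one(plan.next_question)   # up to 3 search calls per question
    context.qa_responses.append(answer)       # visible to the next plan_next call
    iterations += 1

report = synthesize(context)                  # final output from all evidence
```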
### CLI Usage

```bash
# Basic usage
haiku-rag research "How does haiku.rag organize and query documents?"

# With document filter
haiku-rag research "What are the key findings?" --filter "uri LIKE '%report%'"
```
### Python Usage

Basic example:

```python
from haiku.rag.client import HaikuRAG
from haiku.rag.config import Config
from haiku.rag.agents.research.dependencies import ResearchContext
from haiku.rag.agents.research.graph import build_research_graph
from haiku.rag.agents.research.state import ResearchDeps, ResearchState

async with HaikuRAG(path_to_db) as client:
    graph = build_research_graph(config=Config)
    context = ResearchContext(original_question="What are the main features?")
    state = ResearchState.from_config(context=context, config=Config)
    deps = ResearchDeps(client=client)

    report = await graph.run(state=state, deps=deps)
    print(report.title)
    print(report.executive_summary)
```
With custom config:
```python
from haiku.rag.client import HaikuRAG
from haiku.rag.config.models import AppConfig, ModelConfig, ResearchConfig
from haiku.rag.agents.research.dependencies import ResearchContext
from haiku.rag.agents.research.graph import build_research_graph
from haiku.rag.agents.research.state import ResearchDeps, ResearchState

custom_config = AppConfig(
    research=ResearchConfig(
        model=ModelConfig(provider="openai", name="gpt-4o-mini"),
        max_iterations=5,
        max_concurrency=3,
    )
)

async with HaikuRAG(path_to_db) as client:
    graph = build_research_graph(config=custom_config)
    context = ResearchContext(original_question="What are the main features?")
    state = ResearchState.from_config(context=context, config=custom_config)
    deps = ResearchDeps(client=client)

    report = await graph.run(state=state, deps=deps)
```
### Filtering Documents

Restrict searches to specific documents via the `search_filter` parameter:
```python
# Set the filter before running the graph
state = ResearchState.from_config(context=context, config=Config)
state.search_filter = "id IN ('doc-123', 'doc-456')"

report = await graph.run(state=state, deps=deps)
```
The filter applies to all search operations in the graph. See Filtering Search Results for available filter columns and syntax.