Command Line Interface
The haiku-rag CLI provides complete document management functionality.
Note
Global options (must be specified before the command):
- --config - Specify custom configuration file
- --version / -v - Show version and exit
Per-command options:
- --db - Specify custom database path
- -h - Show help for specific command
Example:
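The paths below are placeholders; substitute your own configuration file and database locations.

```shell
# Global option: comes before the subcommand
haiku-rag --config /path/to/config list

# Per-command option: comes after the subcommand
haiku-rag list --db /path/to/database
```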
Document Management
List Documents
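List all documents in the database:

```shell
haiku-rag list
```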
Filter documents by properties:
# Filter by URI pattern
haiku-rag list --filter "uri LIKE '%arxiv%'"
# Filter by exact title
haiku-rag list --filter "title = 'My Document'"
# Combine multiple conditions
haiku-rag list --filter "uri LIKE '%.pdf' AND title LIKE '%paper%'"
Add Documents
From text:
haiku-rag add "Your document content here"
# Attach metadata (repeat --meta for multiple entries)
haiku-rag add "Your document content here" --meta author=alice --meta topic=notes
From file or URL:
haiku-rag add-src /path/to/document.pdf
haiku-rag add-src https://example.com/article.html
# Optionally set a human-readable title stored in the DB schema
haiku-rag add-src /mnt/data/doc1.pdf --title "Q3 Financial Report"
# Optionally attach metadata (repeat --meta). Values use JSON parsing if possible:
# numbers, booleans, null, arrays/objects; otherwise kept as strings.
haiku-rag add-src /mnt/data/doc1.pdf --meta source=manual --meta page_count=12 --meta published=true
From directory (recursively adds all supported files):
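```shell
haiku-rag add-src /path/to/documents/
```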
Note
When adding a directory, the same content filters configured for file monitoring are applied. This means ignore_patterns and include_patterns from your configuration will be used to filter which files are added.
Note
As you add documents to haiku.rag, the database keeps growing. By default, LanceDB versions
your data, and create/update operations are effectively atomic: if anything fails during chunking or embedding,
the database rolls back to the pre-operation snapshot using LanceDB table versioning. You can optimize and
compact the database by running the vacuum command.
Get Document
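Retrieve a single document. This sketch assumes a get subcommand that takes the numeric document ID shown by haiku-rag list; check the -h output for the exact form.

```shell
haiku-rag get 1
```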
Delete Document
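Remove a document by ID. This sketch assumes a delete subcommand mirroring get; check the -h output for the exact form.

```shell
haiku-rag delete 1
```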
Search
Basic search:
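```shell
haiku-rag search "machine learning"
```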
With options:
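A sketch assuming a --limit option controls the number of results; run haiku-rag search -h for the actual option names.

```shell
haiku-rag search "machine learning" --limit 10
```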
With filters (filter by document properties):
# Filter by URI pattern
haiku-rag search "neural networks" --filter "uri LIKE '%arxiv%'"
# Filter by exact title
haiku-rag search "transformers" --filter "title = 'Deep Learning Guide'"
# Combine multiple conditions
haiku-rag search "AI" --filter "uri LIKE '%.pdf' AND title LIKE '%paper%'"
Question Answering
Ask questions about your documents:
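For example (the ask subcommand name is assumed here):

```shell
haiku-rag ask "What are the key findings in my documents?"
```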
Ask questions with citations showing source documents:
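Using the --cite flag (the ask subcommand name is assumed):

```shell
haiku-rag ask "What are the key findings in my documents?" --cite
```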
Use deep QA for complex questions (multi-agent decomposition):
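Using the --deep flag (the ask subcommand name is assumed):

```shell
haiku-rag ask "Compare the approaches described across my documents" --deep
```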
Show verbose output with deep QA:
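Combining --deep with --verbose (the ask subcommand name is assumed):

```shell
haiku-rag ask "Compare the approaches described across my documents" --deep --verbose
```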
The QA agent will search your documents for relevant information and provide a comprehensive answer. With --cite, responses include citations showing which documents were used. With --deep, the question is decomposed into sub-questions that are answered in parallel before synthesizing a final answer. With --verbose (only with --deep), you'll see the planning, searching, evaluation, and synthesis steps as they happen.
When available, citations use the document title; otherwise they fall back to the URI.
Research
Run the multi-step research graph:
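For example (the research subcommand name is assumed here):

```shell
haiku-rag research "What are the open problems in this field?"
```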
With verbose output to see progress:
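```shell
haiku-rag research "What are the open problems in this field?" --verbose
```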
Flags:
- --verbose: Show planning, searching previews, evaluation summary, and stop reason
Research parameters like max_iterations, confidence_threshold, and max_concurrency are configured in your configuration file under the research section.
When --verbose is set, the CLI consumes the research graph's AG-UI event stream, displaying step events and activity snapshots as agents progress through planning, search, evaluation, and synthesis. Without --verbose, only the final research report is displayed.
If you build your own integration, import stream_graph from haiku.rag.graph.agui to access AG-UI events (STEP_STARTED, ACTIVITY_SNAPSHOT, STATE_SNAPSHOT, RUN_FINISHED, etc.) and render them however you like while the graph is running.
Server
Start services (requires at least one flag):
# MCP server only (HTTP transport)
haiku-rag serve --mcp
# MCP server (stdio transport)
haiku-rag serve --mcp --stdio
# File monitoring only
haiku-rag serve --monitor
# AG-UI server only
haiku-rag serve --agui
# Multiple services
haiku-rag serve --monitor --mcp
haiku-rag serve --monitor --agui
haiku-rag serve --mcp --agui
# All services
haiku-rag serve --monitor --mcp --agui
# Custom MCP port
haiku-rag serve --mcp --mcp-port 9000
See Server Mode for details on available services.
Settings
View current configuration settings:
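For example (the settings subcommand name is assumed here):

```shell
haiku-rag settings
```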
Database Management
Initialize Database
Create a new database:
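A sketch assuming an init subcommand; check haiku-rag -h for the exact name.

```shell
haiku-rag init
```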
This creates the database with the configured settings. All other commands require an existing database - they will fail with an informative error if the database doesn't exist.
Info
Display database metadata:
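```shell
haiku-rag info
```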
Shows:
- path to the database
- stored haiku.rag version (from settings)
- embeddings provider/model and vector dimension
- number of documents and chunks (with storage sizes)
- vector index status (exists/not created, indexed/unindexed chunks)
- table versions per table (documents, chunks)
At the end, a separate "Versions" section lists runtime package versions:
- haiku.rag
- lancedb
- docling
Create Vector Index
Create a vector index on the chunks table for fast approximate nearest neighbor search:
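A sketch assuming a create-index subcommand; the exact name is not shown elsewhere in this page, so check haiku-rag -h.

```shell
haiku-rag create-index
```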
Requirements:
- Minimum 256 chunks required for index creation (LanceDB training data requirement)
- Creates an IVF_PQ index using the configured search.vector_index_metric (cosine/l2/dot)
When to use:
- After ingesting documents (indexes are not created automatically)
- After adding significant new data to rebuild the index
- Use haiku-rag info to check index status and see how many chunks are indexed/unindexed
Search behavior:
- Without index: brute-force kNN search (exact nearest neighbors, slower for large datasets)
- With index: fast ANN (approximate nearest neighbors) using IVF_PQ
- With stale index: LanceDB combines indexed results (fast ANN) with brute-force kNN on unindexed rows; performance degrades as more unindexed data accumulates
Vacuum (Optimize and Cleanup)
Reduce disk usage by optimizing and pruning old table versions across all tables:
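```shell
haiku-rag vacuum
```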
Automatic Cleanup: Vacuum runs automatically in the background after document operations. By default, it removes versions older than 1 day (configurable via storage.vacuum_retention_seconds), preserving recent versions for concurrent connections. Manual vacuum can be useful for cleanup after bulk operations or to free disk space immediately.
Rebuild Database
Rebuild the database by re-indexing documents. Useful when switching embeddings provider/model or changing chunking settings:
# Full rebuild (default) - re-converts from source files, re-chunks, re-embeds
haiku-rag rebuild
# Re-chunk from stored content (no source file access)
haiku-rag rebuild --rechunk
# Only regenerate embeddings (fastest, keeps existing chunks)
haiku-rag rebuild --embed-only
Rebuild modes:
| Mode | Flag | Use case |
|---|---|---|
| Full | (default) | Changed converter, source files updated |
| Rechunk | --rechunk | Changed chunking strategy or chunk size |
| Embed only | --embed-only | Changed embedding model or vector dimensions |
Download Models
Download required runtime models:
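A sketch assuming a download subcommand matching this section's title; check haiku-rag -h for the exact name.

```shell
haiku-rag download
```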
This command:
- Downloads Docling OCR/conversion models (no-op if already present).
- Pulls Ollama models referenced in your configuration (embeddings, QA, research, rerank).