Command Line Interface
The haiku-rag CLI provides complete document management functionality.
Note
Global options (must be specified before the command):
- --config - Specify custom configuration file
- --read-only - Open database in read-only mode (blocks writes, skips upgrades)
- --before - Query database as it existed before a datetime (implies --read-only)
- --version / -v - Show version and exit
Per-command options:
- --db - Specify custom database path
- -h - Show help for specific command
Example:
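# global options go before the command (config path is illustrative)
haiku-rag --config /path/to/haiku.rag.yaml list

# per-command options follow the command
haiku-rag list --db /path/to/documents.lancedb

# show help for a specific command
haiku-rag add -h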
Document Management
List Documents
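List all documents in the database:
haiku-rag list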
Filter documents by properties:
# Filter by URI pattern (--filter or -f)
haiku-rag list --filter "uri LIKE '%arxiv%'"
# Filter by exact title
haiku-rag list --filter "title = 'My Document'"
# Combine multiple conditions
haiku-rag list --filter "uri LIKE '%.pdf' AND title LIKE '%paper%'"
Add Documents
From text:
haiku-rag add "Your document content here"
# Attach metadata (repeat --meta for multiple entries)
haiku-rag add "Your document content here" --meta author=alice --meta topic=notes
From file or URL:
haiku-rag add-src /path/to/document.pdf
haiku-rag add-src https://example.com/article.html
# Optionally set a human‑readable title stored in the DB schema
haiku-rag add-src /mnt/data/doc1.pdf --title "Q3 Financial Report"
# Optionally attach metadata (repeat --meta). Values use JSON parsing if possible:
# numbers, booleans, null, arrays/objects; otherwise kept as strings.
haiku-rag add-src /mnt/data/doc1.pdf --meta source=manual --meta page_count=12 --meta published=true
From directory (recursively adds all supported files):
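# the directory path is illustrative
haiku-rag add-src /path/to/documents/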
Note
When adding a directory, the same content filters configured for file monitoring are applied. This means ignore_patterns and include_patterns from your configuration will be used to filter which files are added.
Note
As you add documents to haiku.rag the database keeps growing. By default, LanceDB supports versioning
of your data. Create/update operations behave atomically: if anything fails during chunking or embedding,
the database rolls back to the pre-operation snapshot using LanceDB table versioning. You can optimize and
compact the database by running the vacuum command.
Get Document
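Retrieve a document and display its content and metadata (a sketch; the get subcommand and the numeric document ID shown here are assumptions):
haiku-rag get 1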
Delete Document
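Remove a document from the database (a sketch; the delete subcommand and the document ID shown here are assumptions):
haiku-rag delete 1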
Visualize Chunk
Display visual grounding for a chunk - shows page images with highlighted bounding boxes:
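# a sketch; the visualize subcommand name and the chunk ID argument are assumptions
haiku-rag visualize 42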
This renders the source document pages with the chunk's location highlighted. Useful for verifying chunk boundaries and understanding document structure.
Note
Requires a terminal with image support (iTerm2, Kitty, WezTerm, etc.) and documents processed with docling that have page images stored.
Search
Basic search:
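haiku-rag search "machine learning"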
With options:
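# --limit is shown for illustration; run haiku-rag search -h to see the available options
haiku-rag search "machine learning" --limit 10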
With filters (filter by document properties, use --filter or -f):
# Filter by URI pattern
haiku-rag search "neural networks" --filter "uri LIKE '%arxiv%'"
# Filter by exact title
haiku-rag search "transformers" --filter "title = 'Deep Learning Guide'"
# Combine multiple conditions
haiku-rag search "AI" --filter "uri LIKE '%.pdf' AND title LIKE '%paper%'"
Question Answering
Ask questions about your documents:
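haiku-rag ask "What are the protocols?"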
Ask questions with citations showing source documents:
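haiku-rag ask "What are the protocols?" --cite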
Use deep QA for complex questions (multi-agent decomposition):
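haiku-rag ask "Summarize the findings" --deep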
Filter to specific documents:
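haiku-rag ask "What are the protocols?" --filter "uri LIKE '%arxiv%'"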
Provide background context for the question:
haiku-rag ask "What are the protocols?" --context "Focus on security best practices"
haiku-rag ask "Summarize the findings" --context-file background.txt
The QA agent searches your documents for relevant information and provides a comprehensive answer. When available, citations use the document title; otherwise they fall back to the URI.
Flags:
- --cite: Include citations showing which documents were used
- --deep: Decompose the question into sub-questions answered in parallel before synthesizing a final answer
- --filter / -f: Restrict searches to documents matching the filter (see Filtering Search Results)
- --context: Background context for the question (passed to the agent as system context)
- --context-file: Path to a file containing background context
Chat
Launch an interactive chat session for multi-turn conversations:
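haiku-rag chat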
Provide background context for the conversation:
haiku-rag chat --context "Focus on Python programming concepts"
haiku-rag chat --context-file domain-context.txt
Note
Requires the tui extra: pip install haiku.rag-slim[tui] (included in full haiku.rag package)
The chat interface provides:
- Streaming responses with real-time tool execution
- Expandable citations with source metadata
- Session memory for context-aware follow-up questions
- Visual grounding to inspect chunk source locations
- Background context that persists across the entire conversation
Flags:
- --context: Background context for the conversation
- --context-file: Path to a file containing background context
See Applications for keyboard shortcuts and features.
Inspect
Launch the interactive inspector TUI for browsing documents and chunks:
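# the inspect subcommand name is assumed from the section title
haiku-rag inspect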
Note
Requires the tui extra: pip install haiku.rag-slim[tui] (included in full haiku.rag package)
The inspector provides:
- Browse all documents in the database
- View document metadata and content
- Explore individual chunks
- Search and filter results
See Applications for details.
Research
Run the multi-step research graph:
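haiku-rag research "What are the safety protocols?"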
Filter to specific documents:
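haiku-rag research "What are the safety protocols?" --filter "uri LIKE '%arxiv%'"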
Provide background context for the research:
haiku-rag research "What are the safety protocols?" --context "Industrial manufacturing context"
haiku-rag research "Analyze the methodology" --context-file research-background.txt
Flags:
- --filter / -f: SQL WHERE clause to filter documents (see Filtering Search Results)
- --context: Background context for the research
- --context-file: Path to a file containing background context
Research parameters like max_iterations, confidence_threshold, and max_concurrency are configured in your configuration file under the research section.
Server
Start services (requires at least one flag):
# MCP server only (HTTP transport)
haiku-rag serve --mcp
# MCP server (stdio transport)
haiku-rag serve --mcp --stdio
# File monitoring only
haiku-rag serve --monitor
# Both services
haiku-rag serve --monitor --mcp
# Custom MCP port
haiku-rag serve --mcp --mcp-port 9000
# Read-only mode (excludes write MCP tools, disables monitor)
haiku-rag --read-only serve --mcp
See Server Mode for details on available services.
Settings
View current configuration settings:
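# the settings subcommand name is assumed from the section title
haiku-rag settings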
Generate Configuration File
Generate a YAML configuration file with defaults:
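# a sketch; the init-config subcommand name and the optional path argument are assumptions
haiku-rag init-config haiku.rag.yaml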
If no path is specified, creates haiku.rag.yaml in the current directory.
Database Management
Initialize Database
Create a new database:
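# the init subcommand name is assumed; the database path comes from --db or your configuration
haiku-rag init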
This creates the database with the configured settings. All other commands require an existing database - they will fail with an informative error if the database doesn't exist.
Migrate Database
Apply pending database migrations:
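haiku-rag migrate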
When you upgrade haiku.rag to a new version that includes schema changes, the database requires migration. Opening a database with pending migrations will display an error:
Error: Database requires migration from 0.19.0 to 0.26.5. 3 migration(s) pending. Run 'haiku-rag migrate' to upgrade.
Run haiku-rag migrate to apply the pending migrations. The command shows which migrations were applied:
Applied 3 migration(s):
- 0.20.0: Add 'docling_document_json' and 'docling_version' columns
- 0.23.1: Add content_fts column for contextualized FTS search
- 0.25.0: Compress docling_document with gzip
Migration completed successfully.
Tip
Back up your database before running migrations. While migrations are designed to be safe, having a backup provides peace of mind for production databases.
Info
Display database metadata:
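haiku-rag info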
Shows:
- path to the database
- stored haiku.rag version (from settings)
- embeddings provider/model and vector dimension
- number of documents and chunks (with storage sizes)
- vector index status (exists/not created, indexed/unindexed chunks)
- table versions per table (documents, chunks)
At the end, a separate "Versions" section lists runtime package versions:
- haiku.rag
- lancedb
- docling
Create Vector Index
Create a vector index on the chunks table for fast approximate nearest neighbor search:
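# a sketch; the create-index subcommand name is an assumption
haiku-rag create-index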
Requirements:
- Minimum 256 chunks required for index creation (LanceDB training data requirement)
- Creates an IVF_PQ index using the configured search.vector_index_metric (cosine/l2/dot)
When to use:
- After ingesting documents (indexes are not created automatically)
- After adding significant new data to rebuild the index
- Use haiku-rag info to check index status and see how many chunks are indexed/unindexed
Search behavior:
- Without index: Brute-force kNN search (exact nearest neighbors, slower for large datasets)
- With index: Fast ANN (approximate nearest neighbors) using IVF_PQ
- With stale index: LanceDB combines indexed results (fast ANN) + brute-force kNN on unindexed rows
- Performance degrades as more unindexed data accumulates
Vacuum (Optimize and Cleanup)
Reduce disk usage by optimizing and pruning old table versions across all tables:
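# prune old table versions and compact storage
haiku-rag vacuum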
Automatic Cleanup: Vacuum runs automatically in the background after document operations. By default, it removes versions older than 1 day (configurable via storage.vacuum_retention_seconds), preserving recent versions for concurrent connections. Manual vacuum can be useful for cleanup after bulk operations or to free disk space immediately.
Rebuild Database
Rebuild the database by re-indexing documents. Useful when switching embeddings provider/model or changing chunking settings:
# Full rebuild (default) - re-converts from source files, re-chunks, re-embeds
haiku-rag rebuild
# Re-chunk from stored content (no source file access)
haiku-rag rebuild --rechunk
# Only regenerate embeddings (fastest, keeps existing chunks)
haiku-rag rebuild --embed-only
Rebuild modes:
| Mode | Flag | Use case |
|---|---|---|
| Full | (default) | Changed converter, source files updated |
| Rechunk | --rechunk | Changed chunking strategy or chunk size |
| Embed only | --embed-only | Changed embedding model or vector dimensions |
Download Models
Download required runtime models:
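# a sketch; the download subcommand name is an assumption
haiku-rag download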
This command downloads:
- Docling OCR/conversion models
- HuggingFace tokenizer (for chunking)
- Ollama models referenced in your configuration (embeddings, QA, research, rerank)
Progress is displayed in real-time with download status and progress bars for Ollama model pulls.
Time Travel
LanceDB maintains version history for tables, enabling you to query the database as it existed at a previous point in time. This is useful for:
- Debugging: Investigate data before a problematic change
- Auditing: Verify what knowledge was available when a support ticket was filed
Query Historical State
Use --before to query the database as it existed before a specific datetime:
# Query documents as of January 15, 2025
haiku-rag --before "2025-01-15" list
# Search historical state
haiku-rag --before "2025-01-15T14:30:00" search "machine learning"
# Ask questions against historical data
haiku-rag --before "2025-01-15" ask "What documents existed?"
Supported datetime formats:
- ISO 8601: 2025-01-15T14:30:00, 2025-01-15T14:30:00Z, 2025-01-15T14:30:00+00:00
- Date only: 2025-01-15 (interpreted as start of day)
Note
Time travel mode automatically enables read-only mode. You cannot modify the database while viewing historical state.
Version History
View version history for database tables:
# Show history for all tables
haiku-rag history
# Show history for a specific table
haiku-rag history --table documents
# Limit number of versions shown
haiku-rag history --limit 10
Output shows version numbers and timestamps, sorted newest first:
Version History
documents
v5: 2025-01-15 14:30:00
v4: 2025-01-14 10:00:00
v3: 2025-01-13 09:15:00
chunks
v8: 2025-01-15 14:30:00
v7: 2025-01-14 10:00:00
...
Use the timestamps from history to construct --before queries.