# Configuration

Configuration is done through environment variables.
> **Note:** If you create a database with certain settings and later change them (for example, switching the embedding provider), haiku.rag will detect the incompatibility and exit. You can rebuild the database to apply the new settings; see Rebuild Database.
## File Monitoring
Set directories to monitor for automatic indexing:
```bash
# Monitor a single directory
MONITOR_DIRECTORIES="/path/to/documents"

# Monitor multiple directories (comma-separated)
MONITOR_DIRECTORIES="/path/to/documents,/another_path/to/documents"
```
## Embedding Providers
### Ollama (Default)

If you use Ollama, you can use any pulled model that supports embeddings.
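For example (the model name and vector dimension below are illustrative assumptions; pick any pulled embedding model and set the dimension it produces):

```bash
# Model and dimension are illustrative assumptions - match them to your model
EMBEDDINGS_PROVIDER="ollama"
EMBEDDINGS_MODEL="mxbai-embed-large"
EMBEDDINGS_VECTOR_DIM=1024
OLLAMA_BASE_URL="http://localhost:11434"
```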
### VoyageAI

If you want to use VoyageAI embeddings, you will need to install haiku.rag with the VoyageAI extras:
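```bash
# Extra name is an assumption; adjust to your setup
pip install "haiku.rag[voyageai]"
```

Then set the environment variables: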
EMBEDDINGS_PROVIDER="voyageai"
EMBEDDINGS_MODEL="voyage-3.5"
EMBEDDINGS_VECTOR_DIM=1024
VOYAGE_API_KEY="your-api-key"
### OpenAI

If you want to use OpenAI embeddings, you will need to install haiku.rag with the OpenAI extras:
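```bash
# Extra name is an assumption; adjust to your setup
pip install "haiku.rag[openai]"
```

Then set the environment variables: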
EMBEDDINGS_PROVIDER="openai"
EMBEDDINGS_MODEL="text-embedding-3-small" # or text-embedding-3-large
EMBEDDINGS_VECTOR_DIM=1536
OPENAI_API_KEY="your-api-key"
## Question Answering Providers
Configure which LLM provider to use for question answering.
### Ollama (Default)
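By default, question answering runs against your local Ollama server. For example (the model name below is an illustrative assumption; use any pulled model you like):

```bash
# Model name is an illustrative assumption
QA_PROVIDER="ollama"
QA_MODEL="qwen3"
OLLAMA_BASE_URL="http://localhost:11434"
```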
### OpenAI
For OpenAI QA, you need to install haiku.rag with OpenAI extras:
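```bash
# Extra name is an assumption; adjust to your setup
pip install "haiku.rag[openai]"
```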
Then configure:
QA_PROVIDER="openai"
QA_MODEL="gpt-4o-mini" # or gpt-4, gpt-3.5-turbo, etc.
OPENAI_API_KEY="your-api-key"
### Anthropic
For Anthropic QA, you need to install haiku.rag with Anthropic extras:
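```bash
# Extra name is an assumption; adjust to your setup
pip install "haiku.rag[anthropic]"
```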
Then configure:
QA_PROVIDER="anthropic"
QA_MODEL="claude-3-5-haiku-20241022" # or claude-3-5-sonnet-20241022, etc.
ANTHROPIC_API_KEY="your-api-key"
## Reranking
Reranking improves search quality by re-ordering the initial search results using specialized models. When enabled, the system retrieves more candidates (3x the requested limit) and then reranks them to return the most relevant results.
Reranking is enabled by default: it uses Ollama out of the box, or the appropriate provider if you install one of the reranking packages below.
### Disabling Reranking
To disable reranking completely for faster searches:
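A minimal sketch, assuming an empty provider string turns reranking off:

```bash
# Assumption: an empty RERANK_PROVIDER disables reranking
RERANK_PROVIDER=""
```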
### Ollama (Default)
Ollama reranking uses LLMs with structured output to rank documents by relevance:
RERANK_PROVIDER="ollama"
RERANK_MODEL="qwen3:1.7b" # or any model that supports structured output
OLLAMA_BASE_URL="http://localhost:11434"
### MixedBread AI
For MxBAI reranking, install with mxbai extras:
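```bash
# Extra name is an assumption; adjust to your setup
pip install "haiku.rag[mxbai]"
```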
Then configure:
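The provider and model values below are illustrative assumptions; substitute the MxBAI reranker you want to use:

```bash
# Provider and model values are illustrative assumptions
RERANK_PROVIDER="mxbai"
RERANK_MODEL="mixedbread-ai/mxbai-rerank-base-v2"
```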
### Cohere
For Cohere reranking, install with Cohere extras:
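```bash
# Extra name is an assumption; adjust to your setup
pip install "haiku.rag[cohere]"
```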
Then configure:
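The provider value below is an illustrative assumption; `rerank-v3.5` is one of Cohere's rerank models:

```bash
# Provider value is an illustrative assumption
RERANK_PROVIDER="cohere"
RERANK_MODEL="rerank-v3.5"
COHERE_API_KEY="your-api-key"
```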
## Other Settings
### Database and Storage
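A sketch of the storage setting; the variable name below is an assumption for illustration:

```bash
# Assumption: variable pointing haiku.rag at its data directory
DEFAULT_DATA_DIR="/path/to/data"
```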
### Document Processing
```bash
# Chunk size for document processing
CHUNK_SIZE=256

# Number of adjacent chunks to include before/after retrieved chunks for context
# 0 = no expansion (default), 1 = include 1 chunk before and after, etc.
# When expanded chunks overlap or are adjacent, they are automatically merged
# into single chunks with continuous content to eliminate duplication
CONTEXT_CHUNK_RADIUS=0
```