haiku.rag

haiku.rag is a Retrieval-Augmented Generation (RAG) library built to work on SQLite alone without the need for external vector databases. It uses sqlite-vec for storing the embeddings and performs semantic (vector) search as well as full-text search combined through Reciprocal Rank Fusion. Both open-source (Ollama) as well as commercial (OpenAI, VoyageAI) embedding providers are supported.

Features

Local SQLite: No need to run additional servers
Support for various embedding providers: Ollama, VoyageAI, OpenAI or add your own
Hybrid Search: Vector search using sqlite-vec combined with full-text search FTS5, using Reciprocal Rank Fusion
Question Answering: Built-in QA agents using Ollama or OpenAI.
File monitoring: Automatically index files when run as a server
Extended file format support: Parse 40+ file formats including PDF, DOCX, HTML, Markdown, audio and more. Or add a URL!
MCP server: Exposes functionality as MCP tools
CLI commands: Access all functionality from your terminal
Python client: Call haiku.rag from your own python applications

Quick Start

Install haiku.rag:

uv pip install haiku.rag

Use from Python:

from haiku.rag.client import HaikuRAG

async with HaikuRAG("database.db") as client:
    # Add a document
    doc = await client.create_document("Your content here")

    # Search documents
    results = await client.search("query")

    # Ask questions
    answer = await client.ask("Who is the author of haiku.rag?")

Or use the CLI:

haiku-rag add "Your document content"
haiku-rag search "query"
haiku-rag ask "Who is the author of haiku.rag?"

Documentation

Installation - Install haiku.rag with different providers
Configuration - Environment variables and settings
CLI - Command line interface usage
Question Answering - QA agents and natural language queries
Server - File monitoring and server mode
MCP - Model Context Protocol integration
Python - Python API reference

License

This project is licensed under the MIT License.