QA and Research Configuration

Question Answering Configuration

Configure the QA workflow:

qa:
  model:
    provider: ollama
    name: gpt-oss
    enable_thinking: false
  max_sub_questions: 3  # Maximum sub-questions for deep QA
  max_iterations: 2     # Maximum search iterations per sub-question
  max_concurrency: 1    # Sub-questions processed in parallel

• model: LLM configuration (see Providers)
• max_sub_questions: maximum number of sub-questions to generate in deep QA mode (default: 3)
• max_iterations: maximum search/evaluate cycles per sub-question (default: 2)
• max_concurrency: number of sub-questions to process in parallel (default: 1)

Deep QA mode (haiku-rag ask --deep) decomposes complex questions into sub-questions, processes them in parallel batches, and synthesizes the results.
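The decompose/batch/synthesize flow described above can be sketched with plain asyncio. This is an illustrative sketch only: the function names (decompose, answer_sub_question, deep_qa) and the batching logic are assumptions for demonstration, not haiku-rag's internal API.

```python
import asyncio

# Illustrative sketch of the deep QA flow; none of these names
# come from haiku-rag itself.
MAX_SUB_QUESTIONS = 3  # mirrors qa.max_sub_questions
MAX_CONCURRENCY = 1    # mirrors qa.max_concurrency

def decompose(question: str) -> list[str]:
    # A real implementation would call the configured LLM here.
    return [f"{question} (aspect {i + 1})" for i in range(MAX_SUB_QUESTIONS)]

async def answer_sub_question(sub_q: str) -> str:
    # Placeholder for up to max_iterations search/evaluate cycles.
    await asyncio.sleep(0)
    return f"answer to: {sub_q}"

async def deep_qa(question: str) -> str:
    sub_questions = decompose(question)
    answers: list[str] = []
    # Process sub-questions in parallel batches of max_concurrency.
    for i in range(0, len(sub_questions), MAX_CONCURRENCY):
        batch = sub_questions[i : i + MAX_CONCURRENCY]
        answers.extend(
            await asyncio.gather(*(answer_sub_question(q) for q in batch))
        )
    # Synthesis step: a real implementation would ask the LLM
    # to combine the partial answers into one response.
    return "\n".join(answers)

result = asyncio.run(deep_qa("What changed between v1 and v2?"))
```

With max_concurrency set to 1 the batches degenerate to sequential processing; raising it trades memory and LLM-provider load for wall-clock time.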

Research Configuration

Configure the multi-agent research workflow:

research:
  model:
    provider: ""            # Empty to inherit from qa
    name: ""                # Empty to inherit from qa
    enable_thinking: false
  max_iterations: 3
  confidence_threshold: 0.8
  max_concurrency: 1

• model: LLM configuration; leave provider and name empty to inherit the qa settings (see Providers)
• max_iterations: maximum search/evaluate cycles (default: 3)
• confidence_threshold: stop when the confidence score meets or exceeds this value (default: 0.8)
• max_concurrency: number of sub-questions searched in parallel per iteration (default: 1)

The research workflow plans sub-questions, searches in parallel batches, evaluates findings, and iterates until reaching the confidence threshold or max iterations.
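The iterate-until-confident loop can be sketched as follows. The helper functions (search, evaluate) and the confidence arithmetic are stand-ins invented for illustration; in the real workflow both steps are driven by the configured LLM.

```python
# Illustrative sketch of the research loop: plan sub-questions, search,
# evaluate, and repeat until confidence_threshold or max_iterations.
MAX_ITERATIONS = 3          # mirrors research.max_iterations
CONFIDENCE_THRESHOLD = 0.8  # mirrors research.confidence_threshold

def search(sub_questions: list[str]) -> list[str]:
    # Stand-in for searching the document base in parallel batches.
    return [f"finding for {q}" for q in sub_questions]

def evaluate(findings: list[str]) -> float:
    # Stand-in: confidence grows as findings accumulate, capped at 1.0.
    return min(1.0, 0.35 * len(findings))

def research(question: str) -> tuple[float, int]:
    # Planning step: decompose the question into sub-questions.
    sub_questions = [f"{question} (sub {i + 1})" for i in range(2)]
    findings: list[str] = []
    confidence = 0.0
    for iteration in range(1, MAX_ITERATIONS + 1):
        findings.extend(search(sub_questions))
        confidence = evaluate(findings)
        if confidence >= CONFIDENCE_THRESHOLD:
            return confidence, iteration  # threshold met: stop early
    return confidence, MAX_ITERATIONS     # budget exhausted

confidence, iterations_used = research("How does indexing work?")
```

In this toy run the loop stops after the second iteration because accumulated findings push the score past 0.8; lowering confidence_threshold trades answer quality for fewer search cycles.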

AG-UI Server Configuration

Configure the AG-UI HTTP server for streaming graph execution events:

agui:
  host: "0.0.0.0"
  port: 8000
  cors_origins: ["*"]
  cors_credentials: true
  cors_methods: ["GET", "POST", "OPTIONS"]
  cors_headers: ["*"]

Start the AG-UI server with:

haiku-rag serve --agui

The server exposes:

• GET /health - Health check endpoint
• POST /v1/agent/stream - Research graph streaming endpoint (Server-Sent Events)

See Server Mode for more details.