request_limit on Skill: Optional integer field that overrides the default 20-request usage limit when running a skill via run_skill(). Long-running skills (e.g. analysis skills that interleave search and code execution) can raise their ceiling without monkey-patching.
run_skill is now public: Re-exported from haiku.skills. Replaces the previously underscore-prefixed _run_skill.
Changed
_run_skill renamed to run_skill: Breaking. Update imports from from haiku.skills.agent import _run_skill to from haiku.skills.agent import run_skill (or from haiku.skills import run_skill).
haiku-skills chat --initial-state-path PATH: Seed the AG-UI state from a YAML or JSON file at launch. Values are deep-merged into each namespace's defaults, so partial overrides are supported. The seeded state is preserved when clearing the chat.
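As a sketch, a minimal seed file might look like this (the `web` and `notifications` namespaces and their fields are hypothetical; use whatever namespaces your skills actually declare):

```json
{
  "web": { "last_query": "pydantic-ai releases" },
  "notifications": { "unread_count": 3 }
}
```

Saved as `state.json`, it would be passed at launch with `haiku-skills chat --initial-state-path state.json`; any fields not present in the file keep their namespace defaults thanks to the deep merge.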
Editable state modal in the chat TUI: Open via the system command palette ("Edit state"). JSON is editable in place with syntax highlighting; ctrl+s validates the edit through each namespace's Pydantic model and applies it to the next turn; ctrl+c copies the current selection to the clipboard (useful for saving a session to reuse with --initial-state-path); escape cancels. The command is hidden while a turn is in flight.
lifespan on Skill: Optional async context manager factory called once per skill invocation (one _run_skill() call, i.e. one sub-agent run). The factory receives the skill's deps; use it to set up and tear down per-invocation resources (e.g. a database client opened once and reused across tool calls, a counter scoped to the invocation). Pair with deps_type= to give tools typed access via ctx.deps.<field>. Strictly additive and opt-in — skills without a lifespan are unaffected. Not wired for direct-tool mode (SkillToolset with use_subagents=False) — that path has no well-defined invocation boundary.
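The per-invocation pattern can be sketched with a plain async context manager; the names here (`AnalysisDeps`, `analysis_lifespan`) are illustrative, not the library's API — the real factory is attached to the Skill and driven by the sub-agent run:

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass, field

@dataclass
class AnalysisDeps:
    # hypothetical deps: one client shared by every tool call in an invocation
    db: dict = field(default_factory=dict)

@asynccontextmanager
async def analysis_lifespan(deps: AnalysisDeps):
    deps.db["conn"] = "open"        # set up once per skill invocation
    try:
        yield deps
    finally:
        deps.db["conn"] = "closed"  # torn down when the sub-agent run ends

async def demo() -> list[str]:
    states = []
    deps = AnalysisDeps()
    async with analysis_lifespan(deps):
        states.append(deps.db["conn"])  # tools would see the open client here
    states.append(deps.db["conn"])
    return states

print(asyncio.run(demo()))  # → ['open', 'closed']
```

Every tool call inside the `async with` block reuses the same resource; nothing leaks across invocations.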
Changed
Bundled code-execution and sandbox skills migrated to lifespan: haiku-skills-code-execution now uses a MontyRepl populated by a lifespan, so variables and definitions declared in one run_code call persist into subsequent run_code calls within the same sub-agent invocation. haiku-skills-sandbox replaces its __post_init__ deps setup with a named sandbox_lifespan; session-scoped container persistence (via SandboxState.session_id) is unchanged.
run_agui_stream() signature: toolset is now optional and keyword-only. The adapter argument moves to first position. Callers should update from run_agui_stream(toolset, adapter) to run_agui_stream(adapter, toolset=toolset). When toolset is None, the stream yields adapter events without skill event merging.
Skill.reconfigure() now copies instructions and resources: Previously, reconfigure() silently dropped instructions and resources from the factory-produced skill, causing factory-generated instructions (e.g. config-dependent preambles) to be lost after reconfiguration.
ActivitySnapshotEvent timestamps: Events now carry a millisecond timestamp set at creation time. Previously, events were emitted with timestamp=None and stamped downstream in a batch, causing all events from a skill sub-agent run to share the same timestamp. Events are also now converted eagerly as they arrive rather than batch-converted after the skill finishes.
SkillRunDepsProtocol: New @runtime_checkable Protocol that formalizes the contract for skill sub-agent deps (state + emit). The default SkillRunDeps dataclass satisfies it, and so does any subclass.
deps_type on Skill: Skills can declare a deps_type — any class satisfying SkillRunDepsProtocol. When set, the sub-agent is created with deps_type(state=state, emit=emit) instead of the default SkillRunDeps. This enables skills to integrate external toolsets that require additional context on the deps object.
Sandbox skill (haiku-skills-sandbox): New skill package for Docker-based Python execution via pydantic-ai-backend. Runs code in an isolated container with pre-installed data science packages (pandas, numpy, scipy, matplotlib) and host filesystem access. Features idle timeout with automatic container cleanup, session-bound containers via AG-UI state, and configurable workspace mounting.
SkillsCapability: New pydantic-ai capability wrapping SkillToolset + system prompt. Provides a single-line integration path via Agent(capabilities=[SkillsCapability(...)]). SkillToolset remains available for advanced use cases.
Skill thinking configuration: Skills can specify thinking effort level (True, 'low', 'medium', 'high', etc.) to configure reasoning on their sub-agents. Supported across providers via pydantic-ai's unified thinking setting.
Skill extras: Skills can carry arbitrary non-tool data via extras: dict[str, Any]. Useful for exposing utility functions or other resources that the consuming app needs but that aren't agent tools.
Fixed
Script timeout: run_script now enforces a timeout (default 120s, configurable via HAIKU_SKILLS_SCRIPT_TIMEOUT env var). Previously a hanging script would block the agent forever.
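The behavior can be sketched with `subprocess.run`'s own timeout handling (a stand-in for the internal implementation; `run_with_timeout` is not the library's API):

```python
import subprocess
import sys

def run_with_timeout(code: str, timeout: float) -> str:
    # kill a hanging script instead of blocking the agent forever
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.stdout.strip()
    except subprocess.TimeoutExpired:
        return "timed out"

print(run_with_timeout("print('ok')", timeout=5))                    # → ok
print(run_with_timeout("import time; time.sleep(60)", timeout=0.5))  # → timed out
```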
Changed
Bump pydantic-ai dependency from >=1.63.0 to >=1.71.0.
AG-UI state restoration now uses pydantic-ai's for_run() hook instead of overriding get_tools().
Skill reconfiguration: Entrypoint skills can be reconfigured after discovery via skill.reconfigure(**kwargs). The stored factory is re-invoked with the given arguments, replacing tools, state, and model while preserving metadata and identity. This allows consuming apps to override factory parameters (e.g. config, database path) without bypassing entry point discovery.
Optional sub-agent delegation: SkillToolset(use_subagents=False) exposes skill tools directly to the main agent via query_skill, execute_skill_tool, run_skill_script, and read_skill_resource — bypassing sub-agent LLM loops for lower latency and cost. Default (use_subagents=True) preserves existing behavior.
--no-subagents CLI flag: haiku-skills chat --no-subagents runs the TUI in direct mode.
Comprehensive integration tests: VCR-recorded tests exercising all tool types (execute_skill, query_skill, execute_skill_tool, read_skill_resource, run_skill_script) across both execution modes (subagent/direct) and skill sources (entrypoint/filesystem), with AG-UI event and state assertions.
Changed
execute_skill_tool returns raw values: Tool results are passed through as-is instead of being JSON-serialized, consistent with pydantic-ai's ToolReturnContent support.
Fixed
Activity snapshot message_id now stable: Result snapshots share the same message_id as their corresponding call snapshot, so AG-UI frontends update activities in place instead of showing duplicates. Call snapshots use replace=False (create), result snapshots use replace=True (update).
Chat TUI preserves full message history: Tool calls and their results are now retained across turns via pydantic-ai message history, so the LLM no longer re-invokes tools for information it already retrieved.
Spec-compliant skill directory layout: Scripts now live alongside SKILL.md (e.g. web/scripts/search.py) instead of in a separate package-level scripts/ directory
Skill directory renames: Renamed code-execution → codeexecution, image-generation → imagegeneration (skill dirs are now Python packages, which require valid identifiers)
Named CLI flags for scripts: All scripts use argparse with --flag value syntax and support --help. script_tools.py passes named args instead of positional
Gmail extracted into standalone scripts: Auth, helpers, and all 8 operations (search, read, send, reply, draft, list drafts, modify labels, list labels) are now standalone scripts with argparse CLI interfaces. __init__.py is a thin wrapper with state tracking
SKILL.md script documentation: All SKILL.md files now document available scripts with CLI flags and descriptions
CI signature verification: validate-skills workflow now verifies skill signatures (integrity-only)
Documentation reorganized: Replaced quickstart.md, skill-sources.md, and examples.md with a single progressive tutorial.md. Cleaned skills.md into a pure reference page. Removed duplicated state section from ag-ui.md
Added
Skill signing and verification: Identity-based signing via sigstore. Sign skills with sign_skill(), verify with TrustedIdentity on registry/discovery. Install with uv pip install "haiku.skills[signing]"
haiku-skills sign command: Sign a skill directory via CLI with browser-based OIDC or ambient CI credentials
haiku-skills verify command: Verify a signed skill against trusted identities (--identity/--issuer) or check cryptographic integrity only (--unsafe)
Custom event emission from skill tools: SkillRunDeps now has an emit callback that skill tools can use to emit AG-UI BaseEvent subclasses (e.g. CustomEvent) during execution. Events are flushed through the event sink at tool-call boundaries (real-time path) or returned in ToolReturn.metadata (batched path).
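The shape of the pattern, with plain stand-ins (`Deps`, `progress_tool`, and the event dicts are illustrative — the real callback receives AG-UI BaseEvent instances):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Deps:
    state: dict
    emit: Callable[[dict], None]  # injected by the toolset in the real library

def progress_tool(deps: Deps, total: int) -> str:
    # a tool emitting a custom progress event per item processed
    for i in range(total):
        deps.emit({"type": "CUSTOM", "name": "progress", "value": (i + 1) / total})
    return "done"

collected: list[dict] = []
deps = Deps(state={}, emit=collected.append)
progress_tool(deps, total=2)
print([e["value"] for e in collected])  # → [0.5, 1.0]
```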
Changed
code-execution skill: Rewritten from a sync fd-dup hack to async run_monty_async, exposing await llm(prompt) as an external function so sandbox code can make one-shot LLM calls for per-item reasoning (classify, summarize, extract) in loops
Gmail skill (haiku-skills-gmail): Search, read, send, reply, draft, and label Gmail emails via the Google Gmail API with OAuth2 authentication
Notifications skill (haiku-skills-notifications): Send and receive push notifications via ntfy.sh — with send_notification and read_notifications tools, per-skill state tracking, self-hosted server support, and optional bearer token authentication
Removed
Graphiti memory skill (haiku-skills-graphiti-memory): Removed the knowledge graph memory skill and all associated code, tests, and configuration
code-execution skill: Updated pydantic-monty to >=0.0.8 and rewrote the SKILL.md sandbox limitations to reflect new capabilities (math, re, os.environ, getattr, dataclass methods, PEP 448 unpacking)
Sub-agent tool events emitted as ActivitySnapshotEvent instead of ToolCall* events, fixing AG-UI history replay crashes in conforming clients (CopilotKit/soliplex)
_events_to_agui crash on RetryPromptPart: Handle RetryPromptPart results in FunctionToolResultEvent by calling .model_response() instead of .model_response_str(), which doesn't exist on retry parts (#35)
Main agent prompt: Emphasize that skills are isolated agents with no shared context — the main agent must include concrete data when chaining skills and must synthesize skill responses for the user
Missing openai extra in core dependency: pydantic-ai-slim[mcp] → pydantic-ai-slim[mcp,openai] — most users hit ImportError: Please install openai on first use
CLI unusable without [tui] extra: typer and python-dotenv are now lazy-loaded with a friendly error message instead of crashing with ModuleNotFoundError
Independent skill package publishing: Skill packages (haiku-skills-web, etc.) can now be published to PyPI independently from the core package using skills-v* release tags (#27)
Bump script updates skill packages: bump_version.py now updates version and haiku.skills>= dependency constraint in all skills/*/pyproject.toml files
Skill package PyPI metadata: All 4 skill packages now include authors, license, readme, keywords, classifiers, and project URLs
Skill package READMEs: haiku-skills-web, haiku-skills-image-generation, and haiku-skills-code-execution now have READMEs with prerequisites, configuration, tools, and installation instructions
Fixed
Missing core dependencies: ag-ui-protocol and jsonpatch moved from optional [ag-ui] extra to core dependencies — a clean install of haiku.skills no longer fails with ModuleNotFoundError: No module named 'ag_ui'
graphiti-memory recall returns empty results: Switched recall() and forget() from client.search() to client.search_() with BM25 + cosine + BFS graph traversal, RRF reranking, and sim_min_score=0.0 so cosine always returns candidates for BFS to expand on
graphiti-memory cross-encoder crash: _build_cross_encoder() now passes an AsyncOpenAI client directly to OpenAIRerankerClient instead of the graphiti OpenAIGenericClient wrapper, which lacked the .chat attribute the reranker needs
Changed
generate_image returns file path: The image generation tool now returns the file path directly instead of a markdown image reference
Main agent prompt: Instructs the agent to present skill results exactly as returned, without fabricating or rewriting content
discover_from_paths collects all validation errors: Returns tuple[list[Skill], list[SkillValidationError]] instead of raising on the first broken skill — valid skills are still loaded while errors are collected (#25)
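The collect-instead-of-raise pattern, sketched with stand-in types (`ValidationError`, `discover`, and the name-based rule are illustrative, not the library's actual validation):

```python
from pathlib import Path

class ValidationError(ValueError):
    def __init__(self, path: Path, msg: str):
        super().__init__(msg)
        self.path = path  # mirrors SkillValidationError's .path attribute

def discover(paths: list[Path]) -> tuple[list[str], list[ValidationError]]:
    skills, errors = [], []
    for p in paths:
        try:
            if not p.name.isidentifier():  # stand-in validation rule
                raise ValidationError(p, f"invalid skill name: {p.name}")
            skills.append(p.name)
        except ValidationError as e:
            errors.append(e)  # keep going instead of aborting discovery
    return skills, errors

skills, errors = discover([Path("web"), Path("bad-name"), Path("rag")])
print(skills, [str(e) for e in errors])
# → ['web', 'rag'] ['invalid skill name: bad-name']
```

One broken skill no longer hides the valid ones; callers decide whether the errors are fatal.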
SkillRegistry.discover returns errors: Returns list[SkillValidationError] instead of None, propagating errors from discover_from_paths
CLI prints discovery warnings: list and chat commands print validation errors as warnings to stderr instead of aborting
Added
SkillValidationError: ValueError subclass with a .path attribute, exported from haiku.skills
StateMetadata: Frozen dataclass with namespace, type, and schema fields, exported from haiku.skills
Skill.state_metadata(): Returns a StateMetadata for skills that declare state; None otherwise
Real-time sub-agent event streaming: run_agui_stream() merges main-agent and sub-agent AG-UI events into a single stream, so sub-agent tool calls (search, fetch, etc.) appear in real-time instead of batching until execute_skill returns
Changed
Sub-agent output: _run_skill now returns the model's final response (result.output) instead of the last tool's raw return value — state and structured data are already handled via the snapshot/delta mechanism
Event sink on SkillToolset: _run_skill accepts an optional event_sink callback; when active, sub-agent tool events stream through the sink immediately rather than collecting in batch
SkillRunDeps simplified: Removed _collected_events field — event collection is now closure-based inside _run_skill
Graphiti memory skill (haiku-skills-graphiti-memory): Store, recall, and forget memories using a knowledge graph powered by Graphiti and FalkorDB — with per-skill state tracking
Changed
SkillMetadata.allowed_tools accepts strings: Now accepts both str (space-separated) and list[str] as input, always stores list[str] — eliminates conversion overhead for consumers using the spec's string format (#19)
Skill.model accepts Model instances: Widened from str | None to str | Model | None so consumers can pass configured model objects directly (#20)
discover_from_paths accepts single-skill directories: Paths that contain SKILL.md directly are now treated as skill directories, in addition to parent directories containing skill subdirectories. Dot-directories are skipped during child iteration.
Fixed
Ollama base URL handling: resolve_model() now appends /v1 to OLLAMA_BASE_URL instead of expecting it in the env var, consistent with Ollama's convention
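A sketch of the normalization (assuming, as seems natural, that an already-suffixed URL is left alone; `normalize_ollama_base_url` is illustrative, not the library's function):

```python
def normalize_ollama_base_url(base_url: str) -> str:
    # append the /v1 suffix expected by Ollama's OpenAI-compatible endpoint,
    # so OLLAMA_BASE_URL can stay a bare host URL
    base = base_url.rstrip("/")
    return base if base.endswith("/v1") else base + "/v1"

print(normalize_ollama_base_url("http://127.0.0.1:11434"))
# → http://127.0.0.1:11434/v1
```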
Web skill fetch_page for non-HTML content: Pages with non-HTML content types (e.g. plain text, markdown) are now returned directly instead of failing with "could not extract content"
build_system_prompt() utility: Standalone function to build the main agent system prompt from a skill catalog, with optional custom preamble — replaces SkillToolset.system_prompt property
Changed
Entrypoint skill priority: Skills passed via skills= now take priority over entrypoint-discovered skills — entrypoints with the same name are silently skipped instead of raising a duplicate error
Sub-agent request limit: Increased from 10 to 20 to allow skills with more complex tool chains to complete
Chat TUI tool call display: Tool call widgets now stream argument updates and show richer descriptions (e.g. execute_skill → web: search for ...)
Removed
SkillToolset.system_prompt: Use build_system_prompt(toolset.skill_catalog) instead
skill_model parameter: SkillToolset accepts skill_model to set the model for skill sub-agents (also available as --skill-model CLI option)
resolve_model(): Resolves model strings with transparent ollama: prefix handling (defaults to http://127.0.0.1:11434 when OLLAMA_BASE_URL is unset)
run_script tool: Skill sub-agents can execute scripts from the skill's scripts/ directory via a run_script tool, supporting .py, .sh, .js, .ts, and generic executables with path validation
JS/TS script support: run_script dispatches .js files via node and .ts files via npx tsx; extensible via SCRIPT_RUNNERS mapping
Changed
Script tool execution: Scripts are now invoked with CLI positional arguments (sys.argv + print()) instead of JSON on stdin/stdout, matching standard CLI conventions and enabling compatibility with external skill scripts
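The convention in miniature — arguments arrive via sys.argv and results go to stdout (the inline script is a toy, not one of the bundled skill scripts):

```python
import subprocess
import sys

script = "import sys; print(' '.join(sys.argv[1:]).upper())"
proc = subprocess.run(
    [sys.executable, "-c", script, "hello", "world"],
    capture_output=True, text=True,
)
print(proc.stdout.strip())  # → HELLO WORLD
```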
Resilient script discovery: discover_script_tools() now skips scripts without a main() function (with a warning) instead of crashing
Fixed
Script failure error reporting: Script error messages now include stdout when stderr is empty, so usage messages and other stdout-based errors are visible to the sub-agent
Script sibling imports: run_script and typed script tools now set PYTHONPATH to the skill directory so scripts can use package-style imports (e.g. from scripts.utils import ...)
AG-UI state restoration: SkillToolset now restores skill namespace state from frontend-provided deps.state on each AG-UI request, so state survives server restarts
Removed
RAG skill package (haiku-skills-rag): Moved to haiku.rag
haiku-skills validate command: Validate skill directories against the Agent Skills specification using skills-ref
Unknown frontmatter rejection: SkillMetadata now rejects unknown fields (extra="forbid")
skills-ref dependency: Reference implementation used for spec-compliant validation
Changed
Distributable skill directory layout: SKILL.md moved into a subdirectory matching the skill name (e.g. haiku_skills_web/web/SKILL.md) so all bundled skills pass directory-name validation
haiku-skills list command: List discovered skills with name and description, supports -s/--skill-path and --use-entrypoints
--skill / -k option for chat: Filter which skills to activate by name (repeatable)
RAG skill package (haiku-skills-rag): Search, retrieve and analyze documents via haiku.rag with tools for hybrid search, document listing/retrieval, QA with citations, and code-execution analysis
Web skill package (haiku-skills-web): Web search via Brave Search API and page content extraction via trafilatura (replaces haiku-skills-brave-search)
Per-skill state: Skills can declare a state_type (Pydantic BaseModel) and state_namespace; state is passed to tool functions via RunContext[SkillRunDeps] and tracked per namespace on the toolset
AG-UI protocol: SkillToolset emits StateDeltaEvent (JSON Patch) when skill execution changes state, compatible with the AG-UI protocol
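A StateDeltaEvent carries RFC 6902 JSON Patch operations scoped under the skill's namespace; the shape can be illustrated with a minimal replace-only applier (real clients should use a full implementation such as jsonpatch — the namespace and field names here are hypothetical):

```python
def apply_patch(state: dict, ops: list[dict]) -> dict:
    # minimal "replace"-only applier, for illustration only
    for op in ops:
        assert op["op"] == "replace"
        *parents, leaf = op["path"].strip("/").split("/")
        target = state
        for key in parents:
            target = target[key]
        target[leaf] = op["value"]
    return state

state = {"web": {"last_query": None}}
delta = [{"op": "replace", "path": "/web/last_query", "value": "pydantic-ai"}]
print(apply_patch(state, delta))  # → {'web': {'last_query': 'pydantic-ai'}}
```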
State API on SkillToolset: build_state_snapshot(), restore_state_snapshot(), get_namespace(), state_schemas
In-process tools with state: Distributable skills (web, image-generation, code-execution, rag) converted from script-based to in-process tool functions that can read and write per-skill state
Changed
Skills fully loaded at discovery: Instructions, script tools, and resources are loaded when skills are discovered, removing the separate activation step
Chat TUI rewritten as AG-UI client: Uses AGUIAdapter event stream instead of polling; inline state delta display and a "View state" modal via the command palette
Skill name validation: Now accepts unicode lowercase alphanumeric characters per the Agent Skills specification (previously ASCII-only)
SkillRegistry: Central registry for skill discovery, loading, lookup, and activation
Progressive disclosure: Three-level progressive disclosure — metadata at startup, instructions on activation, resources on demand
Sub-agent delegation: Each skill runs in a focused sub-agent with its own system prompt and tools via execute_skill
SkillToolset: FunctionToolset integration that exposes skills as tools for any pydantic-ai Agent
Script tools: Python scripts in scripts/ with main() function get AST-parsed into typed pydantic-ai Tool objects with automatic parameter schema extraction
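The extraction step can be sketched with the standard library's ast module (the real extraction also derives a full parameter schema and defaults; the sample script is hypothetical):

```python
import ast

source = '''
def main(query: str, max_results: int = 5) -> str:
    """Search the web."""
    return query
'''

tree = ast.parse(source)
fn = next(n for n in tree.body
          if isinstance(n, ast.FunctionDef) and n.name == "main")
params = [(a.arg, ast.unparse(a.annotation)) for a in fn.args.args]
print(params)                 # → [('query', 'str'), ('max_results', 'int')]
print(ast.get_docstring(fn))  # → Search the web.
```

Parsing (rather than importing) means a script's parameters can be discovered without executing any of its top-level code.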
Resource reading: Skills can expose files (references, assets, templates) as resources; sub-agents read them on demand via read_resource tool with path validation and traversal defense
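The traversal defense amounts to resolving the requested path and refusing anything that escapes the skill directory (`safe_resource_path` and the paths below are illustrative, not the library's implementation):

```python
from pathlib import Path

def safe_resource_path(skill_dir: Path, requested: str) -> Path:
    resolved = (skill_dir / requested).resolve()
    if not resolved.is_relative_to(skill_dir.resolve()):
        raise PermissionError(f"path escapes skill directory: {requested}")
    return resolved

skill_dir = Path("skills/web")
print(safe_resource_path(skill_dir, "references/spec.md").name)  # → spec.md
try:
    safe_resource_path(skill_dir, "../../etc/passwd")
except PermissionError:
    print("blocked")  # → blocked
```

Resolving before comparing is the important part: it defeats `..` segments and symlink tricks that a plain string-prefix check would miss.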
MCP integration: skill_from_mcp() maps MCP servers directly to skills
Chat TUI: Terminal-based chat interface using Textual
Distributable skill packages: Workspace members for brave-search, image-generation, and code-execution skills