request_limit on Skill: Optional integer field that overrides the default 20-request usage limit when running a skill via run_skill(). Long-running skills (e.g. analysis skills that interleave search and code execution) can raise their ceiling without monkey-patching.
run_skill is now public: Re-exported from haiku.skills. Replaces the previously underscore-prefixed _run_skill.
Changed
_run_skill renamed to run_skill: Breaking. Update imports from from haiku.skills.agent import _run_skill to from haiku.skills.agent import run_skill (or from haiku.skills import run_skill).
haiku-skills chat --initial-state-path PATH: Seed the AG-UI state from a YAML or JSON file at launch. Values are deep-merged into each namespace's defaults, so partial overrides are supported. The seeded state is preserved when clearing the chat.
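As a sketch, a minimal seed file might look like this (the `web` and `notifications` namespaces and their fields are hypothetical; use whatever namespaces your skills actually declare):

```json
{
  "web": { "last_query": "pydantic-ai releases" },
  "notifications": { "unread_count": 3 }
}
```

Saved as `state.json`, it would be passed at launch with `haiku-skills chat --initial-state-path state.json`; any fields not present in the file keep their namespace defaults thanks to the deep merge.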
Editable state modal in the chat TUI: Open via the system command palette ("Edit state"). JSON is editable in place with syntax highlighting; ctrl+s validates the edit through each namespace's Pydantic model and applies it to the next turn; ctrl+c copies the current selection to the clipboard (useful for saving a session to reuse with --initial-state-path); escape cancels. The command is hidden while a turn is in flight.
lifespan on Skill: Optional async context manager factory called once per skill invocation (one _run_skill() call, i.e. one sub-agent run). The factory receives the skill's deps; use it to set up and tear down per-invocation resources (e.g. a database client opened once and reused across tool calls, a counter scoped to the invocation). Pair with deps_type= to give tools typed access via ctx.deps.<field>. Strictly additive and opt-in — skills without a lifespan are unaffected. Not wired for direct-tool mode (SkillToolset with use_subagents=False) — that path has no well-defined invocation boundary.
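The per-invocation pattern can be sketched with a plain async context manager; the names here (`AnalysisDeps`, `analysis_lifespan`) are illustrative, not the library's API — the real factory is attached to the Skill and driven by the sub-agent run:

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass, field

@dataclass
class AnalysisDeps:
    # hypothetical deps: one client shared by every tool call in an invocation
    db: dict = field(default_factory=dict)

@asynccontextmanager
async def analysis_lifespan(deps: AnalysisDeps):
    deps.db["conn"] = "open"        # set up once per skill invocation
    try:
        yield deps
    finally:
        deps.db["conn"] = "closed"  # torn down when the sub-agent run ends

async def demo() -> list[str]:
    states = []
    deps = AnalysisDeps()
    async with analysis_lifespan(deps):
        states.append(deps.db["conn"])  # tools would see the open client here
    states.append(deps.db["conn"])
    return states

print(asyncio.run(demo()))  # → ['open', 'closed']
```

Every tool call inside the `async with` block reuses the same resource; nothing leaks across invocations.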
Changed
Bundled code-execution and sandbox skills migrated to lifespan: haiku-skills-code-execution now uses a MontyRepl populated by a lifespan, so variables and definitions declared in one run_code call persist into subsequent run_code calls within the same sub-agent invocation. haiku-skills-sandbox replaces its __post_init__ deps setup with a named sandbox_lifespan; session-scoped container persistence (via SandboxState.session_id) is unchanged.
run_agui_stream() signature: toolset is now optional and keyword-only. The adapter argument moves to first position. Callers should update from run_agui_stream(toolset, adapter) to run_agui_stream(adapter, toolset=toolset). When toolset is None, the stream yields adapter events without skill event merging.
Skill.reconfigure() now copies instructions and resources: Previously, reconfigure() silently dropped instructions and resources from the factory-produced skill, causing factory-generated instructions (e.g. config-dependent preambles) to be lost after reconfiguration.
ActivitySnapshotEvent timestamps: Events now carry a millisecond timestamp set at creation time. Previously, events were emitted with timestamp=None and stamped downstream in a batch, causing all events from a skill sub-agent run to share the same timestamp. Events are also now converted eagerly as they arrive rather than batch-converted after the skill finishes.
SkillRunDepsProtocol: New @runtime_checkable Protocol that formalizes the contract for skill sub-agent deps (state + emit). The default SkillRunDeps dataclass satisfies it, and so does any subclass.
deps_type on Skill: Skills can declare a deps_type — any class satisfying SkillRunDepsProtocol. When set, the sub-agent is created with deps_type(state=state, emit=emit) instead of the default SkillRunDeps. This enables skills to integrate external toolsets that require additional context on the deps object.
Sandbox skill (haiku-skills-sandbox): New skill package for Docker-based Python execution via pydantic-ai-backend. Runs code in an isolated container with pre-installed data science packages (pandas, numpy, scipy, matplotlib) and host filesystem access. Features idle timeout with automatic container cleanup, session-bound containers via AG-UI state, and configurable workspace mounting.
SkillsCapability: New pydantic-ai capability wrapping SkillToolset + system prompt. Provides a single-line integration path via Agent(capabilities=[SkillsCapability(...)]). SkillToolset remains available for advanced use cases.
Skill thinking configuration: Skills can specify thinking effort level (True, 'low', 'medium', 'high', etc.) to configure reasoning on their sub-agents. Supported across providers via pydantic-ai's unified thinking setting.
Skill extras: Skills can carry arbitrary non-tool data via extras: dict[str, Any]. Useful for exposing utility functions or other resources that the consuming app needs but that aren't agent tools.
Fixed
Script timeout: run_script now enforces a timeout (default 120s, configurable via HAIKU_SKILLS_SCRIPT_TIMEOUT env var). Previously a hanging script would block the agent forever.
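The behavior can be sketched with `subprocess.run`'s own timeout handling (a stand-in for the internal implementation; `run_with_timeout` is not the library's API):

```python
import subprocess
import sys

def run_with_timeout(code: str, timeout: float) -> str:
    # kill a hanging script instead of blocking the agent forever
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.stdout.strip()
    except subprocess.TimeoutExpired:
        return "timed out"

print(run_with_timeout("print('ok')", timeout=5))                    # → ok
print(run_with_timeout("import time; time.sleep(60)", timeout=0.5))  # → timed out
```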
Changed
Bump pydantic-ai dependency from >=1.63.0 to >=1.71.0.
AG-UI state restoration now uses pydantic-ai's for_run() hook instead of overriding get_tools().
Skill reconfiguration: Entrypoint skills can be reconfigured after discovery via skill.reconfigure(**kwargs). The stored factory is re-invoked with the given arguments, replacing tools, state, and model while preserving metadata and identity. This allows consuming apps to override factory parameters (e.g. config, database path) without bypassing entry point discovery.
Optional sub-agent delegation: SkillToolset(use_subagents=False) exposes skill tools directly to the main agent via query_skill, execute_skill_tool, run_skill_script, and read_skill_resource — bypassing sub-agent LLM loops for lower latency and cost. Default (use_subagents=True) preserves existing behavior.
--no-subagents CLI flag: haiku-skills chat --no-subagents runs the TUI in direct mode.
Comprehensive integration tests: VCR-recorded tests exercising all tool types (execute_skill, query_skill, execute_skill_tool, read_skill_resource, run_skill_script) across both execution modes (subagent/direct) and skill sources (entrypoint/filesystem), with AG-UI event and state assertions.
Changed
execute_skill_tool returns raw values: Tool results are passed through as-is instead of being JSON-serialized, consistent with pydantic-ai's ToolReturnContent support.
Fixed
Activity snapshot message_id now stable: Result snapshots share the same message_id as their corresponding call snapshot, so AG-UI frontends update activities in place instead of showing duplicates. Call snapshots use replace=False (create), result snapshots use replace=True (update).
Chat TUI preserves full message history: Tool calls and their results are now retained across turns via pydantic-ai message history, so the LLM no longer re-invokes tools for information it already retrieved.
Spec-compliant skill directory layout: Scripts now live alongside SKILL.md (e.g. web/scripts/search.py) instead of in a separate package-level scripts/ directory
Skill directory renames: Renamed code-execution → codeexecution, image-generation → imagegeneration (skill dirs are now Python packages, which require valid identifiers)
Named CLI flags for scripts: All scripts use argparse with --flag value syntax and support --help. script_tools.py passes named args instead of positional
Gmail extracted into standalone scripts: Auth, helpers, and all 8 operations (search, read, send, reply, draft, list drafts, modify labels, list labels) are now standalone scripts with argparse CLI interfaces. __init__.py is a thin wrapper with state tracking
SKILL.md script documentation: All SKILL.md files now document available scripts with CLI flags and descriptions
CI signature verification: validate-skills workflow now verifies skill signatures (integrity-only)
Documentation reorganized: Replaced quickstart.md, skill-sources.md, and examples.md with a single progressive tutorial.md. Cleaned skills.md into a pure reference page. Removed duplicated state section from ag-ui.md
Added
Skill signing and verification: Identity-based signing via sigstore. Sign skills with sign_skill(), verify with TrustedIdentity on registry/discovery. Install with uv pip install "haiku.skills[signing]"
haiku-skills sign command: Sign a skill directory via CLI with browser-based OIDC or ambient CI credentials
haiku-skills verify command: Verify a signed skill against trusted identities (--identity/--issuer) or check cryptographic integrity only (--unsafe)
Custom event emission from skill tools: SkillRunDeps now has an emit callback that skill tools can use to emit AG-UI BaseEvent subclasses (e.g. CustomEvent) during execution. Events are flushed through the event sink at tool-call boundaries (real-time path) or returned in ToolReturn.metadata (batched path).
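The shape of the pattern, with plain stand-ins (`Deps`, `progress_tool`, and the event dicts are illustrative — the real callback receives AG-UI BaseEvent instances):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Deps:
    state: dict
    emit: Callable[[dict], None]  # injected by the toolset in the real library

def progress_tool(deps: Deps, total: int) -> str:
    # a tool emitting a custom progress event per item processed
    for i in range(total):
        deps.emit({"type": "CUSTOM", "name": "progress", "value": (i + 1) / total})
    return "done"

collected: list[dict] = []
deps = Deps(state={}, emit=collected.append)
progress_tool(deps, total=2)
print([e["value"] for e in collected])  # → [0.5, 1.0]
```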
Changed
code-execution skill: Rewritten from a sync fd-dup hack to async run_monty_async, exposing await llm(prompt) as an external function so sandbox code can make one-shot LLM calls for per-item reasoning (classify, summarize, extract) in loops
Gmail skill (haiku-skills-gmail): Search, read, send, reply, draft, and label Gmail emails via the Google Gmail API with OAuth2 authentication
Notifications skill (haiku-skills-notifications): Send and receive push notifications via ntfy.sh — with send_notification and read_notifications tools, per-skill state tracking, self-hosted server support, and optional bearer token authentication
Removed
Graphiti memory skill (haiku-skills-graphiti-memory): Removed the knowledge graph memory skill and all associated code, tests, and configuration
code-execution skill: Updated pydantic-monty to >=0.0.8 and rewrote the SKILL.md sandbox limitations to reflect new capabilities (math, re, os.environ, getattr, dataclass methods, PEP 448 unpacking)
Sub-agent tool events emitted as ActivitySnapshotEvent instead of ToolCall* events, fixing AG-UI history replay crashes in conforming clients (CopilotKit/soliplex)
_events_to_agui crash on RetryPromptPart: Handle RetryPromptPart results in FunctionToolResultEvent by calling .model_response() instead of .model_response_str(), which doesn't exist on retry parts (#35)
Main agent prompt: Emphasize that skills are isolated agents with no shared context — the main agent must include concrete data when chaining skills and must synthesize skill responses for the user
Missing openai extra in core dependency: pydantic-ai-slim[mcp] → pydantic-ai-slim[mcp,openai] — most users hit ImportError: Please install openai on first use
CLI unusable without [tui] extra: typer and python-dotenv are now lazy-loaded with a friendly error message instead of crashing with ModuleNotFoundError
Independent skill package publishing: Skill packages (haiku-skills-web, etc.) can now be published to PyPI independently from the core package using skills-v* release tags (#27)
Bump script updates skill packages: bump_version.py now updates version and haiku.skills>= dependency constraint in all skills/*/pyproject.toml files
Skill package PyPI metadata: All 4 skill packages now include authors, license, readme, keywords, classifiers, and project URLs
Skill package READMEs: haiku-skills-web, haiku-skills-image-generation, and haiku-skills-code-execution now have READMEs with prerequisites, configuration, tools, and installation instructions
Fixed
Missing core dependencies: ag-ui-protocol and jsonpatch moved from optional [ag-ui] extra to core dependencies — a clean install of haiku.skills no longer fails with ModuleNotFoundError: No module named 'ag_ui'
graphiti-memory recall returns empty results: Switched recall() and forget() from client.search() to client.search_() with BM25 + cosine + BFS graph traversal, RRF reranking, and sim_min_score=0.0 so cosine always returns candidates for BFS to expand on
graphiti-memory cross-encoder crash: _build_cross_encoder() now passes an AsyncOpenAI client directly to OpenAIRerankerClient instead of the graphiti OpenAIGenericClient wrapper, which lacked the .chat attribute the reranker needs
Changed
generate_image returns file path: The image generation tool now returns the file path directly instead of a markdown image reference
Main agent prompt: Instructs the agent to present skill results exactly as returned, without fabricating or rewriting content
discover_from_paths collects all validation errors: Returns tuple[list[Skill], list[SkillValidationError]] instead of raising on the first broken skill — valid skills are still loaded while errors are collected (#25)
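The collect-instead-of-raise pattern, sketched with stand-in types (`ValidationError`, `discover`, and the name-based rule are illustrative, not the library's actual validation):

```python
from pathlib import Path

class ValidationError(ValueError):
    def __init__(self, path: Path, msg: str):
        super().__init__(msg)
        self.path = path  # mirrors SkillValidationError's .path attribute

def discover(paths: list[Path]) -> tuple[list[str], list[ValidationError]]:
    skills, errors = [], []
    for p in paths:
        try:
            if not p.name.isidentifier():  # stand-in validation rule
                raise ValidationError(p, f"invalid skill name: {p.name}")
            skills.append(p.name)
        except ValidationError as e:
            errors.append(e)  # keep going instead of aborting discovery
    return skills, errors

skills, errors = discover([Path("web"), Path("bad-name"), Path("rag")])
print(skills, [str(e) for e in errors])
# → ['web', 'rag'] ['invalid skill name: bad-name']
```

One broken skill no longer hides the valid ones; callers decide whether the errors are fatal.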
SkillRegistry.discover returns errors: Returns list[SkillValidationError] instead of None, propagating errors from discover_from_paths
CLI prints discovery warnings: list and chat commands print validation errors as warnings to stderr instead of aborting
Added
SkillValidationError: ValueError subclass with a .path attribute, exported from haiku.skills
StateMetadata: Frozen dataclass with namespace, type, and schema fields, exported from haiku.skills
Skill.state_metadata(): Returns a StateMetadata for skills that declare state; None otherwise
Real-time sub-agent event streaming: run_agui_stream() merges main-agent and sub-agent AG-UI events into a single stream, so sub-agent tool calls (search, fetch, etc.) appear in real-time instead of batching until execute_skill returns
Changed
Sub-agent output: _run_skill now returns the model's final response (result.output) instead of the last tool's raw return value — state and structured data are already handled via the snapshot/delta mechanism
Event sink on SkillToolset: _run_skill accepts an optional event_sink callback; when active, sub-agent tool events stream through the sink immediately rather than collecting in batch
SkillRunDeps simplified: Removed _collected_events field — event collection is now closure-based inside _run_skill
Graphiti memory skill (haiku-skills-graphiti-memory): Store, recall, and forget memories using a knowledge graph powered by Graphiti and FalkorDB — with per-skill state tracking
Changed
SkillMetadata.allowed_tools accepts strings: Now accepts both str (space-separated) and list[str] as input, always stores list[str] — eliminates conversion overhead for consumers using the spec's string format (#19)
Skill.model accepts Model instances: Widened from str | None to str | Model | None so consumers can pass configured model objects directly (#20)
discover_from_paths accepts single-skill directories: Paths that contain SKILL.md directly are now treated as skill directories, in addition to parent directories containing skill subdirectories. Dot-directories are skipped during child iteration.
Fixed
Ollama base URL handling: resolve_model() now appends /v1 to OLLAMA_BASE_URL instead of expecting it in the env var, consistent with Ollama's convention
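A sketch of the normalization (assuming, as seems natural, that an already-suffixed URL is left alone; `normalize_ollama_base_url` is illustrative, not the library's function):

```python
def normalize_ollama_base_url(base_url: str) -> str:
    # append the /v1 suffix expected by Ollama's OpenAI-compatible endpoint,
    # so OLLAMA_BASE_URL can stay a bare host URL
    base = base_url.rstrip("/")
    return base if base.endswith("/v1") else base + "/v1"

print(normalize_ollama_base_url("http://127.0.0.1:11434"))
# → http://127.0.0.1:11434/v1
```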
Web skill fetch_page for non-HTML content: Pages with non-HTML content types (e.g. plain text, markdown) are now returned directly instead of failing with "could not extract content"
build_system_prompt() utility: Standalone function to build the main agent system prompt from a skill catalog, with optional custom preamble — replaces SkillToolset.system_prompt property
Changed
Entrypoint skill priority: Skills passed via skills= now take priority over entrypoint-discovered skills — entrypoints with the same name are silently skipped instead of raising a duplicate error
Sub-agent request limit: Increased from 10 to 20 to allow skills with more complex tool chains to complete
Chat TUI tool call display: Tool call widgets now stream argument updates and show richer descriptions (e.g. execute_skill → web: search for ...)
Removed
SkillToolset.system_prompt: Use build_system_prompt(toolset.skill_catalog) instead
skill_model parameter: SkillToolset accepts skill_model to set the model for skill sub-agents (also available as --skill-model CLI option)
resolve_model(): Resolves model strings with transparent ollama: prefix handling (defaults to http://127.0.0.1:11434 when OLLAMA_BASE_URL is unset)
run_script tool: Skill sub-agents can execute scripts from the skill's scripts/ directory via a run_script tool, supporting .py, .sh, .js, .ts, and generic executables with path validation
JS/TS script support: run_script dispatches .js files via node and .ts files via npx tsx; extensible via SCRIPT_RUNNERS mapping
Changed
Script tool execution: Scripts are now invoked with CLI positional arguments (sys.argv + print()) instead of JSON on stdin/stdout, matching standard CLI conventions and enabling compatibility with external skill scripts
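The convention in miniature — arguments arrive via sys.argv and results go to stdout (the inline script is a toy, not one of the bundled skill scripts):

```python
import subprocess
import sys

script = "import sys; print(' '.join(sys.argv[1:]).upper())"
proc = subprocess.run(
    [sys.executable, "-c", script, "hello", "world"],
    capture_output=True, text=True,
)
print(proc.stdout.strip())  # → HELLO WORLD
```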
Resilient script discovery: discover_script_tools() now skips scripts without a main() function (with a warning) instead of crashing
Fixed
Script failure error reporting: Script error messages now include stdout when stderr is empty, so usage messages and other stdout-based errors are visible to the sub-agent
Script sibling imports: run_script and typed script tools now set PYTHONPATH to the skill directory so scripts can use package-style imports (e.g. from scripts.utils import ...)
AG-UI state restoration: SkillToolset now restores skill namespace state from frontend-provided deps.state on each AG-UI request, so state survives server restarts
Removed
RAG skill package (haiku-skills-rag): Moved to haiku.rag
haiku-skills validate command: Validate skill directories against the Agent Skills specification using skills-ref
Unknown frontmatter rejection: SkillMetadata now rejects unknown fields (extra="forbid")
skills-ref dependency: Reference implementation used for spec-compliant validation
Changed
Distributable skill directory layout: SKILL.md moved into a subdirectory matching the skill name (e.g. haiku_skills_web/web/SKILL.md) so all bundled skills pass directory-name validation
haiku-skills list command: List discovered skills with name and description, supports -s/--skill-path and --use-entrypoints
--skill / -k option for chat: Filter which skills to activate by name (repeatable)
RAG skill package (haiku-skills-rag): Search, retrieve and analyze documents via haiku.rag with tools for hybrid search, document listing/retrieval, QA with citations, and code-execution analysis
Web skill package (haiku-skills-web): Web search via Brave Search API and page content extraction via trafilatura (replaces haiku-skills-brave-search)
Per-skill state: Skills can declare a state_type (Pydantic BaseModel) and state_namespace; state is passed to tool functions via RunContext[SkillRunDeps] and tracked per namespace on the toolset
AG-UI protocol: SkillToolset emits StateDeltaEvent (JSON Patch) when skill execution changes state, compatible with the AG-UI protocol
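A StateDeltaEvent carries RFC 6902 JSON Patch operations scoped under the skill's namespace; the shape can be illustrated with a minimal replace-only applier (real clients should use a full implementation such as jsonpatch — the namespace and field names here are hypothetical):

```python
def apply_patch(state: dict, ops: list[dict]) -> dict:
    # minimal "replace"-only applier, for illustration only
    for op in ops:
        assert op["op"] == "replace"
        *parents, leaf = op["path"].strip("/").split("/")
        target = state
        for key in parents:
            target = target[key]
        target[leaf] = op["value"]
    return state

state = {"web": {"last_query": None}}
delta = [{"op": "replace", "path": "/web/last_query", "value": "pydantic-ai"}]
print(apply_patch(state, delta))  # → {'web': {'last_query': 'pydantic-ai'}}
```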
State API on SkillToolset: build_state_snapshot(), restore_state_snapshot(), get_namespace(), state_schemas
In-process tools with state: Distributable skills (web, image-generation, code-execution, rag) converted from script-based to in-process tool functions that can read and write per-skill state
Changed
Skills fully loaded at discovery: Instructions, script tools, and resources are loaded when skills are discovered, removing the separate activation step
Chat TUI rewritten as AG-UI client: Uses AGUIAdapter event stream instead of polling; inline state delta display and a "View state" modal via the command palette
Skill name validation: Now accepts unicode lowercase alphanumeric characters per the Agent Skills specification (previously ASCII-only)
SkillRegistry: Central registry for skill discovery, loading, lookup, and activation
Progressive disclosure: Three-level progressive disclosure — metadata at startup, instructions on activation, resources on demand
Sub-agent delegation: Each skill runs in a focused sub-agent with its own system prompt and tools via execute_skill
SkillToolset: FunctionToolset integration that exposes skills as tools for any pydantic-ai Agent
Script tools: Python scripts in scripts/ with main() function get AST-parsed into typed pydantic-ai Tool objects with automatic parameter schema extraction
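The extraction step can be sketched with the standard library's ast module (the real extraction also derives a full parameter schema and defaults; the sample script is hypothetical):

```python
import ast

source = '''
def main(query: str, max_results: int = 5) -> str:
    """Search the web."""
    return query
'''

tree = ast.parse(source)
fn = next(n for n in tree.body
          if isinstance(n, ast.FunctionDef) and n.name == "main")
params = [(a.arg, ast.unparse(a.annotation)) for a in fn.args.args]
print(params)                 # → [('query', 'str'), ('max_results', 'int')]
print(ast.get_docstring(fn))  # → Search the web.
```

Parsing (rather than importing) means a script's parameters can be discovered without executing any of its top-level code.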
Resource reading: Skills can expose files (references, assets, templates) as resources; sub-agents read them on demand via read_resource tool with path validation and traversal defense
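The traversal defense amounts to resolving the requested path and refusing anything that escapes the skill directory (`safe_resource_path` and the paths below are illustrative, not the library's implementation):

```python
from pathlib import Path

def safe_resource_path(skill_dir: Path, requested: str) -> Path:
    resolved = (skill_dir / requested).resolve()
    if not resolved.is_relative_to(skill_dir.resolve()):
        raise PermissionError(f"path escapes skill directory: {requested}")
    return resolved

skill_dir = Path("skills/web")
print(safe_resource_path(skill_dir, "references/spec.md").name)  # → spec.md
try:
    safe_resource_path(skill_dir, "../../etc/passwd")
except PermissionError:
    print("blocked")  # → blocked
```

Resolving before comparing is the important part: it defeats `..` segments and symlink tricks that a plain string-prefix check would miss.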
MCP integration: skill_from_mcp() maps MCP servers directly to skills
Chat TUI: Terminal-based chat interface using Textual
Distributable skill packages: Workspace members for brave-search, image-generation, and code-execution skills