SDD Tools Deep Dive¶

A comprehensive walkthrough of the Spec-Driven Development pipeline — from adaptive interview through wave-based autonomous execution. This deep dive covers internal architecture, data flow patterns, execution context sharing, and end-to-end workflow examples that go beyond the SDD Tools reference page.

Plugin: agent-alchemy-sdd-tools | Version: 0.2.0 | Skills: 4 | Agents: 4

Executive Summary¶

The sdd-tools plugin implements a complete Spec-Driven Development (SDD) pipeline for Claude Code. It transforms the development process from ad-hoc prompting into a structured workflow: idea → spec → tasks → execution. The plugin is fully standalone (no external plugin dependencies) and provides 4 skills, 4 agents, and a lifecycle hook, enabling developers to go from a product idea to working code through an automated, verification-driven pipeline.

What is Spec-Driven Development?¶

Spec-Driven Development is a methodology where:

Requirements are captured formally before any code is written
Specifications are structured documents with testable acceptance criteria
Tasks are derived algorithmically from specs, with automatic dependency inference
Implementation is verified against spec-defined acceptance criteria
The spec is the single source of truth throughout the development lifecycle

This contrasts with the typical AI-assisted development pattern where users describe features in natural language and the AI generates code directly — often losing requirements, skipping edge cases, and producing code that's hard to verify.

graph LR
    A["💡 Idea"] --> B["📋 Spec"]
    B --> C["🔍 Analysis"]
    C --> D["✅ Tasks"]
    D --> E["⚡ Execution"]
    E --> F["✔️ Verified Code"]

    style A fill:#7c4dff,color:#fff
    style B fill:#7c4dff,color:#fff
    style C fill:#f44336,color:#fff
    style D fill:#4caf50,color:#fff
    style E fill:#ff9800,color:#fff
    style F fill:#00bcd4,color:#fff

Plugin Architecture¶

Directory Structure¶

sdd-tools/
├── agents/
│   ├── codebase-explorer.md    # Codebase exploration (Sonnet)
│   ├── researcher.md           # External research (Opus)
│   ├── spec-analyzer.md        # Spec quality analysis (Opus)
│   └── task-executor.md        # Task implementation (Opus)
├── hooks/
│   ├── hooks.json              # PreToolUse hook configuration
│   └── auto-approve-session.sh # Session directory auto-approve
├── skills/
│   ├── create-spec/
│   │   ├── SKILL.md            # Interview workflow (660 lines)
│   │   └── references/
│   │       ├── codebase-exploration.md
│   │       ├── interview-questions.md
│   │       ├── recommendation-triggers.md
│   │       ├── recommendation-format.md
│   │       └── templates/
│   │           ├── high-level.md
│   │           ├── detailed.md
│   │           └── full-tech.md
│   ├── analyze-spec/
│   │   ├── SKILL.md            # Analysis workflow
│   │   ├── references/
│   │   │   ├── analysis-criteria.md
│   │   │   ├── common-issues.md
│   │   │   ├── html-review-guide.md
│   │   │   └── report-template.md
│   │   └── templates/
│   │       └── review-template.html
│   ├── create-tasks/
│   │   ├── SKILL.md            # Task decomposition (653 lines)
│   │   └── references/
│   │       ├── decomposition-patterns.md
│   │       ├── dependency-inference.md
│   │       └── testing-requirements.md
│   └── execute-tasks/
│       ├── SKILL.md            # Execution orchestrator (262 lines)
│       ├── references/
│       │   ├── execution-workflow.md
│       │   ├── orchestration.md
│       │   └── verification-patterns.md
│       └── scripts/
│           └── poll-for-results.sh
└── README.md

Component Summary¶

Component	Count	Description
Skills	4	create-spec, analyze-spec, create-tasks, execute-tasks
Agents	4	codebase-explorer, researcher, spec-analyzer, task-executor
Hooks	1	auto-approve-session (PreToolUse)
Reference files	14	Question banks, templates, criteria, patterns
Spec templates	3	High-level, detailed, full-tech

The SDD Pipeline¶

The complete SDD pipeline flows through four skills in sequence. Each skill produces artifacts that feed into the next.

flowchart TD
    subgraph Phase1["Phase 1: Specification"]
        CS["/create-spec"]
        CS -->|"writes"| SPEC["specs/SPEC-{name}.md"]
    end

    subgraph Phase2["Phase 2: Quality Gate"]
        AS["/analyze-spec"]
        SPEC -->|"reads"| AS
        AS -->|"writes"| REPORT["specs/{name}.analysis.md"]
        AS -->|"writes"| HTML["specs/{name}.analysis.html"]
        AS -->|"may update"| SPEC
    end

    subgraph Phase3["Phase 3: Decomposition"]
        CT["/create-tasks"]
        SPEC -->|"reads"| CT
        CT -->|"creates"| TASKS["~/.claude/tasks/{list}/*.json"]
    end

    subgraph Phase4["Phase 4: Execution"]
        ET["/execute-tasks"]
        TASKS -->|"reads"| ET
        ET -->|"spawns"| AGENTS["task-executor agents × N"]
        AGENTS -->|"writes"| CODE["Implementation + Tests"]
        AGENTS -->|"writes"| CTX[".claude/sessions/__live_session__/"]
        ET -->|"updates"| TASKS
    end

    subgraph External["External: Real-Time Monitoring"]
        TASKS -->|"watched by"| TM["Task Manager Dashboard"]
    end

    style CS fill:#7c4dff,color:#fff
    style SPEC fill:#00bcd4,color:#fff
    style AS fill:#7c4dff,color:#fff
    style REPORT fill:#00bcd4,color:#fff
    style HTML fill:#00bcd4,color:#fff
    style CT fill:#7c4dff,color:#fff
    style TASKS fill:#00bcd4,color:#fff
    style ET fill:#7c4dff,color:#fff
    style AGENTS fill:#ff9800,color:#fff
    style CODE fill:#4caf50,color:#fff
    style CTX fill:#4caf50,color:#fff
    style TM fill:#00bcd4,color:#fff

Pipeline Artifacts¶

Phase	Input	Output	Format
create-spec	User interview answers	`specs/SPEC-{name}.md`	Structured markdown PRD
analyze-spec	Spec file	`.analysis.md` + `.analysis.html`	Report + interactive HTML
create-tasks	Spec file	Task JSON files	Claude Code native tasks
execute-tasks	Task list	Code changes + session artifacts	Source code + execution logs

Skill 1: create-spec -- Adaptive Interview¶

Purpose¶

Transforms a product idea into a structured specification through an adaptive, multi-round interview process. The skill adjusts its questioning depth, provides proactive recommendations, and can explore the existing codebase for context.

Workflow (6 Phases)¶

flowchart TD
    P1["Phase 1: Settings Check"] --> P2["Phase 2: Initial Inputs"]
    P2 --> |"Name, Type, Depth, Description"| P3["Phase 3: Adaptive Interview"]

    P3 --> |"For 'new feature' type"| CE["Codebase Exploration"]
    CE --> |"findings"| P3

    P3 --> |"Trigger detected"| RES["External Research"]
    RES --> |"findings"| P3

    P3 --> |"2-5 rounds"| P4["Phase 4: Recommendations Round"]
    P4 --> P5["Phase 5: Pre-Compilation Summary"]
    P5 --> |"User confirms"| P6["Phase 6: Spec Compilation"]
    P6 --> SPEC["specs/SPEC-{name}.md"]

    style P3 fill:#7c4dff,color:#fff
    style CE fill:#ff9800,color:#fff
    style RES fill:#7c4dff,color:#fff
    style SPEC fill:#4caf50,color:#fff

Depth Levels¶

The interview adapts based on the requested depth level:

Level	Rounds	Questions	Focus	Output
High-level overview	2-3	6-10	Problem, goals, key features, success metrics	Executive summary
Detailed specifications	3-4	12-18	Balanced coverage, acceptance criteria, technical constraints	Standard PRD
Full technical documentation	4-5	18-25	Deep probing, API endpoints, data models, performance	Comprehensive tech spec

Question Categories¶

Each interview round covers four categories (depth-adjusted):

Problem & Goals -- Problem statement, success metrics, user personas, business value
Functional Requirements -- Features, user stories, acceptance criteria, workflows
Technical Specs -- Architecture, tech stack, data models, APIs, constraints
Implementation -- Phases, dependencies, risks, out-of-scope items

Proactive Features¶

Recommendation triggers: Scans user responses for patterns that suggest best-practice recommendations (e.g., mentioning "auth" triggers authentication pattern suggestions)
External research: Can invoke the researcher agent for technical documentation, competitive analysis, or compliance requirements
Codebase exploration: For "new feature" type specs, spawns codebase-explorer agents (Sonnet) in parallel to discover existing architecture, patterns, and integration points
Early exit support: Users can wrap up early; spec is marked as Draft (Partial)

Recommendation Triggers

The interview skill monitors user responses for keyword patterns (e.g., "authentication", "scale", "real-time", "compliance") and proactively surfaces best-practice recommendations. This bridges the gap between user intent and implementation knowledge.

Spec Templates¶

Three templates matched to depth levels:

Template	File	Use Case
High-level	`references/templates/high-level.md`	Executive summaries, stakeholder alignment
Detailed	`references/templates/detailed.md`	Standard development specs
Full-tech	`references/templates/full-tech.md`	API specs, data models, architecture

Skill 2: analyze-spec -- Quality Gate¶

Purpose¶

Performs systematic quality analysis on an existing spec, identifying inconsistencies, missing information, ambiguities, and structure issues. Provides both a markdown report and an interactive HTML review interface.

Analysis Categories¶

Category	What It Catches	Example
Inconsistencies	Internal contradictions	Feature named "Search" in one section, "Find" in another
Missing Information	Expected content absent for depth level	Full-tech spec with no API definitions
Ambiguities	Vague or multi-interpretable statements	"Users should be able to search quickly"
Structure Issues	Formatting and organization problems	Missing required sections, orphaned references

Severity Levels¶

Severity	Definition	Example
Critical	Would cause implementation failure	Circular dependencies, undefined core requirements
Warning	Could cause confusion or problems	Vague acceptance criteria, unnamed dependencies
Suggestion	Quality improvement, not blocking	Inconsistent formatting, missing glossary

Output Formats¶

Markdown report ({name}.analysis.md) -- Structured findings with severity, location, and recommendations
Interactive HTML review ({name}.analysis.html) -- Browser-based UI for approving/rejecting findings with copy-prompt workflow

Review Modes¶

flowchart TD
    A["Spec Analyzed"] --> B{"Choose Review Mode"}
    B --> C["Interactive HTML Review"]
    B --> D["CLI Update Mode"]
    B --> E["Reports Only"]

    C --> F["Open in Browser"]
    F --> G["Approve/Reject Findings"]
    G --> H["Copy Prompt"]
    H --> I["Paste Back → Apply Changes"]

    D --> J["Walk Through Each Finding"]
    J --> K{"Apply / Modify / Skip"}
    K --> |"Apply"| L["Edit spec directly"]
    K --> |"Modify"| M["User provides text → Edit"]
    K --> |"Skip"| N["Record reason, move on"]

    E --> O["Keep reports as-is"]

    style C fill:#7c4dff,color:#fff
    style D fill:#ff9800,color:#fff
    style E fill:#455a64,color:#fff

Depth-Aware Analysis

The analyzer detects the spec's depth level (high-level, detailed, or full-tech) and only flags issues appropriate to that level. A high-level spec is never penalized for missing API specifications.

Skill 3: create-tasks -- Spec Decomposition¶

Purpose¶

Transforms a specification into a dependency-ordered set of Claude Code native Tasks, each with categorized acceptance criteria, testing requirements, and metadata for tracking.

Workflow (8 Phases)¶

flowchart TD
    P1["Phase 1: Validate & Load"] --> P2["Phase 2: Detect Depth & Check Existing"]
    P2 --> P3["Phase 3: Analyze Spec"]
    P3 --> P4["Phase 4: Decompose Tasks"]
    P4 --> P5["Phase 5: Infer Dependencies"]
    P5 --> P6["Phase 6: Preview & Confirm"]
    P6 --> |"User approves"| P7{"Existing tasks?"}
    P7 --> |"No"| P7A["Phase 7a: Fresh Create"]
    P7 --> |"Yes"| P7B["Phase 7b: Merge Mode"]
    P7A --> P8["Phase 8: Report"]
    P7B --> P8

    style P4 fill:#7c4dff,color:#fff
    style P5 fill:#ff9800,color:#fff
    style P7B fill:#f44336,color:#fff

Task Decomposition Pattern¶

Each feature is decomposed using a standard layer pattern:

flowchart TD
    F["Feature from Spec"] --> DM["1. Data Model Tasks"]
    F --> API["2. API/Service Tasks"]
    F --> BL["3. Business Logic Tasks"]
    F --> UI["4. UI/Frontend Tasks"]
    F --> TEST["5. Test Tasks"]

    DM --> |"blocks"| API
    API --> |"blocks"| UI
    BL --> |"blocks"| TEST

    style DM fill:#4caf50,color:#fff
    style API fill:#7c4dff,color:#fff
    style BL fill:#ff9800,color:#fff
    style UI fill:#f44336,color:#fff
    style TEST fill:#00bcd4,color:#fff

Depth-Based Granularity¶

Spec Depth	Tasks per Feature	Granularity	Example
High-level	1-2	Feature-level	"Implement user authentication"
Detailed	3-5	Functional decomposition	"Implement login endpoint", "Add password validation"
Full-tech	5-10	Technical decomposition	"Create User model", "Implement POST /auth/login", "Add auth middleware"

Task Structure¶

Each generated task includes:

subject: "Create User data model"              # Imperative mood
description: |
  {What needs to be done}

  **Acceptance Criteria:**

  _Functional:_
  - [ ] Core behavior criteria

  _Edge Cases:_
  - [ ] Boundary condition criteria

  _Error Handling:_
  - [ ] Error scenario criteria

  _Performance:_ (if applicable)
  - [ ] Performance target criteria

  **Testing Requirements:**
  - Unit: Schema validation
  - Integration: Database persistence

  Source: specs/SPEC-Auth.md Section 7.3
activeForm: "Creating User data model"
metadata:
  priority: critical|high|medium|low
  complexity: XS|S|M|L|XL
  spec_path: "specs/SPEC-Auth.md"
  feature_name: "User Authentication"
  task_uid: "specs/SPEC-Auth.md:user-auth:model:001"
  task_group: "user-authentication"

task_group is Required

The task_group field must be set on every task. The /execute-tasks skill relies on metadata.task_group for --task-group filtering and session ID generation. Tasks without task_group will be invisible to group-filtered execution runs.

Merge Mode¶

When re-running on an updated spec, tasks are intelligently merged:

flowchart TD
    RE["Re-run /create-tasks"] --> MATCH{"Match by task_uid"}
    MATCH --> |"Match found"| STATUS{"Task status?"}
    STATUS --> |"completed"| PRESERVE["Preserve — never modify"]
    STATUS --> |"in_progress"| SKIP["Preserve status, optionally update description"]
    STATUS --> |"pending"| UPDATE["Update description if changed"]
    MATCH --> |"No match (new)"| CREATE["Create new task"]
    MATCH --> |"No match (existing)"| OBSOLETE{"Potentially obsolete"}
    OBSOLETE --> KEEP["Keep if user confirms"]
    OBSOLETE --> MARK["Mark completed if user confirms"]

    style PRESERVE fill:#4caf50,color:#fff
    style SKIP fill:#ff9800,color:#fff
    style UPDATE fill:#7c4dff,color:#fff
    style CREATE fill:#1565c0,color:#fff
    style OBSOLETE fill:#f44336,color:#fff

Dependency Inference¶

Dependencies are automatically inferred from three sources:

Layer dependencies: Data Model → API → UI → Tests
Phase dependencies: Phase 2 tasks blocked by Phase 1 completion
Explicit spec dependencies: Section 10 of spec ("requires X" → blockedBy X)
Cross-feature dependencies: Shared data models, services, auth

Circular Dependency Handling

If a circular dependency is detected during task creation, the system breaks the cycle at the weakest link (scored by relationship type) and flags the affected task with needs_review: true in metadata. See the SDD Tools reference for the full inference rules.

Skill 4: execute-tasks -- Autonomous Execution¶

Purpose¶

Orchestrates autonomous task execution with wave-based parallelism, session management, shared execution context, and adaptive verification. After user confirmation, it runs without further interaction until all tasks are complete.

Core Principles¶

Understand before implementing -- Read context, conventions, and earlier task learnings
Follow existing patterns -- Match the codebase's coding style and conventions
Verify against criteria -- Walk through each acceptance criterion, run tests
Report honestly -- PASS only when all Functional criteria and tests pass

Orchestration Loop (10 Steps)¶

flowchart TD
    S1["Step 1: Load Task List"] --> S2["Step 2: Validate State"]
    S2 --> S3["Step 3: Build Execution Plan"]
    S3 --> S4["Step 4: Check Settings"]
    S4 --> S5["Step 5: Initialize Session"]
    S5 --> S6["Step 6: Present Plan & Confirm"]
    S6 --> |"User confirms"| S7["Step 7: Initialize Context"]
    S7 --> S8["Step 8: Execute Loop"]
    S8 --> S9["Step 9: Session Summary"]
    S9 --> S10["Step 10: Update CLAUDE.md"]

    subgraph ExecuteLoop["Step 8: Wave Execution Loop"]
        W1["Snapshot execution_context.md"] --> W2["Mark tasks in_progress"]
        W2 --> W3["Launch N background agents"]
        W3 --> W4["Poll for result files"]
        W4 --> W5{"All complete?"}
        W5 --> |"No"| W4
        W5 --> |"Yes"| W6["Batch-read results"]
        W6 --> W7["Reap agents via TaskOutput"]
        W7 --> W8{"Failed tasks with retries?"}
        W8 --> |"Yes"| W9["Re-launch as background agents"]
        W9 --> W4
        W8 --> |"No"| W10["Merge context files"]
        W10 --> W11["Refresh TaskList"]
        W11 --> W12{"More waves?"}
        W12 --> |"Yes"| W1
        W12 --> |"No"| DONE["Exit loop"]
    end

    S8 --> ExecuteLoop

    style W3 fill:#7c4dff,color:#fff
    style W9 fill:#f44336,color:#fff
    style DONE fill:#4caf50,color:#fff

Wave-Based Parallelism¶

Tasks are organized into waves using topological sort:

flowchart LR
    subgraph Wave1["Wave 1 (No Dependencies)"]
        T1["Task 1: Create User model"]
        T2["Task 2: Create Config model"]
    end

    subgraph Wave2["Wave 2 (Depends on Wave 1)"]
        T3["Task 3: Implement /auth/login"]
        T4["Task 4: Implement /auth/register"]
    end

    subgraph Wave3["Wave 3 (Depends on Wave 2)"]
        T5["Task 5: Build Login UI"]
        T6["Task 6: Add auth middleware"]
    end

    subgraph Wave4["Wave 4 (Depends on Wave 3)"]
        T7["Task 7: Integration tests"]
    end

    T1 --> T3
    T1 --> T4
    T2 --> T3
    T3 --> T5
    T3 --> T6
    T4 --> T5
    T5 --> T7
    T6 --> T7

    style T1 fill:#4caf50,color:#fff
    style T2 fill:#4caf50,color:#fff
    style T3 fill:#7c4dff,color:#fff
    style T4 fill:#7c4dff,color:#fff
    style T5 fill:#ff9800,color:#fff
    style T6 fill:#ff9800,color:#fff
    style T7 fill:#00bcd4,color:#fff

Wave Scheduling

Tasks within a wave run in parallel (up to max_parallel concurrent agents). After each wave completes, newly unblocked tasks form the next wave. Within waves, tasks are sorted by priority (critical > high > medium > low).

Task Executor 4-Phase Workflow¶

Each task is executed by a task-executor agent (Opus) through:

flowchart LR
    P1["Phase 1\nUnderstand"] --> P2["Phase 2\nImplement"]
    P2 --> P3["Phase 3\nVerify"]
    P3 --> P4["Phase 4\nComplete"]

    P1 -.- N1["Read context\nClassify task\nExplore codebase\nPlan implementation"]
    P2 -.- N2["Read target files\nWrite code\nWrite tests\nRun linter"]
    P3 -.- N3["Check criteria\nRun tests\nDetermine status"]
    P4 -.- N4["Update task status\nWrite learnings\nWrite result file"]

    style P1 fill:#7c4dff,color:#fff
    style P2 fill:#4caf50,color:#fff
    style P3 fill:#ff9800,color:#fff
    style P4 fill:#00bcd4,color:#fff

Verification Status¶

Condition	Status	What Happens
All Functional criteria pass + Tests pass	PASS	Task marked `completed`
All Functional pass + Tests pass + Edge/Error/Perf issues	PARTIAL	Task stays `in_progress`, may retry
Any Functional criterion fails	FAIL	Task stays `in_progress`, retry with failure context
Any test failure	FAIL	Task stays `in_progress`, retry with failure context

Adaptive Verification

The executor detects whether a task is spec-generated (has **Acceptance Criteria:** sections, metadata.spec_path, or Source: references) or a general task. Spec-generated tasks are verified criterion-by-criterion. General tasks use an inferred checklist based on the description.

Session Management¶

flowchart TD
    INIT["Initialize Session"] --> DIR[".claude/sessions/__live_session__/"]
    DIR --> EP["execution_plan.md"]
    DIR --> EC["execution_context.md"]
    DIR --> TL["task_log.md"]
    DIR --> PR["progress.md"]
    DIR --> TD["tasks/ (archived JSONs)"]
    DIR --> LOCK[".lock (concurrency guard)"]

    POINTER["~/.claude/tasks/{list}/execution_pointer.md"] --> DIR

    COMPLETE["Session Complete"] --> ARCHIVE[".claude/sessions/{execution_id}/"]
    DIR --> |"move contents"| ARCHIVE

    style DIR fill:#7c4dff,color:#fff
    style POINTER fill:#ff9800,color:#fff
    style ARCHIVE fill:#4caf50,color:#fff

Key Execution Features¶

Feature	Description
Background agents	Agents run via `run_in_background: true`, returning ~3 lines instead of full output
Result file protocol	Each agent writes a compact `result-task-{id}.md` (~18 lines) as completion signal
Per-task context isolation	Each agent writes to `context-task-{id}.md`, orchestrator merges after wave
Configurable parallelism	Default 5 concurrent agents; overridable via `--max-parallel`
Configurable retries	Default 3 attempts; overridable via `--retries`
Retry with context	Failed tasks include previous failure details for different approach
Interrupted session recovery	Stale sessions archived; in_progress tasks reset to pending
Concurrency guard	`.lock` file prevents concurrent execution sessions
Token usage tracking	Per-task `duration_ms` and `total_tokens` extracted via TaskOutput

Result File Protocol

The result file protocol is a key optimization. Instead of consuming the full agent output (which can be thousands of tokens), the orchestrator polls for compact result-task-{id}.md files (~18 lines each). This achieves a 79% context reduction per wave — critical for keeping the orchestrator within its context window across many waves.

Agent Inventory¶

flowchart TD
    subgraph Agents["SDD Tools Agents"]
        CE["codebase-explorer\n(Sonnet)"]
        R["researcher\n(Opus)"]
        SA["spec-analyzer\n(Opus)"]
        TE["task-executor\n(Opus)"]
    end

    CS["/create-spec"] --> |"spawns for 'new feature'"| CE
    CS --> |"spawns for research"| R
    AS["/analyze-spec"] --> |"launches"| SA
    ET["/execute-tasks"] --> |"launches × N per wave"| TE

    CE --> |"Read, Glob, Grep, Bash"| CODEBASE["Codebase"]
    R --> |"WebSearch, WebFetch, Context7"| WEB["External Sources"]
    SA --> |"AskUserQuestion, Read, Write, Edit"| SPEC["Spec Files"]
    TE --> |"Read, Write, Edit, Glob, Grep, Bash"| CODE["Source Code"]

    style CE fill:#7c4dff,color:#fff
    style R fill:#00bcd4,color:#fff
    style SA fill:#f44336,color:#fff
    style TE fill:#4caf50,color:#fff

Agent	Model	Tools	Role	Spawned By
codebase-explorer	Sonnet	Read, Glob, Grep, Bash	Explores codebase for patterns and architecture	`/create-spec` (parallel, for "new feature" type)
researcher	Opus	WebSearch, WebFetch, Context7	Technical and domain research for specs	`/create-spec` (on-demand or proactive)
spec-analyzer	Opus	AskUserQuestion, Read, Write, Edit, Glob, Grep	Quality analysis with interactive resolution	`/analyze-spec`
task-executor	Opus	Read, Write, Edit, Glob, Grep, Bash, TaskGet/Update/List	Autonomous 4-phase task implementation	`/execute-tasks` (N per wave, background)

Model Tiering Rationale

Sonnet for codebase-explorer: These agents perform broad, parallelizable search work. Sonnet is cost-effective for exploration where reasoning depth is less critical than breadth. Opus for researcher, spec-analyzer, task-executor: These agents require deep reasoning — synthesizing research findings, analyzing spec quality, and implementing code with verification.

Hooks & Automation¶

auto-approve-session.sh¶

Property	Value
Event	`PreToolUse`
Triggers	Write, Edit, Bash operations
Purpose	Auto-approves file operations within `.claude/sessions/` directories
Timeout	5 seconds

Why This Hook Exists

This hook enables task-executor agents to write execution context files, result files, and session artifacts without requiring user approval for each operation. Without it, every file write during autonomous execution would pause for user confirmation — breaking the autonomous execution loop.

End-to-End Workflow Walkthrough¶

Example: Building a User Authentication Feature¶

sequenceDiagram
    participant U as Developer
    participant CS as /create-spec
    participant CE as codebase-explorer
    participant AS as /analyze-spec
    participant SA as spec-analyzer
    participant CT as /create-tasks
    participant ET as /execute-tasks
    participant TE as task-executor × N
    participant TM as Task Manager

    Note over U,TM: Phase 1: Specification

    U->>CS: /create-spec
    CS->>U: What type? "New feature"
    CS->>U: What depth? "Detailed"
    CS->>CE: Explore auth patterns (Sonnet × 2)
    CE-->>CS: Architecture findings
    CS->>U: Interview rounds (3-4 rounds, 12-18 questions)
    CS->>U: Recommendations round
    CS->>U: Pre-compilation summary — confirm?
    U->>CS: Confirmed
    CS-->>U: specs/SPEC-User-Auth.md created

    Note over U,TM: Phase 2: Quality Gate (Optional)

    U->>AS: /analyze-spec specs/SPEC-User-Auth.md
    AS->>SA: Analyze spec (Opus)
    SA-->>AS: 8 findings (2 critical, 4 warning, 2 suggestion)
    AS->>U: Choose review mode
    U->>AS: CLI Update Mode
    AS->>U: Walk through each finding
    U->>AS: Apply 6, Skip 2
    AS-->>U: Spec updated, report saved

    Note over U,TM: Phase 3: Decomposition

    U->>CT: /create-tasks specs/SPEC-User-Auth.md
    CT->>CT: Detect depth: Detailed
    CT->>CT: Extract features, decompose, infer dependencies
    CT->>U: Preview: 15 tasks, 22 dependencies
    U->>CT: Confirmed
    CT-->>U: 15 tasks created with dependency chains
    CT-->>TM: Tasks visible in Kanban board

    Note over U,TM: Phase 4: Execution

    U->>ET: /execute-tasks --task-group user-authentication
    ET->>ET: Build wave plan: Wave 1 (3 tasks), Wave 2 (5), Wave 3 (4), Wave 4 (3)
    ET->>U: Execution plan — confirm?
    U->>ET: Confirmed

    loop Each Wave
        ET->>TE: Launch N background agents
        TE->>TE: Understand → Implement → Verify → Complete
        TE-->>ET: result-task-{id}.md (PASS/PARTIAL/FAIL)
        ET->>ET: Merge context, form next wave
        ET-->>TM: Task status updates (real-time)
    end

    ET-->>U: Session summary: 14 PASS, 1 PARTIAL

Step-by-Step¶

/create-spec -- Developer initiates spec creation. The skill asks about type ("new feature"), depth ("detailed"), and runs a 3-4 round interview. For new features, it spawns codebase explorers to understand existing patterns. It produces specs/SPEC-User-Auth.md.
/analyze-spec specs/SPEC-User-Auth.md (optional but recommended) -- The spec is analyzed for quality issues. The developer reviews findings via CLI or HTML interface, fixing critical issues before task generation.
/create-tasks specs/SPEC-User-Auth.md -- The spec is decomposed into 15 dependency-ordered tasks. Each task has categorized acceptance criteria (Functional, Edge Cases, Error Handling, Performance), testing requirements, and metadata. The developer previews and confirms.
/execute-tasks --task-group user-authentication -- The orchestrator builds a wave plan and launches background task-executor agents in parallel. Each agent reads the execution context, implements the task, verifies against acceptance criteria, and reports results. The Task Manager dashboard shows real-time progress.

Data Flow Diagrams¶

Artifact Flow Through the Pipeline¶

flowchart TD
    subgraph Inputs
        USER["User's Idea"]
        CODEBASE["Existing Codebase"]
    end

    subgraph SpecPhase["Specification Phase"]
        INTERVIEW["Interview Answers"]
        EXPLORE["Exploration Findings"]
        RESEARCH["Research Findings"]
        RECS["Recommendations"]
    end

    subgraph Artifacts
        SPEC["SPEC-{name}.md"]
        ANALYSIS["Analysis Report + HTML"]
        TASKS["Task JSON Files"]
        CONTEXT["Execution Context"]
        CODE["Implemented Code"]
        LOGS["Session Logs"]
    end

    USER --> INTERVIEW
    CODEBASE --> EXPLORE
    INTERVIEW --> SPEC
    EXPLORE --> SPEC
    RESEARCH --> SPEC
    RECS --> SPEC
    SPEC --> ANALYSIS
    ANALYSIS --> |"fixes"| SPEC
    SPEC --> TASKS
    TASKS --> CONTEXT
    CONTEXT --> CODE
    CODE --> LOGS

    style SPEC fill:#7c4dff,color:#fff,stroke-width:3px
    style TASKS fill:#4caf50,color:#fff,stroke-width:3px
    style CODE fill:#ff9800,color:#fff,stroke-width:3px

flowchart TD
    subgraph Wave1["Wave 1"]
        A1["Agent 1"] --> |"writes"| C1["context-task-1.md"]
        A2["Agent 2"] --> |"writes"| C2["context-task-2.md"]
    end

    subgraph Merge1["Between Waves"]
        C1 --> EC["execution_context.md"]
        C2 --> EC
    end

    subgraph Wave2["Wave 2"]
        EC --> |"snapshot read"| A3["Agent 3"]
        EC --> |"snapshot read"| A4["Agent 4"]
        A3 --> |"writes"| C3["context-task-3.md"]
        A4 --> |"writes"| C4["context-task-4.md"]
    end

    subgraph Merge2["After Wave 2"]
        C3 --> EC2["execution_context.md\n(merged)"]
        C4 --> EC2
    end

    style EC fill:#7c4dff,color:#fff,stroke-width:2px
    style EC2 fill:#7c4dff,color:#fff,stroke-width:2px

Context Isolation Pattern

Each agent writes to an isolated context-task-{id}.md file during execution. After all agents in a wave complete, the orchestrator merges per-task files into the shared execution_context.md. This eliminates write contention while letting later tasks benefit from earlier discoveries — a pattern also used by the core-tools deep-analysis skill.

Use Cases & Benefits¶

Use Cases¶

Use Case	How SDD Tools Helps
Greenfield feature development	Structured spec → decomposed tasks → parallel autonomous execution
Complex multi-component features	Dependency inference ensures correct build order; wave parallelism maximizes throughput
Team alignment	Spec serves as single source of truth; analyze-spec catches ambiguities before coding starts
Iterative spec refinement	Merge mode preserves completed work when specs evolve; analyze-spec provides quality gate
Compliance-sensitive projects	Research agent gathers regulatory requirements; specs document acceptance criteria for audit
Reducing rework	Verification against acceptance criteria catches issues before they compound
Onboarding new team members	Specs document the "why" behind features; execution context captures implementation decisions

Benefits for Developers¶

Benefit	Without SDD Tools	With SDD Tools
Requirements capture	Ad-hoc prompts, lost context	Structured spec with testable criteria
Task planning	Manual decomposition	Automatic dependency-aware decomposition
Parallel execution	Sequential, one task at a time	Wave-based concurrent agent execution
Verification	Manual review or trust	Automated criterion-by-criterion verification
Knowledge sharing	Each task starts from scratch	Shared execution context across tasks
Progress visibility	Checking git log	Real-time Task Manager dashboard
Spec evolution	Start over or manual diff	Merge mode preserves completed work
Quality assurance	Post-hoc review	Pre-implementation spec analysis

Integration with Other Plugins¶

Standalone Design¶

sdd-tools is a standalone plugin -- it has no external plugin dependencies. This was achieved by giving sdd-tools its own codebase-explorer agent instead of relying on core-tools.

Consumed By Other Plugins¶

Plugin	How It Uses sdd-tools
tdd-tools	`/execute-tdd-tasks` routes non-TDD tasks to the `task-executor` agent from sdd-tools
tdd-tools	`/create-tdd-tasks` reads tasks created by `/create-tasks` and generates TDD pairs

TDD Extension

The SDD pipeline integrates seamlessly with the TDD Tools plugin. After /create-tasks, run /create-tdd-tasks to generate RED-GREEN test pairs, then /execute-tdd-tasks for TDD-aware execution. See the TDD Tools documentation for the full TDD workflow.

Integration with Task Manager¶

The Task Manager dashboard provides real-time visualization:

flowchart LR
    ET["/execute-tasks"] --> |"creates/updates"| JSON["~/.claude/tasks/*.json"]
    JSON --> |"Chokidar watches"| FW["FileWatcher"]
    FW --> |"EventEmitter"| SSE["SSE Route"]
    SSE --> |"stream"| CLIENT["Browser"]
    CLIENT --> |"invalidateQueries"| TQ["TanStack Query"]
    TQ --> KB["Kanban Board"]

    style ET fill:#ff9800,color:#fff
    style JSON fill:#4caf50,color:#fff
    style KB fill:#7c4dff,color:#fff

Configuration & Settings¶

Settings are configured in .claude/agent-alchemy.local.md (not committed):

Setting	Type	Default	Description
`execute-tasks.max-parallel`	number	5	Maximum concurrent agents per wave
Custom output path	string	`specs/`	Directory for spec output
Author name	string	--	Attribution in spec metadata

Command-Line Arguments¶

Skill	Arguments	Description
`/create-spec`	(none)	Starts interactive interview
`/analyze-spec`	`[spec-path]`	Path to spec file
`/create-tasks`	`[spec-path]`	Path to spec file
`/execute-tasks`	`[task-id] [--task-group <group>] [--retries <n>] [--max-parallel <n>]`	Flexible execution control

Reference File Inventory¶

Skill	File	Purpose	Contents
create-spec	`interview-questions.md`	Question bank	Questions organized by category and depth
create-spec	`recommendation-triggers.md`	Trigger patterns	Keyword patterns for proactive recommendations
create-spec	`recommendation-format.md`	Recommendation templates	How to present recommendations to users
create-spec	`codebase-exploration.md`	Exploration procedure	4-step codebase exploration workflow
create-spec	`templates/high-level.md`	Spec template	Streamlined executive overview
create-spec	`templates/detailed.md`	Spec template	Standard PRD with all sections
create-spec	`templates/full-tech.md`	Spec template	Extended with API specs, data models
analyze-spec	`analysis-criteria.md`	Depth checklists	What to check at each depth level
analyze-spec	`common-issues.md`	Issue patterns	Known issue patterns with examples
analyze-spec	`report-template.md`	Report format	Markdown report structure
analyze-spec	`html-review-guide.md`	HTML generation	Instructions for HTML review output
create-tasks	`decomposition-patterns.md`	Decomposition rules	Feature-to-task decomposition patterns
create-tasks	`dependency-inference.md`	Dependency rules	Automatic dependency inference logic
create-tasks	`testing-requirements.md`	Test mappings	Task type → test type mappings
execute-tasks	`orchestration.md`	Orchestration loop	Full 10-step execution procedure
execute-tasks	`execution-workflow.md`	Phase workflow	4-phase agent workflow details
execute-tasks	`verification-patterns.md`	Verification rules	Task classification and pass/fail criteria

SDD Tools Deep Dive¶

Executive Summary¶

What is Spec-Driven Development?¶

Plugin Architecture¶

Directory Structure¶

Component Summary¶

The SDD Pipeline¶

Pipeline Artifacts¶

Skill 1: create-spec -- Adaptive Interview¶

Purpose¶

Workflow (6 Phases)¶

Depth Levels¶

Question Categories¶

Proactive Features¶

Spec Templates¶

Skill 2: analyze-spec -- Quality Gate¶

Purpose¶

Analysis Categories¶

Severity Levels¶

Output Formats¶

Review Modes¶

Skill 3: create-tasks -- Spec Decomposition¶

Purpose¶

Workflow (8 Phases)¶

Task Decomposition Pattern¶

Depth-Based Granularity¶

Task Structure¶

Merge Mode¶

Dependency Inference¶

Skill 4: execute-tasks -- Autonomous Execution¶

Purpose¶

Core Principles¶

Orchestration Loop (10 Steps)¶

Wave-Based Parallelism¶

Task Executor 4-Phase Workflow¶

Verification Status¶

Session Management¶

Key Execution Features¶

Agent Inventory¶

Hooks & Automation¶

auto-approve-session.sh¶

End-to-End Workflow Walkthrough¶

Example: Building a User Authentication Feature¶

Step-by-Step¶

Data Flow Diagrams¶

Artifact Flow Through the Pipeline¶

Execution Context Sharing¶

Use Cases & Benefits¶

Use Cases¶

Benefits for Developers¶

Integration with Other Plugins¶

Standalone Design¶

Consumed By Other Plugins¶

Integration with Task Manager¶

Configuration & Settings¶

Command-Line Arguments¶

Reference File Inventory¶