SDD Tools¶

Spec-Driven Development (SDD) Tools is the core workflow engine of Agent Alchemy. It provides a structured pipeline that transforms ideas into specifications, decomposes specifications into executable tasks, and runs autonomous implementation with wave-based parallelism.

Plugin: agent-alchemy-sdd-tools | Version: 0.2.0 | Skills: 4 | Agents: 4

Deep Dive Available

For a comprehensive walkthrough of the SDD pipeline — including end-to-end workflow examples, data flow diagrams, execution context sharing, and architectural deep-dives into each skill — see the SDD Tools Deep Dive.

The SDD Pipeline¶

The SDD pipeline is a four-stage workflow. Each stage produces an artifact that feeds into the next, creating a traceable chain from requirements to working code.

graph LR
    A["/create-spec"] -->|"specs/SPEC-{name}.md"| B["/analyze-spec"]
    B -->|"Quality-checked spec"| C["/create-tasks"]
    C -->|"Claude Code Tasks"| D["/execute-tasks"]
    D -->|"Implementation"| E["Task Manager"]

    style A fill:#7c4dff,color:#fff
    style B fill:#7c4dff,color:#fff
    style C fill:#7c4dff,color:#fff
    style D fill:#7c4dff,color:#fff
    style E fill:#00bcd4,color:#fff

Stage	Skill	Input	Output
1. Specify	`/create-spec`	User interview	`specs/SPEC-{name}.md`
2. Analyze	`/analyze-spec`	Spec file	`.analysis.md` + `.analysis.html`
3. Decompose	`/create-tasks`	Spec file	Claude Code Tasks with metadata
4. Execute	`/execute-tasks`	Task list	Implemented code, session artifacts

TDD Variant (via tdd-tools)

The tdd-tools plugin extends this pipeline with test-first development. After /create-tasks, run /create-tdd-tasks (tdd-tools) to generate RED-GREEN test pairs, then /execute-tdd-tasks (tdd-tools) for TDD-aware execution. See the TDD Tools documentation for details.

Skills¶

`/create-spec` -- Adaptive Interview¶

Creates structured specifications through a multi-round adaptive interview. The interview adjusts its depth, question count, and topic coverage based on the requested detail level.

Invocation:

/agent-alchemy-sdd:create-spec

Depth Levels¶

Level	Rounds	Questions	Focus
High-level overview	2-3	6-10	Problem, goals, key features, success metrics
Detailed specifications	3-4	12-18	Balanced coverage with acceptance criteria
Full technical documentation	4-5	18-25	API endpoints, data models, architecture

Six-Phase Workflow¶

1. Settings Check -- Loads configuration from .claude/agent-alchemy.local.md
2. Initial Inputs -- Gathers spec name, type (new product or new feature), depth, and description
3. Adaptive Interview -- Multi-round depth-aware interview with proactive recommendations
4. Recommendations Round -- Dedicated round for accumulated best-practice suggestions
5. Pre-Compilation Summary -- Presents gathered requirements for user confirmation
6. Spec Compilation -- Generates spec from depth-appropriate template and writes to file

Key Features¶

- Proactive recommendations -- Detects patterns in responses (authentication, scale, compliance) and suggests industry best practices
- Codebase exploration -- For "new feature" type specs, optionally runs deep-analysis from core-tools to understand existing patterns, conventions, and integration points
- External research -- Invokes the researcher agent for on-demand technical documentation lookup or proactive compliance/regulatory research (max 2 proactive calls per interview)
- Early exit -- Gracefully handles requests to wrap up early, generating a Draft (Partial) spec with available information
- Three spec templates -- Dedicated templates for each depth level: high-level.md, detailed.md, full-tech.md

Question Categories

The interview covers four categories: Problem & Goals, Functional Requirements, Technical Specs, and Implementation. The depth level determines how deeply each category is probed.

Output Format¶

Specs follow a structured format with prioritized requirements:

specs/SPEC-User-Auth.md

# User Authentication PRD

**Status**: Draft
**Spec Type**: New feature
**Spec Depth**: Detailed specifications
**Description**: OAuth2-based authentication for the dashboard

## 1. Overview
...

### REQ-001: User Login

**Priority**: P0 (Critical)

**Description**: Users can authenticate via email/password or OAuth2 providers.

**Acceptance Criteria**:
- [ ] Login form validates email format
- [ ] OAuth2 flow completes within 3 seconds
- [ ] Failed attempts are rate-limited after 5 tries
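Because requirements follow a regular shape (a `### REQ-NNN: Title` heading, a `**Priority**` line, and checkbox criteria), a spec in this format can be read mechanically. The following is a minimal sketch of such a reader, not the parser that /create-tasks uses; the `Requirement` class, the regular expressions, and `parse_requirements` are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    title: str
    priority: str = ""
    criteria: list = field(default_factory=list)


# Patterns inferred from the example spec above (assumptions, not a formal grammar).
REQ_HEADING = re.compile(r"^###\s+(REQ-\d+):\s*(.+)$")
PRIORITY = re.compile(r"^\*\*Priority\*\*:\s*(P\d)")
CRITERION = re.compile(r"^- \[[ xX]\]\s+(.+)$")


def parse_requirements(spec_text):
    """Collect each REQ block with its priority and acceptance criteria."""
    requirements = []
    current = None
    for raw in spec_text.splitlines():
        line = raw.strip()
        heading = REQ_HEADING.match(line)
        if heading:
            current = Requirement(heading.group(1), heading.group(2).strip())
            requirements.append(current)
            continue
        if current is None:
            continue  # ignore everything before the first requirement
        priority = PRIORITY.match(line)
        if priority:
            current.priority = priority.group(1)
            continue
        criterion = CRITERION.match(line)
        if criterion:
            current.criteria.append(criterion.group(1).strip())
    return requirements


SAMPLE = """\
### REQ-001: User Login

**Priority**: P0 (Critical)

**Acceptance Criteria**:
- [ ] Login form validates email format
- [ ] OAuth2 flow completes within 3 seconds
- [ ] Failed attempts are rate-limited after 5 tries
"""

reqs = parse_requirements(SAMPLE)
print(reqs[0].req_id, reqs[0].priority, len(reqs[0].criteria))  # REQ-001 P0 3
```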

`/analyze-spec` -- Quality Review¶

Performs a comprehensive quality analysis of an existing spec, checking for inconsistencies, missing information, ambiguities, and structure issues. Generates both a markdown report and an interactive HTML review.

Invocation:

/agent-alchemy-sdd:analyze-spec specs/SPEC-User-Authentication.md

Analysis Categories¶

Category	What It Checks
Inconsistencies	Contradictory requirements, naming mismatches, priority conflicts
Missing Information	Absent sections, undefined terms, features without acceptance criteria
Ambiguities	Vague quantifiers ("fast", "scalable"), ambiguous pronouns, open-ended lists
Structure Issues	Missing sections, misplaced content, formatting inconsistencies
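
To make the Ambiguities category concrete, here is a rough sketch of a vague-quantifier check. It is an illustration only: the analysis is performed by the spec-analyzer agent, not by a word list. The terms "fast" and "scalable" come from the table above; the other terms, the "line contains a number" heuristic, and `find_vague_requirements` are assumptions.

```python
import re

# "fast" and "scalable" are named in the table; "simple" and "robust" are assumed extras.
VAGUE_TERMS = ("fast", "scalable", "simple", "robust")
VAGUE_PATTERN = re.compile(r"\b(" + "|".join(VAGUE_TERMS) + r")\b", re.IGNORECASE)
HAS_NUMBER = re.compile(r"\d")


def find_vague_requirements(spec_text):
    """Flag vague terms on lines that carry no number (a crude proxy for a metric)."""
    findings = []
    for line_no, line in enumerate(spec_text.splitlines(), start=1):
        if HAS_NUMBER.search(line):
            continue  # assume a number means the claim is quantified
        for match in VAGUE_PATTERN.finditer(line):
            findings.append(
                {"line": line_no, "term": match.group(1).lower(), "severity": "Warning"}
            )
    return findings


SPEC = "Search should be fast.\nFast path responds within 200 ms.\nThe service must be Scalable."
findings = find_vague_requirements(SPEC)
print([(f["line"], f["term"]) for f in findings])  # [(1, 'fast'), (3, 'scalable')]
```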

Severity Levels¶

Severity	Meaning	Example
Critical	Would cause implementation to fail	Auth required but no auth spec defined
Warning	Could cause confusion	"Search should be fast" without metrics
Suggestion	Quality improvement	Inconsistent user story formatting

Depth-Aware Analysis¶

The analyzer detects the spec's depth level (high-level, detailed, or full-tech) and only flags issues appropriate to that level. A high-level spec is never penalized for missing API specifications.

Three Review Modes¶

After generating the report, the analyzer offers three review modes:

1. Interactive HTML Review -- Open .analysis.html in a browser, approve/reject findings, then copy a prompt back to apply changes
2. CLI Update Mode -- Walk through each finding interactively with Apply, Modify, or Skip options
3. Reports Only -- Keep the .analysis.md and .analysis.html files without interactive resolution

Output Files¶

- specs/SPEC-{name}.analysis.md -- Structured markdown report with findings
- specs/SPEC-{name}.analysis.html -- Self-contained interactive HTML review page

`/create-tasks` -- Spec-to-Task Decomposition¶

Transforms a specification into dependency-ordered Claude Code Tasks with rich metadata, categorized acceptance criteria, and testing requirements.

Invocation:

/agent-alchemy-sdd:create-tasks specs/SPEC-User-Authentication.md

Eight-Phase Workflow¶

1. Validate & Load -- Reads the spec, loads decomposition references
2. Detect Depth & Check Existing -- Detects spec depth, checks for existing tasks (merge mode)
3. Analyze Spec -- Extracts features, requirements, priorities from each spec section
4. Decompose Tasks -- Breaks features into atomic tasks using the layer pattern
5. Infer Dependencies -- Maps blocking relationships between tasks
6. Preview & Confirm -- Shows a summary and gets user approval
7. Create Tasks -- Creates tasks via TaskCreate/TaskUpdate
8. Error Handling -- Handles parsing issues, circular dependencies, missing information

Task Granularity by Depth¶

Spec Depth	Tasks per Feature	Granularity
High-level	1-2	Feature-level deliverables
Detailed	3-5	Functional decomposition
Full-tech	5-10	Technical decomposition (models, endpoints, middleware)

Layer Decomposition Pattern¶

Each feature is decomposed following a standard layer pattern:

graph TD
    DM["1. Data Model Tasks"] --> API["2. API/Service Tasks"]
    API --> BL["3. Business Logic Tasks"]
    BL --> UI["4. UI/Frontend Tasks"]
    UI --> T["5. Test Tasks"]

    style DM fill:#7c4dff,color:#fff
    style API fill:#7c4dff,color:#fff
    style BL fill:#7c4dff,color:#fff
    style UI fill:#7c4dff,color:#fff
    style T fill:#00bcd4,color:#fff

Task Metadata¶

Every task carries structured metadata for filtering, execution, and traceability:

Field	Description	Example
`priority`	Mapped from spec P0-P3	`critical`, `high`, `medium`, `low`
`complexity`	Estimated size	`XS`, `S`, `M`, `L`, `XL`
`spec_path`	Source specification	`specs/SPEC-Auth.md`
`source_section`	Spec section reference	`5.1 User Authentication`
`feature_name`	Parent feature	`User Authentication`
`task_uid`	Unique ID for merge tracking	`specs/SPEC-Auth.md:user-auth:api-login:001`
`task_group`	Group slug from spec title	`user-authentication`

task_group is Required

The task_group field must be set on every task. The /execute-tasks skill relies on metadata.task_group for --task-group filtering and session ID generation. Tasks without task_group will be invisible to group-filtered execution runs.
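
The effect of a missing task_group can be shown with a small sketch. The task dictionaries below are a hypothetical shape (`id`, `subject`, `metadata`), not the actual Claude Code Task schema, and `filter_by_group` is an illustrative helper; only the metadata field names and example values come from the table above.

```python
def filter_by_group(tasks, group):
    """Keep only tasks whose metadata.task_group equals the requested group."""
    return [t for t in tasks if t.get("metadata", {}).get("task_group") == group]


tasks = [
    {
        "id": "1",
        "subject": "Implement login endpoint",
        "metadata": {
            "priority": "critical",
            "complexity": "M",
            "spec_path": "specs/SPEC-Auth.md",
            "source_section": "5.1 User Authentication",
            "feature_name": "User Authentication",
            "task_uid": "specs/SPEC-Auth.md:user-auth:api-login:001",
            "task_group": "user-authentication",
        },
    },
    {
        # task_group is missing here, so a group-filtered run never sees this task
        "id": "2",
        "subject": "Build login UI",
        "metadata": {"priority": "high", "complexity": "S"},
    },
]

selected = filter_by_group(tasks, "user-authentication")
print([t["id"] for t in selected])  # ['1']
```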

Acceptance Criteria Categories¶

Each task includes categorized acceptance criteria:

Category	What It Covers
Functional	Core behavior, expected outputs, state changes
Edge Cases	Boundaries, empty/null values, max values, concurrent operations
Error Handling	Invalid input, failures, timeouts, graceful degradation
Performance	Response times, throughput, resource limits (when applicable)

Merge Mode¶

When /create-tasks is re-run on an updated spec, the skill matches existing tasks by task_uid and merges according to each task's current status:

Existing Status	Action
`pending`	Updated if spec changed
`in_progress`	Preserved, optionally updated
`completed`	Never modified
New requirement	Created as new task
No longer in spec	Flagged as potentially obsolete
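
The merge rules in the table can be sketched as a pure function over task_uid keys. This is an illustrative model, not the skill's implementation: the action names, the input shapes, and the choice to leave completed tasks untouched even when they disappear from the spec are assumptions.

```python
def plan_merge(existing, spec_tasks):
    """Map each task_uid to a merge action.

    existing:   task_uid -> {"status": ..., "description": ...}
    spec_tasks: task_uid -> description derived from the updated spec
    """
    actions = {}
    for uid, description in spec_tasks.items():
        task = existing.get(uid)
        if task is None:
            actions[uid] = "create"       # new requirement
        elif task["status"] == "completed":
            actions[uid] = "keep"         # completed tasks are never modified
        elif task["status"] == "in_progress":
            actions[uid] = "preserve"     # preserved; an update would be optional
        elif task["description"] != description:
            actions[uid] = "update"       # pending and the spec changed
        else:
            actions[uid] = "keep"         # pending and unchanged
    for uid, task in existing.items():
        if uid not in spec_tasks:
            # assumption: completed work is left alone even if the spec dropped it
            actions[uid] = "keep" if task["status"] == "completed" else "flag_obsolete"
    return actions


existing = {
    "auth:login:001": {"status": "pending", "description": "Login via email"},
    "auth:session:002": {"status": "in_progress", "description": "Session store"},
    "auth:model:003": {"status": "completed", "description": "User model"},
    "auth:legacy:004": {"status": "pending", "description": "Legacy SSO"},
}
spec_tasks = {
    "auth:login:001": "Login via email or OAuth2",
    "auth:session:002": "Session store with expiry",
    "auth:model:003": "User model with roles",
    "auth:reset:005": "Password reset",
}
actions = plan_merge(existing, spec_tasks)
print(actions["auth:login:001"], actions["auth:legacy:004"])  # update flag_obsolete
```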

`/execute-tasks` -- Wave-Based Execution¶

Orchestrates autonomous task execution with dependency-aware wave scheduling, per-task agent isolation, shared execution context, and adaptive verification.

Invocation:

/agent-alchemy-sdd:execute-tasks --task-group user-authentication

Arguments:

Argument	Default	Description
`[task-id]`	--	Execute a single specific task
`--task-group <group>`	--	Filter to tasks matching this group
`--retries <n>`	3	Retry attempts for failed tasks
`--max-parallel <n>`	5	Max concurrent agents per wave

4-Phase Task Workflow¶

Each task is executed by a task-executor agent (Opus model) through four phases:

graph LR
    U["Phase 1: Understand"] --> I["Phase 2: Implement"]
    I --> V["Phase 3: Verify"]
    V --> C["Phase 4: Complete"]

    style U fill:#7c4dff,color:#fff
    style I fill:#7c4dff,color:#fff
    style V fill:#7c4dff,color:#fff
    style C fill:#7c4dff,color:#fff

Phase	What Happens
Understand	Load execution context, classify task, parse acceptance criteria, explore affected files, read `CLAUDE.md` for conventions
Implement	Read target files, make changes following project patterns (data -> service -> interface -> tests), run mid-implementation checks
Verify	Walk acceptance criteria (spec-generated) or infer checklist (general), run tests, determine PASS/PARTIAL/FAIL
Complete	Update task status, write learnings to per-task context file, return structured verification report

Verification Status Rules¶

Condition	Status
All Functional criteria pass + Tests pass	PASS
All Functional pass + Tests pass + Edge/Error/Perf issues	PARTIAL
Any Functional criterion fails	FAIL
Any test failure	FAIL
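
Read as a decision function, the table reduces to a few lines. The sketch below is one way to express it; the function name and the boolean-list inputs are assumptions chosen for illustration.

```python
def verification_status(functional_results, tests_passed, other_results=()):
    """Return PASS, PARTIAL, or FAIL following the rules in the table above.

    functional_results: one bool per Functional criterion
    tests_passed:       True when the test run succeeded
    other_results:      bools for Edge Case, Error Handling, and Performance criteria
    """
    if not tests_passed or not all(functional_results):
        return "FAIL"
    if not all(other_results):
        return "PARTIAL"
    return "PASS"


print(verification_status([True, True], True, [True]))   # PASS
print(verification_status([True, True], True, [False]))  # PARTIAL
print(verification_status([True, False], True, [True]))  # FAIL
print(verification_status([True, True], False, [True]))  # FAIL
```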

Adaptive Verification

The executor detects whether a task is spec-generated (has **Acceptance Criteria:** sections, metadata.spec_path, or Source: references) or a general task. Spec-generated tasks are verified criterion-by-criterion. General tasks use an inferred checklist based on the description.

Wave-Based Execution¶

Tasks are organized into waves based on dependency levels:

graph TD
    subgraph "Wave 1 (no dependencies)"
        A1["Create User model"]
        A2["Create Session model"]
    end
    subgraph "Wave 2 (depends on Wave 1)"
        B1["Implement login endpoint"]
        B2["Implement register endpoint"]
    end
    subgraph "Wave 3 (depends on Wave 2)"
        C1["Build login UI"]
        C2["Add auth middleware"]
    end

    A1 --> B1
    A1 --> B2
    A2 --> B1
    B1 --> C1
    B1 --> C2

- Tasks within a wave run in parallel (up to max_parallel concurrent agents)
- After each wave, newly unblocked tasks form the next wave
- Failed tasks with retries remaining are re-launched immediately within the wave
- Within waves, tasks are sorted by priority (critical > high > medium > low)
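
The wave rules amount to a level-by-level topological sort with a priority ordering inside each level. The sketch below models that idea for the six tasks in the diagram; it is not the orchestrator's code, the input shape and `build_waves` are hypothetical, and the priorities assigned to the example tasks are invented for the demonstration.

```python
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def build_waves(tasks):
    """Group tasks into waves: a task is ready once all of its blockers are done.

    tasks: name -> {"blocked_by": [names], "priority": str}
    """
    remaining = {name: set(info["blocked_by"]) for name, info in tasks.items()}
    done = set()
    waves = []
    while remaining:
        ready = [name for name, blockers in remaining.items() if blockers <= done]
        if not ready:
            raise ValueError("circular or unsatisfiable dependencies: "
                             + ", ".join(sorted(remaining)))
        # Sort by priority first, then by name so the result is deterministic.
        ready.sort(key=lambda n: (PRIORITY_ORDER.get(tasks[n]["priority"], 99), n))
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves


tasks = {
    "Create User model": {"blocked_by": [], "priority": "critical"},
    "Create Session model": {"blocked_by": [], "priority": "high"},
    "Implement login endpoint": {
        "blocked_by": ["Create User model", "Create Session model"],
        "priority": "critical",
    },
    "Implement register endpoint": {"blocked_by": ["Create User model"], "priority": "high"},
    "Build login UI": {"blocked_by": ["Implement login endpoint"], "priority": "medium"},
    "Add auth middleware": {"blocked_by": ["Implement login endpoint"], "priority": "high"},
}

waves = build_waves(tasks)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Retries and the max_parallel cap are left out of the sketch; how the orchestrator handles a wave larger than the cap is not described here.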

Session Management¶

Each execution creates a session directory at .claude/sessions/__live_session__/ containing:

File	Purpose
`execution_plan.md`	Saved wave plan with task assignments
`execution_context.md`	Shared learnings: patterns, decisions, issues, file map
`task_log.md`	Per-task results with status, duration, and token usage
`progress.md`	Real-time progress tracking for the Task Manager
`tasks/`	Subdirectory for archived completed task files

Execution Context Sharing

Each agent writes learnings to an isolated context-task-{id}.md file. After all agents in a wave complete, the orchestrator merges per-task files into the shared execution_context.md. This eliminates write contention while letting later tasks benefit from earlier discoveries.
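
A minimal sketch of the merge step follows, assuming the per-task files sit next to execution_context.md in the session directory. The section heading format, the removal of merged files, and `merge_wave_context` itself are assumptions; the orchestrator's actual merge logic is not shown in this document.

```python
import tempfile
from pathlib import Path


def merge_wave_context(session_dir):
    """Append each context-task-*.md file to execution_context.md; return the count."""
    shared = session_dir / "execution_context.md"
    task_files = sorted(session_dir.glob("context-task-*.md"))
    with shared.open("a", encoding="utf-8") as out:
        for task_file in task_files:
            out.write(f"\n## Learnings from {task_file.stem}\n\n")
            out.write(task_file.read_text(encoding="utf-8").rstrip() + "\n")
            task_file.unlink()  # assumption: merged files are removed to avoid double merges
    return len(task_files)


# Demo in a throwaway directory standing in for the session directory.
with tempfile.TemporaryDirectory() as tmp:
    session = Path(tmp)
    (session / "context-task-1.md").write_text("Models share one base class\n", encoding="utf-8")
    (session / "context-task-2.md").write_text("Endpoints return JSON errors\n", encoding="utf-8")
    merged_count = merge_wave_context(session)
    merged_text = (session / "execution_context.md").read_text(encoding="utf-8")
    leftover = list(session.glob("context-task-*.md"))

print(merged_count)  # 2
```

Only the orchestrator writes the shared file in this model, and it does so after the wave finishes, which is why the agents never contend for it.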

Key Behaviors¶

- Autonomous execution loop -- After the user confirms the plan, the loop runs without interruption
- Configurable parallelism -- Set --max-parallel 1 for sequential execution
- Retry with context -- Each retry includes the previous attempt's failure details
- Interrupted session recovery -- Stale sessions are detected and archived, and their in_progress tasks are reset to pending
- Concurrency guard -- A .lock file prevents multiple execution sessions from running simultaneously
- CLAUDE.md updates -- After execution, meaningful project-wide changes are added to CLAUDE.md

Dependency Inference¶

The /create-tasks skill automatically infers blocking relationships between tasks using several strategies.

Layer-Based Dependencies¶

Higher layers depend on lower layers:

Layer 0: Infrastructure/Config
    |
Layer 1: Data Models
    |
Layer 2: API/Service
    |
Layer 3: Business Logic
    |
Layer 4: UI/Frontend
    |
Layer 5: Integration/E2E Tests

Task Type	Depends On	Blocks
Data Model	Infrastructure tasks	API tasks, Service tasks
API Endpoint	Data Model it uses	UI tasks calling it
UI Component	API endpoint it calls	E2E tests
Unit Test	Implementation it tests	Nothing

Phase Dependencies¶

When the spec defines implementation phases, all tasks in Phase N are blocked by completion of Phase N-1.

Cross-Feature Dependencies¶

Tasks that share data models, services, or authentication requirements are linked through their common dependencies. For example, if both Feature A and Feature B use the User model, both depend on the "Create User model" task.

Keyword-Based Detection¶

Dependency signals are detected from task descriptions:

- "using {Entity}" -- depends on Entity model task
- "calls {endpoint}" -- depends on endpoint task
- "extends {Component}" -- depends on component task
- "requires {Setup}" -- depends on setup task

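The four keyword signals listed above can be approximated with regular expressions. This is a sketch of the idea rather than the skill's detection logic: the patterns, the requirement that entity and component names start with a capital letter, and `detect_dependencies` are all assumptions.

```python
import re

# One pattern per keyword signal; each captures the name that follows the keyword.
PATTERNS = [
    (re.compile(r"\busing (?:the )?([A-Z]\w*)"), "model"),
    (re.compile(r"\bcalls (\S+)"), "endpoint"),
    (re.compile(r"\bextends (?:the )?([A-Z]\w*)"), "component"),
    (re.compile(r"\brequires (?:the )?([A-Z]\w*)"), "setup"),
]


def detect_dependencies(description):
    """Return (kind, name) pairs for each dependency signal found in a description."""
    hits = []
    for pattern, kind in PATTERNS:
        for match in pattern.finditer(description):
            hits.append((kind, match.group(1).rstrip(".,;")))
    return hits


description = "Build the login form using User, calls /api/login, extends BaseForm."
hits = detect_dependencies(description)
print(hits)  # [('model', 'User'), ('endpoint', '/api/login'), ('component', 'BaseForm')]
```
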
Circular Dependency Handling¶

If a circular dependency is detected during task creation or TDD pair insertion, the system breaks the cycle at the weakest link (scored by relationship type) and flags the affected task with needs_review: true in metadata.
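
One way to picture "breaking the cycle at the weakest link" is to score every dependency edge by the strategy that produced it and drop the lowest-scoring edge of each cycle. The sketch below does that; the numeric strengths, the edge representation, and both function names are hypothetical, since the actual scoring is not documented here.

```python
# Hypothetical strengths per relationship type (higher means harder to drop).
STRENGTH = {"phase": 4, "layer": 3, "cross_feature": 2, "keyword": 1}


def find_cycle(edges):
    """Return the edges of one dependency cycle, or None. edges: (task, blocker) -> kind."""
    graph = {}
    for task, blocker in edges:
        graph.setdefault(task, []).append(blocker)
    state = {}  # 1 = on the current DFS path, 2 = fully explored
    path = []

    def visit(node):
        state[node] = 1
        path.append(node)
        for nxt in graph.get(node, []):
            if state.get(nxt) == 1:  # back edge closes a cycle
                nodes = path[path.index(nxt):] + [nxt]
                return list(zip(nodes, nodes[1:]))
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = 2
        return None

    for node in list(graph):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


def break_cycles(edges):
    """Remove the weakest edge of each cycle; return (edges, tasks flagged for review)."""
    edges = dict(edges)
    needs_review = set()
    while (cycle := find_cycle(edges)) is not None:
        weakest = min(cycle, key=lambda edge: STRENGTH[edges[edge]])
        del edges[weakest]
        needs_review.add(weakest[0])  # the task that lost a blocker gets needs_review: true
    return edges, needs_review


edges = {("A", "B"): "layer", ("B", "C"): "keyword", ("C", "A"): "cross_feature"}
acyclic, needs_review = break_cycles(edges)
print(sorted(acyclic), needs_review)  # [('A', 'B'), ('C', 'A')] {'B'}
```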

Agents¶

researcher¶

Property	Value
Model	Opus
Tools	WebSearch, WebFetch, Context7 (resolve-library-id, query-docs)
Used by	`/create-spec`

Researches technical documentation, domain knowledge, competitive landscape, and compliance requirements during the spec interview. Uses Context7 as the primary source for library/framework documentation, falling back to web search for general topics.

spec-analyzer¶

Property	Value
Model	Opus
Tools	AskUserQuestion, Read, Write, Edit, Glob, Grep
Used by	`/analyze-spec`

Performs systematic analysis across four categories (inconsistencies, missing information, ambiguities, structure issues) and guides users through resolving findings interactively. The agent has read-only access to the codebase; its write access is limited to the spec and report files.

codebase-explorer¶

Property	Value
Model	Sonnet
Tools	Read, Glob, Grep, Bash
Used by	`/create-spec` (optional, for new feature type specs)

Explores codebases to discover architecture, patterns, and feature-relevant code during spec creation. Spawned in parallel with specific focus areas when the user creates a spec for a new feature that needs codebase context. This is a lightweight, non-team agent distinct from core-tools' code-explorer — it works independently and returns its findings as a final message without team coordination.

task-executor¶

Property	Value
Model	Opus
Tools	Read, Write, Edit, Glob, Grep, Bash, TaskGet, TaskUpdate, TaskList
Used by	`/execute-tasks`, `/execute-tdd-tasks` (for non-TDD tasks)

Executes a single task autonomously through the 4-phase workflow (Understand, Implement, Verify, Complete). Works without user interaction, writes learnings to per-task context files, and reports honest verification results.

Task Manager Integration¶

The /execute-tasks skill (and /execute-tdd-tasks from tdd-tools) produces session artifacts that integrate with the Agent Alchemy Task Manager -- a real-time Kanban dashboard.

How It Works¶

1. The execution orchestrator writes progress.md to .claude/sessions/__live_session__/ as tasks execute
2. Task status updates flow through Claude Code's native Task system (~/.claude/tasks/)
3. An execution_pointer.md file at ~/.claude/tasks/{list_id}/ links the Task Manager to the active session
4. The Task Manager watches ~/.claude/tasks/ via Chokidar and pushes updates to the browser via SSE

What You See¶

- Task cards move between Pending, In Progress, and Completed columns in real time
- Wave progress shows which execution wave is active
- Session files (execution plan, context, task log) are accessible from the session directory

Running the Task Manager

Start the Task Manager alongside your execution session for real-time visibility:

pnpm dev:task-manager
# Open http://localhost:3030

Configuration¶

Settings are stored in .claude/agent-alchemy.local.md (not committed to version control).

Available Settings¶

Setting	Default	Description
`author`	--	Author name included in spec metadata
`spec-output-path`	`specs/`	Directory for generated spec files
`execute-tasks.max_parallel`	`5`	Max concurrent agents per wave (overridden by `--max-parallel`)

Example Settings File¶

.claude/agent-alchemy.local.md

# Agent Alchemy Settings

## General
- author: Jane Smith
- spec-output-path: docs/specs/

## Execution
- execute-tasks.max_parallel: 3
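
The settings file is a list of `- key: value` lines under headings, so reading it takes only a few lines. The sketch below is an assumption about how such a file could be parsed, not the plugin's loader; `load_settings` and `resolve_max_parallel` are hypothetical helpers, and the defaults are taken from the Available Settings table.

```python
import re

SETTING_LINE = re.compile(r"^-\s*([\w.-]+):\s*(.*)$")
DEFAULTS = {"spec-output-path": "specs/", "execute-tasks.max_parallel": "5"}


def load_settings(text):
    """Read '- key: value' lines; headings and other lines are ignored."""
    settings = dict(DEFAULTS)
    for line in text.splitlines():
        match = SETTING_LINE.match(line.strip())
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


def resolve_max_parallel(settings, cli_value=None):
    """The --max-parallel argument, when given, overrides the settings file."""
    if cli_value is not None:
        return cli_value
    return int(settings["execute-tasks.max_parallel"])


SETTINGS_TEXT = """\
# Agent Alchemy Settings

## General
- author: Jane Smith
- spec-output-path: docs/specs/

## Execution
- execute-tasks.max_parallel: 3
"""

settings = load_settings(SETTINGS_TEXT)
print(settings["author"], resolve_max_parallel(settings))  # Jane Smith 3
print(resolve_max_parallel(settings, cli_value=1))         # 1
```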

Hooks¶

SDD Tools includes a single PreToolUse hook that auto-approves file operations within execution session directories, enabling autonomous task execution without permission prompts.

Hook	Event	Matcher	Timeout
`auto-approve-session.sh`	PreToolUse	`Write|Edit|Bash`	5s

What it approves:

- Write/Edit operations targeting $HOME/.claude/tasks/*/execution_pointer.md
- Write/Edit operations targeting any file inside .claude/sessions/
- Bash commands targeting .claude/sessions/

All other operations pass through to the normal permission flow.
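
The approval rules can be modeled as a small predicate. The real hook is a shell script (auto-approve-session.sh) whose contents are not reproduced here; the Python sketch below only mirrors the three rules listed above. The `file_path` and `command` input keys and the substring matching are assumptions, and `fnmatchcase` is looser than a shell glob because `*` also matches `/`.

```python
from fnmatch import fnmatchcase


def should_auto_approve(tool_name, tool_input, home):
    """Return True when an operation falls under one of the three approval rules."""
    if tool_name in ("Write", "Edit"):
        path = tool_input.get("file_path", "")
        pointer_glob = f"{home}/.claude/tasks/*/execution_pointer.md"
        return fnmatchcase(path, pointer_glob) or ".claude/sessions/" in path
    if tool_name == "Bash":
        return ".claude/sessions/" in tool_input.get("command", "")
    return False  # everything else goes through the normal permission flow


home = "/home/dev"
checks = [
    should_auto_approve("Write", {"file_path": f"{home}/.claude/tasks/abc/execution_pointer.md"}, home),
    should_auto_approve("Edit", {"file_path": ".claude/sessions/__live_session__/progress.md"}, home),
    should_auto_approve("Bash", {"command": "mkdir -p .claude/sessions/__live_session__/tasks"}, home),
    should_auto_approve("Write", {"file_path": "src/app.py"}, home),
    should_auto_approve("Read", {"file_path": ".claude/sessions/x.md"}, home),
]
print(checks)  # [True, True, True, False, False]
```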

Directory Structure¶

sdd-tools/
├── agents/
│   ├── researcher.md              # Technical/domain research agent
│   ├── spec-analyzer.md           # Spec quality analysis agent
│   └── task-executor.md           # Autonomous implementation agent
├── hooks/
│   ├── hooks.json                 # PreToolUse hook configuration
│   └── auto-approve-session.sh    # Session file auto-approve script
└── skills/
    ├── create-spec/
    │   ├── SKILL.md               # Adaptive interview workflow (664 lines)
    │   └── references/
    │       ├── interview-questions.md
    │       ├── recommendation-triggers.md
    │       ├── recommendation-format.md
    │       └── templates/
    │           ├── high-level.md
    │           ├── detailed.md
    │           └── full-tech.md
    ├── analyze-spec/
    │   ├── SKILL.md               # Quality analysis workflow
    │   ├── references/
    │   │   ├── analysis-criteria.md
    │   │   ├── common-issues.md
    │   │   ├── html-review-guide.md
    │   │   └── report-template.md
    │   └── templates/
    │       └── review-template.html
    ├── create-tasks/
    │   ├── SKILL.md               # Task decomposition (653 lines)
    │   └── references/
    │       ├── decomposition-patterns.md
    │       ├── dependency-inference.md
    │       └── testing-requirements.md
    └── execute-tasks/
        ├── SKILL.md               # Execution orchestrator (262 lines)
        └── references/
            ├── orchestration.md
            ├── execution-workflow.md
            └── verification-patterns.md

Quick Reference¶

Common Workflows¶

Standard Pipeline

# 1. Create a spec
/agent-alchemy-sdd:create-spec

# 2. Analyze for quality issues
/agent-alchemy-sdd:analyze-spec specs/SPEC-My-Feature.md

# 3. Generate tasks
/agent-alchemy-sdd:create-tasks specs/SPEC-My-Feature.md

# 4. Execute all tasks
/agent-alchemy-sdd:execute-tasks --task-group my-feature

TDD Pipeline (via tdd-tools)

# 1-3. Same as standard pipeline
/agent-alchemy-sdd:create-spec
/agent-alchemy-sdd:analyze-spec specs/SPEC-My-Feature.md
/agent-alchemy-sdd:create-tasks specs/SPEC-My-Feature.md

# 4. Add TDD test pairs (tdd-tools plugin)
/agent-alchemy-tdd:create-tdd-tasks --task-group my-feature

# 5. Execute with TDD enforcement (tdd-tools plugin)
/agent-alchemy-tdd:execute-tdd-tasks --task-group my-feature

Single Task

# Execute one specific task
/agent-alchemy-sdd:execute-tasks 5

# Execute with extra retries
/agent-alchemy-sdd:execute-tasks 5 --retries 5

Re-run After Spec Update

# Merge mode: updates pending, preserves completed
/agent-alchemy-sdd:create-tasks specs/SPEC-My-Feature.md

# Execute newly created/updated tasks
/agent-alchemy-sdd:execute-tasks --task-group my-feature
