Build a full product org with Claude agents — from business problem to deployed software.
Before the phases, the tools, and the agent definitions — this is the mental model that makes all of it work.
As you add agents under you, your job shifts from execution to direction. You're no longer the one writing the PRD, designing the architecture, or reviewing the PR — you're the one who defines the problem clearly enough that agents can do those things well. Agents multiply your output, but only as far as your intent is clear.
There's an old adage from the early days of computing: garbage in, garbage out. It's never been more relevant. The limiting factor in an agent-powered team isn't the technology — it's how precisely you can define the problem, the output, and the quality bar. A vague brief produces vague output, every time. The practical test: if you wouldn't hand this brief to a new hire and expect them to succeed without follow-up questions, the agent will struggle too.
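As a concrete illustration, a brief that passes that test might look like the sketch below — the feature, metrics, and paths are invented for the example:

```markdown
## Brief: Saved-search alerts (example — all specifics invented)

**Problem:** Users re-run the same search daily to check for new listings.
**Output:** An opt-in email alert when a saved search has new results.
**Quality bar:** Alert latency under 15 minutes; zero duplicate emails;
unsubscribe in one click. Done = acceptance tests in /specs/qa/ pass.
**Non-goals:** Push notifications, digest scheduling.
```

Note that "done" is defined by a verifiable artifact, not by the agent's own judgment.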
Each agent is a Markdown file with a YAML frontmatter header defining its role, tools, model, and system prompt.
```
# Project-level agents
.claude/
  agents/
    product-manager.md
    engineering-manager.md
    solution-architect.md
    product-designer.md
    business-analyst.md
    backend-engineer.md
    frontend-engineer.md
    qa-engineer.md
    devops-engineer.md
    security-reviewer.md
    contrarian-reviewer.md   # ← new — see Phase 02B
```
```markdown
---
name: product-manager
description: Senior PM. Transforms business objectives into PRDs, user stories, and acceptance criteria. Invoke at project start and for scope changes.
tools: Read, Write, WebSearch
model: opus
permissionMode: default
---

You are a senior product manager. Given a business problem and objectives, you:

1. Write a structured PRD with goals, non-goals, user personas, and success metrics
2. Decompose into prioritized epics and user stories with acceptance criteria
3. Flag risks and dependencies before handoff
4. Output all artifacts to /specs/product/
```
Agent definitions are your org's engineering handbook and role descriptions. The more precise they are — including what paths they own, what they output, and what success looks like — the less drift you get during execution.
| Agent | Layer | Primary Output | Model |
|---|---|---|---|
| Product Manager | Spec & Strategy | PRD, user stories, acceptance criteria | Opus |
| Solution Architect | Spec & Strategy | Tech spec, API contracts, data models | Opus |
| Product Designer | Spec & Strategy | UX spec, component map, design brief | Sonnet |
| Business Analyst | Spec & Strategy | Requirements doc, test plan inputs | Sonnet |
| Eng. Manager | Spec & Strategy | Work breakdown, capacity plan | Sonnet |
| Backend Engineer | Build & Deploy | API code, services, migrations | Sonnet |
| Frontend Engineer | Build & Deploy | UI components, client logic | Sonnet |
| QA Engineer | Build & Deploy | Test cases, integration tests, bug reports | Sonnet |
| DevOps Engineer | Build & Deploy | CI/CD pipeline, infra config | Sonnet |
| Security Reviewer | Build & Deploy | Vulnerability report, fixes | Opus |
| Contrarian Reviewer | Both layers | Rejection rationale or approval | Opus |
Structure your pipeline as sequential handoffs with parallel execution branches. Each phase gates the next.
```
Business Problem
       │
       ▼
   PM Agent ──────────────────▶ Architect Agent
                    │
   ┌────────────────┼────────────────┐
   ▼                ▼                ▼
Designer       BA / QA Spec      Infra Plan
   └────────────────┼────────────────┘
                    │
            Eng. Team Lead
   ┌────────────────┼────────────────┐
   ▼                ▼                ▼
Backend         Frontend          DevOps
   └────────────────┼────────────────┘
                    │
                QA Agent
                    │
            Security Review
                    │
                ▶ Deploy
```
| Stage | Agent(s) | Inputs | Outputs |
|---|---|---|---|
| Discovery | PM Agent | Business problem doc | PRD, user stories |
| Architecture | Architect | PRD | Tech spec, API contracts, data models |
| Design | Designer + BA | PRD + Tech spec | UX spec, component map, test plan |
| Build | Backend + Frontend | All specs | Working code, unit tests |
| QA | QA Agent | Acceptance criteria | Integration tests, bug reports |
| Security | Security Reviewer | Codebase | Vulnerability report, fixes |
| Deploy | DevOps Agent | Tested build | CI/CD pipeline, live deployment |
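A top-level orchestration prompt that enforces these gates might read as follows — a sketch, not a fixed syntax; the input path is illustrative:

```markdown
Run the pipeline from /specs/business-problem.md (example path):

1. PM Agent → PRD. STOP for my approval before continuing.
2. Architect → tech spec from the approved PRD. STOP for approval.
3. Designer + BA in parallel, from the PRD and tech spec.
4. Build phase starts only once every upstream artifact exists on disk
   in /specs/ — no phase may begin from a verbal summary.
```

The explicit STOP points are what keep the pipeline gated rather than free-running.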
The parent orchestrator talks only to a few top-level agents, keeping its own context clean. A Feature Lead agent receives a brief, decomposes it into tasks, and spawns its own sub-agents to execute them. The parent never sees those details. This mirrors how real engineering orgs work — the VP of Engineering doesn't assign tasks to individual engineers; work flows through layers of leads.
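A minimal Feature Lead definition in that spirit might look like this — a sketch; the name, tool list, and wording are illustrative, not a canonical definition:

```markdown
---
name: feature-lead
description: Mid-layer lead. Receives a feature brief from the orchestrator, decomposes it, and delegates to engineer sub-agents. Reports only summaries upward.
tools: Read, Write, Task
model: sonnet
---

You are a feature lead. Given a feature brief:
1. Break it into tasks, each with clearly owned paths.
2. Delegate each task to the appropriate engineer sub-agent.
3. Report upward only status, blockers, and a one-paragraph summary —
   never raw sub-agent transcripts.
```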
Without a contrarian agent, you've assembled a team of yes-men. Every proposal gets endorsed — not because it's good, but because no one's job is to reject it.
A well-run research team always has someone whose job is to tear down weak hypotheses — not to be difficult, but out of rigor. The same principle applies here. Add an agent whose express role is to pressure-test the others' work before it moves to the next phase.
The contrarian's effectiveness depends on one rule: it must engage with the strongest version of the proposal, not the weakest. This is the difference between steel-manning and straw-manning. If the best possible version of an idea still fails scrutiny, it's a genuine non-starter.
```markdown
---
name: contrarian-reviewer
description: Adversarial reviewer. Invoke after any major artifact (PRD, tech spec, design, build) before phase handoff. Steel-mans first, then finds fatal flaws.
tools: Read, Grep, WebSearch
model: opus
permissionMode: default
---

You are an adversarial reviewer. Your job is not to agree. For each artifact you receive:

1. Steel-man it — present its strongest possible form, addressing obvious objections preemptively.
2. Attack that strongest version — find fatal flaws even in the best-case scenario.
3. Respond with one of:
   APPROVED: [reason it survives scrutiny]
   REJECTED: [numbered list of fatal flaws]

Rules:
- Never straw-man. Engage with the best version only.
- "Unless you have a better idea" applies — rejection must include what would need to be true to approve.
- Maximum 5 iterations before escalating to a human.
- Output your review to /specs/reviews/
```
Agent Teams is Claude Code's experimental built-in feature for true parallel execution. One session leads; teammates work independently in their own context windows.
```shell
# Add to your environment or settings.json
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```
```
Given the PRD at /specs/product/prd.md and tech spec at
/specs/technical/arch.md:

Spawn the following teammates:
- backend-engineer: owns /src/api/ and /src/services/
- frontend-engineer: owns /src/ui/ and /src/components/
- qa-engineer: owns /tests/ — writes tests from acceptance
  criteria before code is written (TDD)
- devops-engineer: owns /infra/ and CI/CD pipeline

Coordinate through the shared task list. Backend and frontend
can work in parallel. QA writes tests first. DevOps unblocks
last. No teammate touches another's owned paths.
```

If the terminal closes, active teammates are lost — there is currently no session resumption for in-process Agent Teams. For long-running tasks, run inside tmux, screen, or a cloud VM so you don't lose teammate state mid-sprint.
Your agents need to know exactly where to read from and write to. Structure your repo so there's no ambiguity about ownership or output location.
```
/specs/
  product/
    prd.md              ← PM agent output
    user-stories.md
  technical/
    architecture.md     ← Architect agent output
    api-contracts.md
    data-models.md
  design/
    ux-spec.md          ← Designer agent output
    component-map.md
  qa/
    test-plan.md        ← QA agent reads from acceptance criteria
    test-cases.md
  reviews/
    contrarian-log.md   ← Contrarian agent output
/CLAUDE.md              ← Global context all agents load automatically
```

Every agent loads CLAUDE.md automatically at session start. It is the single most important file in the project. Its quality directly determines your output quality.
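A minimal CLAUDE.md in that spirit might contain entries like these — the product, stack, and paths below are illustrative:

```markdown
# Project context (loaded by every agent)

- Product: internal listings dashboard (example)
- Stack: TypeScript, PostgreSQL, deployed via GitHub Actions
- Specs live in /specs/ — read your phase's inputs before writing anything
- Write output only to the paths your agent definition owns
- Tests are the definition of done; see /specs/qa/test-plan.md
```

Keep it short: every agent pays the token cost of this file on every session.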
Have your QA agent write test cases from the acceptance criteria before engineers start coding. This gives engineers a deterministic success signal — green tests mean done. It also removes scope ambiguity and the "looks good to me" trap during review.
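For instance, a QA-authored test derived from a hypothetical acceptance criterion ("duplicate emails are rejected at signup") gives the engineer an unambiguous target. Everything below is an invented sketch — `register_user` stands in for whatever implementation later turns the test green:

```python
# Hypothetical AC-3: "duplicate emails are rejected at signup".
# QA writes the test first; the engineer's only job is to make it pass.

def register_user(registered: set, email: str) -> bool:
    """Minimal stand-in implementation the engineer writes to go green."""
    if email in registered:
        return False  # duplicate — rejected per AC-3
    registered.add(email)
    return True

def test_duplicate_email_rejected():
    registered = set()
    assert register_user(registered, "a@example.com") is True   # first signup succeeds
    assert register_user(registered, "a@example.com") is False  # duplicate rejected

test_duplicate_email_rejected()
```

"Green tests mean done" only works if the tests encode the acceptance criteria this directly.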
Agents can operate against your actual tooling — Linear, Slack, GitHub, Notion — through MCP server connections defined once in your project settings.
| Agent | MCP Tools | Actions |
|---|---|---|
| PM Agent | Linear, Notion | Creates epics and issues, writes PRDs to Notion |
| Architect | GitHub, Notion | Creates ADRs, opens architecture docs |
| QA Agent | Linear, Slack | Opens bug tickets, posts test results to Slack |
| DevOps Agent | GitHub, Slack | Triggers Actions, posts deploy status |
| Security Reviewer | Linear, Slack | Opens CVE tickets, alerts security channel |
| Contrarian Reviewer | Linear, Notion | Logs review decisions, blocks handoff tickets on rejection |
```json
// .claude/settings.json
{
  "mcpServers": {
    "linear": { "type": "url", "url": "https://mcp.linear.app/mcp" },
    "slack": { "type": "url", "url": "https://mcp.slack.com/mcp" },
    "github": { "type": "url", "url": "https://api.githubcopilot.com/mcp/" },
    "notion": { "type": "url", "url": "https://mcp.notion.com/mcp" }
  }
}
```
MCP servers defined in your project settings are automatically available to all agent teammates — no extra configuration per agent required.
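You can still restrict which MCP tools an individual agent may use via its frontmatter `tools` list; MCP tool names follow Claude Code's `mcp__<server>__<tool>` pattern, though the specific tool names below are illustrative assumptions:

```markdown
---
name: qa-engineer
tools: Read, Write, Bash, mcp__linear__create_issue, mcp__slack__post_message
model: sonnet
---
```

Scoping tools per agent keeps a QA agent from, say, triggering a deploy through a GitHub tool it was never meant to hold.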
Things that will determine your success before you write a single agent definition.
Your job is to set the stage, not to play every instrument. Spend your time on the brief, the quality bar, and the review checkpoint — not on the execution. The agents handle mechanics; you handle judgment.
These are the equivalent of your engineering handbook and role descriptions. Vague personas produce vague output. Specify owned paths, output formats, and success criteria before you run anything.
QA writes tests from acceptance criteria before engineers start. Green tests means done. This removes ambiguity and gives engineers a clear, verifiable target.
Don't run the full pipeline unattended at first. Gate on: (1) PRD approved, (2) architecture approved, (3) first working build. Tighten autonomy as you build trust in each agent's outputs.
Agent Teams use significantly more tokens than a single session. Start with 3–4 agents on a scoped feature before running the full org chart. Claude Max or Team plans are recommended for sustained pipeline runs.
Unlike single-session Claude Code, there are fewer chances to redirect mid-task. Ambiguous prompts at the start can cascade into hours of compute doing the wrong thing.
Agent Teams don't survive terminal closure. Use tmux, screen, or a cloud VM for long-running pipelines so you don't lose teammate state mid-sprint.
1. Install Claude Code (`npm install -g @anthropic-ai/claude-code`), then create your /specs/ directory structure and a CLAUDE.md.
2. Define your first agents: product-manager.md, solution-architect.md, and backend-engineer.md. Keep definitions tight and specific.
3. Add CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 to your environment.
4. Add contrarian-reviewer.md, then QA, frontend, and DevOps.
Claude Code's native Agent Teams is the right starting point. As you scale, the community has built orchestration layers worth exploring.
| Tool | Best For | Notes |
|---|---|---|
| Claude Code Agent Teams | Getting started, 3–8 agents | Native, no extra install, experimental |
| Multiclaude | Team usage with PR review gates | Go-based, multiplayer support |
| Gas Town | Solo devs, hobby projects | More complex, better for single-operator use |
| Ruflo / Claude Flow | Enterprise orchestration | 300+ MCP tools, self-learning routing |
| VS Code Multi-Agent | In-editor workflow | Claude + Codex + Copilot side-by-side |
The patterns that work today — context separation, shared task lists, peer-to-peer communication, TDD handoffs, and adversarial review — are foundational and will only grow more powerful as the tooling matures. Start small, run a real pilot, and build from there.