# Template 07 — First Agent Launch Checklist

This is the pre-flight checklist to run before you deploy your first agent to real users or automate it against real workloads. It covers five phases, from initial definition through post-launch monitoring: Define, Configure, Test, Deploy, and Observe.

Work through the phases in order. Each phase builds on the one before it. If you find a gap in an earlier phase while working on a later one, stop, fix the earlier phase, and re-test before continuing.

This checklist assumes you have read chapters 1–14 of the guidebook and have set your `ANTHROPIC_API_KEY`.

---

## How to Use This Template

- Work through each item top to bottom, marking `[x]` when complete.
- For any item that references another template (T01–T06), use that template to complete the work before checking the box.
- The "Owner" column is for teams — fill in who is responsible for each item.
- Run this checklist again any time you make a significant change to your agent's system prompt, tools, or environment.

---

## Phase 1 — Define (What the agent is)

These items ensure your agent has a clear, complete definition before you touch any infrastructure.

| # | Item | Done | Owner |
|---|---|---|---|
| 1.1 | **Agent brief written** — You have completed Part 1 of Template 01: agent name, one-line purpose, target user, definition of "done," and hard constraints. | [ ] | [FILL IN] |
| 1.2 | **System prompt drafted** — You have a complete system prompt using the XML structure from Template 02: `<role>`, `<goals>`, `<constraints>`, `<tools_available>`, `<output_format>`, and at least one `<examples>` pair. | [ ] | [FILL IN] |
| 1.3 | **Model chosen** — You have selected a model using the rubric in Template 01. You are not using Opus by default. Your choice is documented. | [ ] | [FILL IN] |
| 1.4 | **Tools whitelist decided** — You have reviewed all 8 tools and made a deliberate yes/no decision for each one, using the whitelist pattern (`default_config: { enabled: false }` + explicit enables). | [ ] | [FILL IN] |
| 1.5 | **Memory decision made** — You have decided whether your agent needs memory. If yes, you have chosen one of the four patterns from Template 05 and planned your store and entry naming conventions. | [ ] | [FILL IN] |
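
Item 1.4 asks for a deliberate yes/no on every tool. One way to enforce that is a small helper that refuses to build a config while any tool is undecided. This is a sketch under stated assumptions: the eight tool names come from the worked example later in this template, `default_config` comes from item 1.4, and the `configs` key and entry shape are hypothetical placeholders rather than the documented schema.

```python
# Tool names as listed in the worked example of this template.
ALL_TOOLS = {"read", "write", "edit", "glob", "grep", "bash", "web_fetch", "web_search"}


def build_whitelist(decisions: dict) -> dict:
    """Build a whitelist-style tools config from explicit yes/no decisions.

    Raises ValueError if any known tool has no decision, or if a decision
    names a tool that is not in ALL_TOOLS.
    """
    undecided = ALL_TOOLS - set(decisions)
    unknown = set(decisions) - ALL_TOOLS
    if undecided or unknown:
        raise ValueError(
            f"undecided tools: {sorted(undecided)}; unknown tools: {sorted(unknown)}"
        )
    return {
        # Everything is off unless explicitly enabled (item 1.4).
        "default_config": {"enabled": False},
        # ASSUMPTION: "configs" and its entry shape are illustrative only.
        "configs": [
            {"name": name, "enabled": True}
            for name in sorted(decisions)
            if decisions[name]
        ],
    }
```

Passing a dictionary with only some tools decided raises an error, which makes a forgotten tool visible at build time instead of in production.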

---

## Phase 2 — Configure (The infrastructure)

These items cover environment setup, agent creation via the API, and resource provisioning.

| # | Item | Done | Owner |
|---|---|---|---|
| 2.1 | **Environment created** — You have run `ant beta:environments create` using the appropriate config from Template 03. You have saved the environment ID. | [ ] | [FILL IN] |
| 2.2 | **Networking scoped** — Your environment uses `limited` networking with an explicit `allowed_hosts` list if this agent runs in production. If using `unrestricted`, you have documented why it is acceptable. | [ ] | [FILL IN] |
| 2.3 | **Packages pre-installed** — Any pip, npm, or apt packages your agent needs are declared in the environment config's `packages` field (not installed at runtime via bash). | [ ] | [FILL IN] |
| 2.4 | **Agent created** — You have run `ant beta:agents create` with your system prompt, model, and tools whitelist. You have saved the agent ID and version number. | [ ] | [FILL IN] |
| 2.5 | **Memory stores provisioned** (if using memory) — You have created your memory store(s) via `client.beta.memory_stores.create()`, saved the store ID(s), and written any seed memories you want present before the first session. | [ ] | [FILL IN] |
| 2.6 | **MCP and Vaults configured** (if using MCP) — You have created a vault, added credentials, and confirmed the MCP server URL is declared in your agent's `mcp_servers` array. You have run a test session and confirmed that no `session.error` events related to MCP auth appear. | [ ] | [FILL IN] |
| 2.7 | **Files uploaded** (if using Files API) — Any static input files your agent needs are uploaded via `client.beta.files.upload()` and their file IDs are stored for use at session creation. | [ ] | [FILL IN] |
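
The networking rule in item 2.2 can be checked mechanically before you create the environment. The sketch below is only illustrative: it assumes the config is a dictionary shaped like `{"networking": {"type": ..., "allowed_hosts": [...]}}`, which is a hypothetical layout. Use the real structure from Template 03.

```python
def networking_issues(env_config: dict, production: bool) -> list:
    """Return human-readable networking problems; an empty list means OK."""
    # ASSUMPTION: settings live under env_config["networking"] with the keys
    # "type" and "allowed_hosts". Adjust to match your Template 03 config.
    net = env_config.get("networking", {})
    mode = net.get("type")
    issues = []
    if mode == "limited":
        # An empty list is acceptable, but it must be stated explicitly.
        if "allowed_hosts" not in net:
            issues.append("limited networking needs an explicit allowed_hosts list")
    elif mode == "unrestricted":
        if production:
            issues.append("unrestricted networking in production: document why it is acceptable")
    else:
        issues.append(f"unrecognized networking type: {mode!r}")
    return issues
```

A config with `limited` networking and an explicit (possibly empty) `allowed_hosts` list returns no issues, which matches the worked example at the end of this template.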

---

## Phase 3 — Test (Does it work?)

These items verify the agent behaves as expected before any real user or system sends it real work.

| # | Item | Done | Owner |
|---|---|---|---|
| 3.1 | **Console prototype tested** — You have tested your system prompt interactively in the Claude Console with at least 3 inputs: a typical case, an edge case, and a case that should trigger your error-handling instructions. | [ ] | [FILL IN] |
| 3.2 | **Happy-path session tested** — You have created a test session via the API (not the Console), sent a `user.message` event, and confirmed the `agent.message` response and the final output match your `<output_format>` specification. | [ ] | [FILL IN] |
| 3.3 | **Error-path tested** — You have tested at least one scenario where the agent should fail gracefully: missing input, an ambiguous task, or a tool returning an error. You have confirmed the agent follows your error-handling instructions rather than hallucinating or hanging. | [ ] | [FILL IN] |
| 3.4 | **Tool calls verified** — You have confirmed via `agent.tool_use` events in the stream that the agent is calling the tools you expect (and only those tools). No tools outside your whitelist were invoked. | [ ] | [FILL IN] |
| 3.5 | **Permission policies tested** (if using `always_ask`) — You have confirmed that your event loop correctly handles `session.status_idle` with `stop_reason: requires_action`, sends `user.tool_confirmation` events, and the session resumes after confirmation. | [ ] | [FILL IN] |
| 3.6 | **Memory round-trip tested** (if using memory) — You have run two sessions back-to-back and confirmed that information written to the memory store in session 1 (e.g., a user preference) is correctly read and used in session 2. | [ ] | [FILL IN] |
| 3.7 | **Output format validated** — If your agent writes files, you have confirmed the file is written to the correct path and the downstream consumer (your application code, a webhook, a human) can read and use it. | [ ] | [FILL IN] |
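
The check in item 3.4 can be automated once you have collected the events from a test session. The following is a minimal sketch that assumes each event is a dictionary with a `type` key and, for `agent.tool_use` events, a `name` key holding the tool name. The `name` key is an assumption for illustration; inspect a real event to confirm the field.

```python
def unexpected_tool_calls(events: list, allowed: set) -> list:
    """Return the sorted names of tools that were called but are not whitelisted."""
    offenders = set()
    for event in events:
        if event.get("type") != "agent.tool_use":
            continue  # Only tool calls matter for this check.
        # ASSUMPTION: the tool name is stored under the "name" key.
        tool = event.get("name", "<unnamed>")
        if tool not in allowed:
            offenders.add(tool)
    return sorted(offenders)


# Example with hand-written events (not real API output):
sample_events = [
    {"type": "agent.tool_use", "name": "glob"},
    {"type": "agent.tool_use", "name": "read"},
    {"type": "agent.message"},
    {"type": "agent.tool_use", "name": "bash"},
]
print(unexpected_tool_calls(sample_events, {"read", "write", "glob"}))  # ['bash']
```

An empty result means every observed tool call was on the whitelist. It does not prove the agent called all the tools you expected, so review the call sequence by eye as well.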

---

## Phase 4 — Deploy (Shipping it)

These items cover the transition from test to production.

| # | Item | Done | Owner |
|---|---|---|---|
| 4.1 | **Agent version pinned** (optional but recommended) — If consistency across sessions matters, you are creating sessions with a pinned agent version: `agent={"type": "agent", "id": agent.id, "version": N}` rather than the latest. | [ ] | [FILL IN] |
| 4.2 | **Session creation code reviewed** — Your session creation code passes the correct `agent`, `environment_id`, and `resources` for each use case. There are no hard-coded IDs that belong to your test environment. | [ ] | [FILL IN] |
| 4.3 | **API key secured** — Your `ANTHROPIC_API_KEY` is in an environment variable or secrets manager, not hard-coded in source files or committed to version control. | [ ] | [FILL IN] |
| 4.4 | **Rate limits reviewed** — You have checked the Managed Agents rate limits: 60 create requests/minute, 600 read requests/minute. If your workload could exceed these, you have a queuing strategy. | [ ] | [FILL IN] |
| 4.5 | **Session pricing estimated** — You have estimated your monthly session runtime cost at $0.08 per session-hour. For example: 10 sessions/day × 1 hour average × 30 days = 300 hours × $0.08 = $24/month in runtime alone, before token costs. This is within your budget. | [ ] | [FILL IN] |
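
The runtime estimate in item 4.5 is simple arithmetic, and a small helper makes it easy to rerun with your own numbers. This sketch uses the $0.08 per session-hour rate quoted above and covers runtime only, not token costs. The function name is illustrative.

```python
from decimal import Decimal

# Runtime rate quoted in item 4.5 (US dollars per session-hour).
RATE_PER_SESSION_HOUR = Decimal("0.08")


def runtime_cost(sessions: int, avg_hours_per_session: float) -> Decimal:
    """Estimate session runtime cost in dollars, rounded to the cent."""
    # Convert via str() so 0.5 becomes exactly Decimal("0.5").
    total_hours = Decimal(str(avg_hours_per_session)) * sessions
    return (total_hours * RATE_PER_SESSION_HOUR).quantize(Decimal("0.01"))


# 10 sessions/day for 30 days at 1 hour each (the example in item 4.5):
print(runtime_cost(10 * 30, 1))    # 24.00
# One 30-minute session per week for a year:
print(runtime_cost(52, 0.5))       # 2.08
```

`Decimal` is used rather than floats so that the printed figures match the hand calculation exactly.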

---

## Phase 5 — Observe (Keeping it running)

These items ensure you can see what your agent is doing and respond when something goes wrong.

| # | Item | Done | Owner |
|---|---|---|---|
| 5.1 | **Event logging in place** — Your event loop logs at minimum: session ID, `agent.tool_use` (tool name), `session.error`, and `session.status_idle` stop reason. Logs are written somewhere you can query them. | [ ] | [FILL IN] |
| 5.2 | **`session.error` handled** — Your code handles `session.error` events (e.g., MCP auth failure) and either retries, alerts you, or fails gracefully — it does not silently continue. | [ ] | [FILL IN] |
| 5.3 | **Token usage tracked** — After each session goes `idle`, you retrieve `session.usage` (`input_tokens`, `output_tokens`, `cache_read_input_tokens`) and log or aggregate it for cost tracking. | [ ] | [FILL IN] |
| 5.4 | **Fallback for `terminated` sessions** — Your code detects when a session reaches `terminated` status (unrecoverable error) and has a defined response: retry with a new session, alert a human, or gracefully report failure to the user. | [ ] | [FILL IN] |
| 5.5 | **Agent update process defined** — You have a plan for how to update the agent's system prompt or tools without breaking in-flight sessions: update the agent (new version), continue using the old version for running sessions, start new sessions on the new version. | [ ] | [FILL IN] |
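
The minimum logging set from item 5.1 can be expressed as a function that turns an event into one JSON Lines record. This is a sketch, not the SDK's event model: it assumes events are dictionaries, and the `name`, `stop_reason`, and `error` keys are hypothetical field names chosen for illustration.

```python
import json
from typing import Optional

# Event types that item 5.1 says must be logged.
LOGGED_TYPES = {"agent.tool_use", "session.error", "session.status_idle"}


def log_record(session_id: str, event: dict) -> Optional[str]:
    """Return a one-line JSON record for events worth logging, else None."""
    event_type = event.get("type")
    if event_type not in LOGGED_TYPES:
        return None
    record = {"session_id": session_id, "type": event_type}
    # ASSUMPTION: the field names read below are illustrative.
    if event_type == "agent.tool_use":
        record["tool"] = event.get("name")
    elif event_type == "session.status_idle":
        record["stop_reason"] = event.get("stop_reason")
    else:  # session.error
        record["error"] = event.get("error")
    return json.dumps(record, sort_keys=True)


def append_log(path: str, session_id: str, event: dict) -> bool:
    """Append the record to a JSONL file; return True if a line was written."""
    line = log_record(session_id, event)
    if line is None:
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return True
```

Writing one JSON object per line keeps the log queryable with ordinary tools, which satisfies the "somewhere you can query them" requirement for a small deployment.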

---

## Pre-Launch Sign-Off

Before going live, confirm each of the following:

```
[ ] All Phase 1 items complete
[ ] All Phase 2 items complete
[ ] All applicable Phase 3 items complete (at minimum 3.1, 3.2, 3.3, 3.4; also 3.5-3.7 where they apply)
[ ] All Phase 4 items complete
[ ] Phase 5 observability is in place before, not after, going live

Signed off by: [FILL IN: name]
Date: [FILL IN: date]
Agent ID: [FILL IN: agent_xxxxx]
Agent version: [FILL IN: version number]
Environment ID: [FILL IN: env_xxxxx]
```

---

## Worked Example — First Launch of a Newsletter Digest Agent

*Fictional business: Oaktree Media, a solo content creator. Founder Leila is deploying a weekly newsletter digest agent that reads a folder of saved articles and writes a 500-word briefing every Monday at 8 AM.*

### Phase 1 — Define

- 1.1 ✅ Brief: "NewsDigest Agent. Reads 5-10 saved Markdown articles from /workspace/inbox/, writes a 500-word newsletter briefing. Leila reviews before sending. Must never editorialize — only summarize."
- 1.2 ✅ System prompt uses Template 02 XML structure; output format specifies `/workspace/digest.md` with a fixed structure.
- 1.3 ✅ Model: `claude-haiku-4-5-20251001`. Task is summarization with a fixed schema — Haiku is sufficient and 5x cheaper.
- 1.4 ✅ Tools: `read` (yes), `write` (yes), `glob` (yes to list inbox files), `grep` (no), `bash` (no), `web_fetch` (no), `web_search` (no), `edit` (no).
- 1.5 ✅ No memory needed — Leila provides the articles each week as files; no cross-session continuity is required.

### Phase 2 — Configure

- 2.1 ✅ Environment: Config A (Safe Read-Only Research Agent, limited networking, no allowed hosts needed since no web access).
- 2.2 ✅ Networking: `limited`, empty `allowed_hosts` — agent has no business on the web.
- 2.3 ✅ No special packages needed.
- 2.4 ✅ Agent created with Haiku model, system prompt, and whitelist. Saved agent ID: `agent_01Hx...`
- 2.5 ✅ No memory stores.
- 2.6 ✅ No MCP.
- 2.7 ✅ Articles are uploaded via Files API each Monday before the cron job runs. Session mounts them under `/workspace/inbox/`.

### Phase 3 — Test

- 3.1 ✅ Tested in Console with 3 sample article sets.
- 3.2 ✅ API session test confirmed that `digest.md` was produced with the correct structure.
- 3.3 ✅ Tested with 0 articles (empty inbox) — agent correctly writes "No articles found this week" rather than hallucinating.
- 3.4 ✅ Confirmed only `read`, `write`, `glob` events in stream. No `bash` or web tool calls.
- 3.5 ✅ N/A — all enabled tools are set to `always_allow` (file tools only: `read`, `write`, `glob`; no shell or web access).
- 3.6 ✅ N/A — no memory.
- 3.7 ✅ Leila's downstream script reads `/workspace/digest.md` and pastes it into Mailchimp. Confirmed format compatibility.

### Phase 4 — Deploy

- 4.1 ✅ Version pinned to version 1.
- 4.2 ✅ Cron job code reviewed; uses production agent ID and environment ID.
- 4.3 ✅ API key in environment variable `ANTHROPIC_API_KEY` on the cron server.
- 4.4 ✅ One session per week — far below rate limits.
- 4.5 ✅ Estimated 30 min per session × 52 sessions/year = 26 hours = ~$2.08 in runtime/year. Within budget.

### Phase 5 — Observe

- 5.1 ✅ Event log written to `~/logs/newsdigest_sessions.jsonl`.
- 5.2 ✅ `session.error` alerts Leila via email (simple Python `smtplib` call in error handler).
- 5.3 ✅ Token usage logged after each session; Leila reviews monthly.
- 5.4 ✅ A `terminated` status triggers an email alert, and the cron job does not attempt to send the digest that week.
- 5.5 ✅ Leila's update process: test new system prompt in Console, create new agent version, update cron job to pin the new version number, monitor for one cycle.
