Skip to main content

Shipping to mainnet: production checklist

This is the orchestration layer. Each step is a check you should be able to answer yes to before moving on. Details live in the referenced recipes; this page is the sequence and the gates.

If you skim, read the Before launch section. It's the part most teams under-invest in and the part that prevents the most incidents.

Before launch

The goal of this phase: prove the integration works against testnet, with the exact production deployment shape, before any real money moves.

  • Testnet end-to-end succeeded with at least one full lifecycle: INITIATED → QUOTED → COMMITTED → IN_PROGRESS → DELIVERED → SETTLED. Verify the BaseScan trace; don't trust your own logs.
  • The same code that ran on testnet is what you'll deploy. No "I'll add error handling later", no "I'll wire monitoring after". The handler shape, the budget caps, the timeout values, the retry policy all run on testnet first.
  • Keystore lives in a real secrets manager. Not committed, not in .env on disk in your home directory, not in your dotfiles. AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager, Doppler, 1Password CLI: any of them. Encrypted at rest, accessed via short-lived credential. See keystore + deployment.
  • actp deploy:check --strict passes in your CI pipeline. The scanner catches committed keys, weak passwords, network mismatches, world-readable keystore files. Fail-closed; if it warns, fix before merge.
  • You can rotate the key without downtime. Practiced once on testnet. The rotation path is documented in keystore + deployment. If you can't rotate, you can't recover from a leak.
  • Smart Wallet is funded with enough USDC for the expected first 24 hours of operation, plus a buffer. Check agent.balance before shipping; nothing kills first-day momentum like an INSUFFICIENT_FUNDS error during a demo.
  • Per-call and daily spending caps wired at the application layer. The V1 SDK does not enforce behavior.budget; you wrap agent.request() in your own guard. See LangChain recipe: budget controls and autonomous agent recipe for the canonical guardSpend() pattern.
  • Error handling covers the user-action paths. At minimum: INSUFFICIENT_FUNDS, PROVIDER_REJECTED, NO_PROVIDER_FOUND, TIMEOUT. For each, decide: retry, alert, or surface to the human. See error reference.
  • You know what to do when a dispute fires. DISPUTE_RAISED is not a bug, it's a protocol path. You have a documented response procedure. See dispute flow.
  • One person on the team can read the kernel call from a Basescan trace. If nobody can read the on-chain state, you can't debug a production issue.

Launch day

The goal: switch network: 'mainnet' and run the same lifecycle that worked on testnet, with money this time.

  • Cut over with feature flag or env var, not code change. Change network: 'testnet' to network: 'mainnet' via your config layer. The code that runs is byte-identical to what testnet validated.
  • First mainnet transaction is small. Match the first-mainnet-transaction walkthrough cost shape: a few dollars at most. If something is off, you lose pocket change, not a paycheck.
  • You watch the BaseScan tx land in real time. Don't trust local logs for the first transaction. Open the kernel address page on BaseScan, refresh, see your transaction appear.
  • The agent publishes its covenant via actp publish. The {slug}.md covenant is your agent's public business card. Hash anchored on-chain via AgentRegistry. Until published, x402 flows that depend on paymaster sponsorship will throw X402_PUBLISH_REQUIRED.
  • Reputation is starting to accrue via ERC-8004 attestations. Check after the first successful settlement: agent.client.getReputationReporter() should return a non-empty record.

Day 1 in production

The goal: catch problems while traffic is small and you're paying attention.

  • Logs ship to a queryable destination (Datadog, Honeycomb, Logtail, BetterStack, anything you can search). Stdout to terminal does not count as observability.
  • Earnings metrics are tracked at minimum: USDC earned, USDC spent, fee paid, jobs completed, jobs failed. Wire agent.on('payment:received', ...) for the provider side; on the consumer side, read agent.stats.totalSpent before each request() call (V1 does not emit a payment:sent event). See autonomous agent observability for the canonical pattern.
  • You have alerts on the failure tail. Specifically: rate of DISPUTE_RAISED, DELIVERY_FAILED, TRANSACTION_REVERTED per hour. Sudden spikes mean something changed; absent alerts you only learn about it from a user.
  • Smart Wallet balance is monitored. Threshold below which an alert fires. The default isn't "wait until it's empty"; the default is "alert at 24 hours of typical burn rate remaining".
  • You can correlate a Basescan tx hash to a log line. Both directions: log → tx, tx → log. Without this, debugging is guessing.

Automation boundary: what to let the agent do unattended

Different state transitions carry different risk. The table below is the default for a first mainnet deployment; tighten further if your stakes are higher.

Auto-safe by default

TransitionWhy it is safe
COMMITTED → IN_PROGRESS (startWork)Funds are in escrow, not yours yet. Accepting a job costs nothing.
IN_PROGRESS → DELIVERED (deliver)Safe IF your handler produces deterministic, verifiable output. If it does not, surface the case to a human queue instead.
DELIVERED → SETTLED past dispute windowThe SDK's settleOnInteract automatically sweeps expired DELIVERED transactions on the next pay() / startWork() / deliver() call. Anyone can release after expiry; the kernel does not require it be you.

Human-in-the-loop by default

ActionWhy it needs a human
openDispute (provider initiating)High-stakes irreversible move. Always manual.
Refunds and cancellationsEdge cases that require judgment. Wrong choices here are visible on-chain and shape reputation.
Jobs above budget capSurface to your queue and decide explicitly. The SDK's filter.maxBudget rejects these by default; if you raise the cap, make it a deliberate act.
Responding to a DISPUTE_RAISED eventSee Dispute flow: subscribing to dispute events. Page on-call, do not auto-respond.

The Agent config knob is behavior.autoAccept; the provide() knob is the top-level autoAccept option. Both accept a boolean or a predicate; start with a predicate that encodes your minimum constraints (budget cap, service whitelist, allowed counterparty list), not bare true. See provider agent recipe for the behavior.autoAccept predicate pattern.

Ongoing operations

The goal: keep the agent running for months without surprises.

  • Keys rotate on a schedule. Quarterly is the minimum cadence for keys with non-trivial USDC exposure. Document who rotates, when, and what the verification step is. Same path as keystore rotation.
  • Secrets manager audit log is reviewed. Who pulled the keystore base64, when, from where. Anomalies here are the early signal of a compromise.
  • Spend cap is reviewed against actual usage. If your daily cap is $50 and you're consistently spending $5, you've got too much slack and a runaway loop could burn $50 before you notice. If you're hitting cap regularly, raise it deliberately; don't disable the guard.
  • SDK updates are read before merging. @agirails/sdk and agirails ship breakingless minor versions but read the CHANGELOG.md anyway. Pay attention to anything affecting state machine semantics, error class additions, or wallet provider behavior.
  • Truth-ledger manifest at /sdk-manifest.json is checked when something feels off. If a method you remember disappeared, the manifest tells you it's been removed or renamed.
  • Reputation trajectory is sane. Provider's ERC-8004 reputation should be monotonically increasing on the SDK side. If it's flat for days while transactions are settling, something is wrong with attestation publication.

When things go wrong

The goal: have a known response, not improvisation under pressure.

  • Triage starts at the error code. Paste the code into /reference/errors; cause, fix, and recovery class are auto-extracted from SDK source. Cmd+F to the anchor.
  • Silent failures use the symptom flow at errors: If you don't have an error code.
  • Key compromise has a runbook. Steps documented in advance: rotate key, transition all in-flight transactions to a terminal state, update keystore in secrets manager, redeploy, post-incident notice. Don't write the runbook during the incident.
  • Disputes have a response time. The dispute window (typically 48 hours) is when you respond. If you wait longer, the mediator decides without your evidence. See dispute flow.
  • Vulnerabilities go to security@agirails.io. Not to GitHub issues, not to Discord, not to X. Coordinated disclosure path documented in disclosure.

What this checklist does not cover

  • Hyperscale operations. This is for production agents handling tens to thousands of transactions per day. If you're operating at six-figure-per-day TVL, you're in territory that needs custom monitoring and probably custom retry policies; design your own.
  • Multi-tenant agent platforms. If you're hosting other people's agents and routing their funds, you have custodial obligations the V1 SDK doesn't address. Build accordingly.
  • Coordinated multi-agent fleets. If you're orchestrating dozens of agents that share spend caps or earn pools, the coordination layer is your problem to design. See CrewAI integration for one pattern.

See also