Writing Good Tests

Writing good tests is a skill, and like any skill, it can be learned. At its core, writing tests is about prompting: giving the agent clear, well-structured instructions that lead to consistent, reliable results.

Don’t expect perfection on the first try. Iteration is key. You’ll write a test, run it, see where the agent struggles or misinterprets your intent, then refine. Each iteration gets you closer to a test that works reliably every time.

Throughout this section, we’ll share the patterns, tips, and tricks we’ve learned from thousands of test runs. Whether you’re new to AI-driven testing or looking to level up, we’re here to help you write tests that actually work.

The Three Agent Types

There are three types of agents, each designed for a different use case. Which one to choose depends on whether you need repeatable checks, open-ended exploration, or automation beyond testing.

Agent	Best For	Setup Effort	Output
Verification	Regression & smoke testing	High (structured test cases)	Consistent, repeatable results
Discovery	Exploration & quick checks	Low (minimal or no instructions)	Broad coverage, bug discovery
Task	Workflow automation	Medium (task-specific prompts)	Reports, docs, automated workflows

Verification

Use verification tests when you need consistent, repeatable checks. These are structured test cases with defined steps, goals, and expected results. They take more effort upfront but pay off with stable tests you can run daily or on every build.

→ Ideal for: regression testing, smoke testing, core flow validation
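
To make "defined steps, goals, and expected results" concrete, here is a minimal sketch of how a structured test case might be organized before it is handed to the agent. The VerificationTest class, its field names, and the to_prompt method are hypothetical and chosen for illustration only; they are not part of any product API, and the login scenario is an invented example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VerificationTest:
    """Hypothetical container for a structured test case (not a product API)."""

    goal: str
    steps: List[str]
    expected_results: List[str]

    def to_prompt(self) -> str:
        """Render the test case as plain-text instructions for an agent."""
        lines = [f"Goal: {self.goal}", "", "Steps:"]
        # Number the steps so the agent follows them in order.
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        lines += ["", "Expected results:"]
        lines += [f"- {result}" for result in self.expected_results]
        return "\n".join(lines)


# Invented example scenario, used only to show the shape of a test case.
login_test = VerificationTest(
    goal="Verify that a registered user can log in",
    steps=[
        "Open the login page",
        "Enter the test account's email and password",
        "Click the Log in button",
    ],
    expected_results=[
        "The dashboard page loads",
        "The user's name appears in the header",
    ],
)

print(login_test.to_prompt())
```

Keeping the goal, steps, and expected results in separate fields makes it easier to see which part to refine when the agent misreads your intent.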

Discovery

Use discovery when you want to explore without strict structure. The Discovery Agent (Disco) adapts as it goes, finding bugs and issues you might not have thought to test for. It works well for quick checks, new features, or simply asking “does anything break?”

→ Ideal for: exploratory testing, validating fixes, testing new content

Task

Use the Task Agent when you need automation beyond testing. It’s a general-purpose agent for workflows like generating documentation, conducting market research, running compliance checks, or emulating specific user behaviors.

→ Ideal for: documentation, research, workflow automation
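
The "Ideal for" lists above can be read as a simple lookup from use case to agent type. The sketch below encodes them that way. The AgentType enum, the IDEAL_FOR table, and the pick_agent helper are hypothetical names introduced for illustration; real use cases rarely match a fixed list exactly, so treat this as a summary of the guidance rather than a selection algorithm.

```python
from enum import Enum


class AgentType(Enum):
    """The three agent types described in this section."""

    VERIFICATION = "Verification"
    DISCOVERY = "Discovery"
    TASK = "Task"


# Hypothetical lookup built from the "Ideal for" lists in this section.
IDEAL_FOR = {
    "regression testing": AgentType.VERIFICATION,
    "smoke testing": AgentType.VERIFICATION,
    "core flow validation": AgentType.VERIFICATION,
    "exploratory testing": AgentType.DISCOVERY,
    "validating fixes": AgentType.DISCOVERY,
    "testing new content": AgentType.DISCOVERY,
    "documentation": AgentType.TASK,
    "research": AgentType.TASK,
    "workflow automation": AgentType.TASK,
}


def pick_agent(use_case: str) -> AgentType:
    """Return the agent type listed as ideal for the given use case."""
    key = use_case.strip().lower()
    if key not in IDEAL_FOR:
        raise ValueError(f"No agent type listed for use case: {use_case!r}")
    return IDEAL_FOR[key]


print(pick_agent("Smoke testing").value)
print(pick_agent("exploratory testing").value)
```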