Best Practices

Writing effective tests is an iterative process. You won’t get it perfect on the first try—and that’s expected. This guide covers the techniques and patterns that lead to stable, reliable tests.

The Iteration Loop

Write your test

Start simple. Begin with one general step and goal (e.g., “complete the tutorial”) rather than adding complexity upfront.

Run and observe

Watch the agent’s behavior live or review the recording. Pay attention to where it fails or gets stuck.

Adjust and improve

Modify your test to address issues. Repeat until stable.

Run multiple instances of the same test after each change. Several runs expose variance in the agent’s behavior that a single run would hide, and reviewing them together helps you iterate faster.
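As a rough illustration of why several runs per change tell you more than one, the sketch below summarizes a batch of run outcomes as a pass rate. The `pass_rate` helper and the result lists are hypothetical and not part of Nexus; you would fill the lists from the runs you review.

```python
from typing import Sequence


def pass_rate(results: Sequence[bool]) -> float:
    """Return the fraction of runs that passed (True means the run passed)."""
    if not results:
        raise ValueError("need at least one run result")
    return sum(results) / len(results)


# Hypothetical outcomes of five runs of the same test, before and after an edit.
before_change = [True, False, False, True, False]
after_change = [True, True, True, False, True]

print(f"before: {pass_rate(before_change):.0%}")  # before: 40%
print(f"after:  {pass_rate(after_change):.0%}")   # after:  80%
```

A single run before the change could have passed by chance; the batch makes the difference between the two versions visible.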

Use the Live View in Nexus to guide the agent through the test in real time. This helps you identify which instructions were needed. You can then update the test accordingly and often have a working version after just one iteration.


What to Adjust

When a test fails, work through these levers in order. Most issues can be fixed with the first three.

1. Update the Instructions

If the agent did something unintended or got sidetracked, clarify the instructions. State explicitly what you want, just as you would explain it to a tester playing the game for the first time.

Be precise and include clear completion indicators:

Bad | Problem | Good
Play until you have reached the first few levels | Vague endpoint | Play until you reach player level 3. The level is displayed top-left as “PL: X”.
The game loads within expected time | Undefined expectation | The game loads within 2-3 minutes
You should select the sword | “Should” sounds optional | Select the grey sword, NOT the blue one.
Play the game | No end condition | Play until you reach level 3
Complete the tutorial | Vague endpoint | Follow tutorial until ‘Tutorial Complete’ popup appears

Instruction patterns that work:

  • “Wait until [specific UI element] appears”
  • “Tap X, then tap Y, then tap Z”
  • “Repeat [action] until [condition]”

Handle variations with conditional language:

If the Terms of Service popup appears, tap "I Agree" (optional, can be skipped). If you see popup X, close it by tapping the X button.

Strong language and CAPS draw attention: “You MUST”, “CRITICAL”, “NEVER”, “ALWAYS”.
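Some of the wording problems listed in the table above can be caught mechanically before a run. The following is a minimal sketch of such a check. `lint_instruction` and its word lists are hypothetical illustrations, not a Nexus feature, and keyword matching is only a rough proxy for a clear completion indicator.

```python
import re

# Hypothetical word lists; tune them for your own game and tests.
SOFT_WORDS = ("should", "could", "maybe", "try to")
VAGUE_PHRASES = ("first few", "expected time", "a while")
OPEN_ENDED_VERBS = ("play", "wait", "repeat", "continue", "follow", "complete")
END_MARKERS = ("until", "when", "once", "within")


def _contains(phrase: str, text: str) -> bool:
    """Whole-word, case-insensitive search for a phrase."""
    return re.search(rf"\b{re.escape(phrase)}\b", text, re.IGNORECASE) is not None


def lint_instruction(text: str) -> list[str]:
    """Return a list of warnings for a single test instruction."""
    warnings = []
    for word in SOFT_WORDS:
        if _contains(word, text):
            warnings.append(f"'{word}' sounds optional; state the action directly")
    for phrase in VAGUE_PHRASES:
        if _contains(phrase, text):
            warnings.append(f"'{phrase}' is vague; give a concrete value")
    words = text.lower().split()
    starts_open_ended = bool(words) and words[0] in OPEN_ENDED_VERBS
    # Only open-ended actions need an explicit stopping point.
    if starts_open_ended and not any(_contains(m, text) for m in END_MARKERS):
        warnings.append("no end condition; say when the step is complete")
    return warnings


for instruction in ("Play the game", "Play until you reach player level 3."):
    print(instruction, "->", lint_instruction(instruction))
```

A check like this only flags candidates for rewording; whether an instruction is actually clear still shows up in the agent's runs.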

2. Update the Hints

Hints guide behavior. Once your instructions and expected results are clear, most improvements come from better hints.

Use hints for:

  • UI element locations — “The debug menu icon is in the top-right corner”
  • Workarounds — “If the button doesn’t respond, tap slightly to the left”
  • Skippable content — “The tutorial popup might not appear—skip if so”
  • References — “Consult the cheats-compendium for debug command syntax”

Two types of hints:

  • Agent-level hints — Apply to all tests. Teach general mechanics. Added in agent config.
  • Test-level hints — Apply to one test. For scenario-specific guidance.

If you use live chat to help the agent succeed, consider adding those instructions as permanent hints.

Hints do not persist between test steps. If the agent needs the same context later, repeat the hint or tell the agent to write it down.

3. Add Knowledge Files

For information that’s too detailed for hints, attach knowledge files. These are documents the agent can query during runtime.

Use knowledge for:

  • Feature specs to test against
  • Game design documents
  • Reference images for visual validation
  • Questionnaires or templates to fill out

Knowledge files can be scoped to project, agent, or individual test.

4. Add Game Functions (SDK)

For actions the agent struggles with repeatedly (navigating 3D environments, complex combos), give it a function to call.

Common use case: exposing existing cheat functions for faster testing (skip tutorials, add currency, unlock levels).
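To make the idea concrete, here is a minimal sketch of exposing existing cheats behind named functions. It is plain Python and does not use the actual SDK, whose API is not shown in this guide. `GAME_FUNCTIONS`, `game_function`, `call_game_function`, and both example cheats are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a function name to a callable in the game build.
GAME_FUNCTIONS: Dict[str, Callable[..., str]] = {}


def game_function(func: Callable[..., str]) -> Callable[..., str]:
    """Register a cheat under its own name so it can be called by name."""
    GAME_FUNCTIONS[func.__name__] = func
    return func


@game_function
def skip_tutorial() -> str:
    # In a real game this would call into the existing cheat code.
    return "tutorial skipped"


@game_function
def add_currency(amount: int) -> str:
    # Placeholder for an existing "add currency" cheat.
    return f"added {amount} coins"


def call_game_function(name: str, **kwargs) -> str:
    """Dispatch a call by name, rejecting anything that was not registered."""
    if name not in GAME_FUNCTIONS:
        raise KeyError(f"unknown game function: {name}")
    return GAME_FUNCTIONS[name](**kwargs)


print(call_game_function("add_currency", amount=500))  # added 500 coins
```

Keeping an explicit registry means only the cheats you deliberately expose can be called, which keeps debug-only code out of reach by default.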

95%+ of issues can be fixed through natural language (steps 1-3). Game state and functions are for the remaining edge cases.


Writing Tips

Macro vs Micro Managing

Macro managing: Guide the agent with the full sequence of actions in a single step.

High-level | Macro-managed
Craft a Pickaxe | First gather 3 stones and 2 wood. Then walk to the workbench and open the crafting menu. Select and craft the pickaxe.
Reach level 3 | Open the build menu and repeatedly upgrade your home building until you reach level 3.

More detail = more consistency, but less flexibility. Find the right balance for your game.

The agent can follow several sentences or a short numbered list in one step, so avoid describing every tap unless the detail matters. For example:

“Click the shopping cart icon on the bottom left to open the store. After the store loads, click the javelin icon and buy it. Close any confirmation pop-ups.”

Keep tests under roughly 25 steps when possible. Longer tests are more likely to fail from accumulated small mistakes. Combine related actions into one clear step, or split a large flow into multiple focused Nexus tests.
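A back-of-the-envelope calculation shows why length matters. The sketch below assumes every step succeeds independently with the same probability, which is a simplification and not a measured property of the agent. The 98% per-step figure is purely illustrative.

```python
def chance_all_steps_pass(step_success: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent, equally reliable steps."""
    if not 0.0 <= step_success <= 1.0:
        raise ValueError("step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must not be negative")
    return step_success ** steps


# Illustrative only: a 98% per-step success rate at three test lengths.
for steps in (10, 25, 50):
    chance = chance_all_steps_pass(0.98, steps)
    print(f"{steps:>2} steps: {chance:.0%}")
# 10 steps: 82%
# 25 steps: 60%
# 50 steps: 36%
```

Under these assumptions, halving the number of steps does far more for stability than a small improvement to any single step.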

Micro managing: When the agent still struggles, add explicit checks within the step.

“Complete today’s daily rewards quest. Repeatedly check the daily rewards menu to confirm completion.”


Troubleshooting

Review the Agent’s Thinking

This is the most important debugging step. In the recording view, examine what the agent was thinking:

  • Why it chose a particular action
  • How it determined success or failure

You’ll often spot misunderstandings immediately.

Information Priority

The agent prioritizes information by source:

  1. Instructions — Most important. Followed most strictly.
  2. Hints — Always considered, but less weight than instructions.
  3. Knowledge — Queried when the agent needs additional information.

If the agent consistently misses a critical point, move it into the instructions.

Emphasize Critical Instructions

Capitalize key terms:

  • Use the GREEN shovel.
  • Press the spin button TWICE.
  • Do NOT use any cheat commands.

Use strong keywords:

  • CRITICAL: Press the play button before creating a character.
  • ALWAYS use the pickaxe to mine stone.
  • When opening the door, YOU MUST do a backflip afterwards.

When Nothing Works

Some actions are too difficult for vision-based agents. In these cases:

  • Add a game function the agent can call
  • Use an existing cheat function (e.g., cheat_minigame(autoWin=True))

Still stuck? Contact us at team@nunu.ai—we’re happy to help brainstorm solutions.
