App UI Testing (any URL — no deploy)

OpenFactory can drive a managed desktop tester VM (a vanilla Ubuntu desktop with a browser) to test the user interface of any web app over MCP — clicking, typing, and visually verifying like a person would.

You point it at a URL; you do not need to deploy your app with OpenFactory to test it. The tester VM opens whatever URL you give it (a local dev server, a preview or production deployment on Vercel, AWS, Netlify, Render, Fly, or any other host, or any public site) as long as that URL is reachable from the VM. Your app keeps running wherever it already runs.

Want OpenFactory to host the app too? You can also deploy a Git repo and get a public preview URL to test against — see App Deployment.

This is for GUI / UX workflow testing (login flows, forms, navigation, “does the new button actually work”). Lower-level checks (image contents, packages, systemd units) belong in build test suites and assertions.

How it works

1. Get a tester VM: ensure_tester_vm returns your persistent, reusable desktop VM (created on first use, reused after).
2. Describe the flow once: create_app_scenario stores a reusable scenario, an ordered list of plain-language steps against your app’s URL.
3. Run it (and re-run it): run_app_scenario executes the scenario in the VM and records screenshots plus a pass/fail verdict.

Reusable, self-hardening scenarios

A scenario is described in plain language, not pixel coordinates or CSS selectors:


[
  { "action": "open_url", "value": "${APP_URL}" },
  { "action": "type", "target": "email field", "value": "${EMAIL}" },
  { "action": "type", "target": "password field", "value": "${PASSWORD}" },
  { "action": "click", "target": "Sign In button" },
  { "action": "type", "target": "verification code", "value": "${totp:OTP_SECRET}" },
  { "action": "click", "target": "Verify button", "expect": "Dashboard" }
]

Step actions: open_url, click, type, key, assert_text, wait. For click / type, target is matched on screen by the visual model — so it keeps working when markup changes.
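A scenario's step list can be sanity-checked before it is saved. The sketch below is one possible pre-flight check, not part of OpenFactory: the action names come from the list above, while the `validate_steps` helper and the rule that only click and type require a target are assumptions made for illustration.

```python
import json

# Action names are taken from the documentation above.
ALLOWED_ACTIONS = {"open_url", "click", "type", "key", "assert_text", "wait"}
# Assumption: only these actions need an on-screen target.
NEEDS_TARGET = {"click", "type"}


def validate_steps(steps):
    """Return a list of readable problems; an empty list means the steps look well-formed."""
    problems = []
    for i, step in enumerate(steps, start=1):
        action = step.get("action")
        if action not in ALLOWED_ACTIONS:
            problems.append(f"step {i}: unknown action {action!r}")
        elif action in NEEDS_TARGET and not step.get("target"):
            problems.append(f"step {i}: {action} needs a target")
    return problems


steps = json.loads("""[
  {"action": "open_url", "value": "${APP_URL}"},
  {"action": "click", "target": "Sign In button"},
  {"action": "hover", "target": "menu"},
  {"action": "type", "value": "hello"}
]""")

for problem in validate_steps(steps):
    print(problem)
```

Running a check like this locally catches a misspelled action or a missing target before a VM run is spent on it.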

Hardening (fast, resilient re-runs). The first run resolves each element with the visual model (slower) and remembers where it was. Later runs replay from that memory and skip the expensive full-screen analysis — re-resolving (and re-learning) only the steps whose UI actually moved or changed. So a suite gets faster on the second run and self-heals small UI changes instead of breaking.
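The replay-then-fall-back idea can be pictured with a toy model. This is only an illustration of the behavior described above, not OpenFactory's implementation: the `HardenedRunner` class, its `resolve` and `still_there` callbacks, and the idea of keying the memory by target description are all hypothetical.

```python
class HardenedRunner:
    """Toy model: remember where each target was found, re-resolve only on a miss."""

    def __init__(self, resolve, still_there):
        self.resolve = resolve          # slow path: full-screen visual lookup (hypothetical)
        self.still_there = still_there  # cheap check that the target is still at a position (hypothetical)
        self.cache = {}                 # target description -> (x, y)

    def locate(self, target):
        pos = self.cache.get(target)
        if pos is not None and self.still_there(target, pos):
            return pos, "cache"
        pos = self.resolve(target)      # first run, or the UI moved
        self.cache[target] = pos        # re-learn for the next run
        return pos, "resolved"


# A fake screen standing in for the VM display.
screen = {"Sign In button": (400, 300), "email field": (400, 200)}
runner = HardenedRunner(
    resolve=lambda target: screen[target],
    still_there=lambda target, pos: screen.get(target) == pos,
)

first = [runner.locate(t)[1] for t in screen]
second = [runner.locate(t)[1] for t in screen]
screen["Sign In button"] = (400, 340)   # a small layout change
third = [runner.locate(t)[1] for t in screen]
print(first, second, third)
```

In this model the first pass resolves everything, the second is served entirely from memory, and after the layout change only the moved button is resolved again.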

Environment variables and secrets

Steps reference variables as ${VAR}. Put non-secret defaults (app URL, test email) on the scenario; pass secrets (passwords, tokens) at run time — they are used for substitution and are never stored or written into the recorded run.
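The split between stored defaults and run-time secrets can be sketched as follows. This is a minimal illustration under assumptions: the function names, the precedence of secrets over defaults, and the choice to leave secret placeholders unexpanded in the recorded form are not taken from OpenFactory. The pattern deliberately ignores `${totp:...}` references.

```python
import re

# Matches ${VAR}; the colon in ${totp:VAR} keeps those references out of this pattern.
VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(text, defaults, secrets):
    """Expand ${VAR} for execution; run-time secrets take precedence over defaults."""
    values = {**defaults, **secrets}

    def repl(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"undefined variable: {name}")
        return values[name]

    return VAR_PATTERN.sub(repl, text)


def recorded_form(text, defaults, secrets):
    """Expand only non-secret defaults, so secret values never reach the run record."""

    def repl(match):
        name = match.group(1)
        if name in secrets or name not in defaults:
            return match.group(0)   # keep the placeholder as written
        return defaults[name]

    return VAR_PATTERN.sub(repl, text)


defaults = {"APP_URL": "https://staging.example.com", "EMAIL": "qa@example.com"}
secrets = {"PASSWORD": "hunter2"}   # example value only

typed = substitute("${EMAIL} / ${PASSWORD}", defaults, secrets)
logged = recorded_form("${EMAIL} / ${PASSWORD}", defaults, secrets)
print(typed)
print(logged)
```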

Two-factor authentication (2FA)

If signing in requires a TOTP code, use ${totp:VAR} where VAR holds the account’s base32 TOTP secret (the same seed your authenticator app uses). OpenFactory computes the current 6-digit code (RFC 6238) and types it. Provide the seed as a run-time secret, never in the stored scenario.
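For reference, a 6-digit RFC 6238 code can be computed from a base32 seed with the standard library alone. The sketch below assumes the common defaults (HMAC-SHA1, 30-second time step), which the text above does not state explicitly; the `totp` function name and its parameters are illustrative.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, step=30, digits=6):
    """RFC 6238 TOTP using HMAC-SHA1 (assumed default) over a 30-second counter."""
    # Seeds are often displayed lowercase, spaced, and without padding.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F   # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 6238 test seed ("12345678901234567890" in base32), evaluated at T = 59 s.
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", at=59))  # 287082
```

Comparing a locally computed code against the one your authenticator app shows is a quick way to confirm that the seed passed as a run-time secret is the right one.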

Highlighting UI elements

annotate_screenshot draws labeled boxes on a screenshot — useful for a coding agent to box the element it just built (e.g. “new: Submit button”) for review or evidence. Coordinates are pixel-space, so a box drawn at an element’s reported position lands exactly on it.

Ad-hoc recorded runs

If you’d rather drive the VM step by step yourself (instead of a stored scenario), use start_app_test → record_app_test_step → finish_app_test. Each run is saved with per-step screenshots and a standalone HTML report.

MCP tools

Tool	Use
`ensure_tester_vm`	Get or create your persistent desktop tester VM
`create_app_scenario`	Save a reusable GUI test scenario for an app URL
`list_app_scenarios` / `get_app_scenario`	List saved scenarios, or fetch one along with its hardened cache
`run_app_scenario`	Run a scenario (pass run-time secrets here) and record the result
`start_app_test` / `record_app_test_step` / `finish_app_test`	Drive and record an ad-hoc run yourself
`list_app_test_runs` / `get_app_test_run`	Review run history, screenshots, and reports
`annotate_screenshot`	Draw labeled highlight boxes on a screenshot
`desktop_screenshot` / `desktop_click` / `desktop_type` / …	Drive the VM directly

Example prompts


Create a smoke-login scenario for my app at https://my-app.vercel.app: open it,
sign in with the email/password I'll provide at run time, handle the 2FA code,
and verify the dashboard loads. Then run it.


Run the smoke-login scenario again and tell me which steps were served from
cache vs. re-resolved.


Open https://staging.example.com in the tester VM, screenshot it, and box the
primary call-to-action button labeled "new CTA".