# Stagehand

> How Momentic's multi-modal step cache, AI primitives, and managed runner compare against Stagehand and Browserbase.

Momentic is a managed testing platform for the web. Tests are YAML, executed on
a managed runner. A multi-modal step cache stores locator metadata per step and
auto-heals in place when the UI changes. AI primitives cover action, assertion,
visual diff, and typed extraction. AI requests go through a single managed
surface with cross-provider failover. A dashboard captures run videos, traces,
network activity, heal events, and AI reasoning.

[Stagehand](https://docs.stagehand.dev/get_started/introduction) is an
open-source TypeScript library from Browserbase. It adds four AI primitives
(`act`, `observe`, `extract`, `agent`) on top of Playwright. With
`env: "BROWSERBASE"` it adds **Browserbase Cache** (server-side, on by default)
and **Browserbase Model Gateway** (one Browserbase key routes to OpenAI,
Anthropic, Google). It's well-suited to teams that want programmatic TypeScript
control with a thin AI layer over Playwright.

## Speed and caching

|                       | Momentic                                                                                                  | Stagehand                                                                                                                                                                  |
| --------------------- | --------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| What's cached         | Multi-modal locator data per step ([docs](/reliability/step-cache)).                                      | Resolved action per `act` call.                                                                                                                                            |
| DOM-change resilience | Unrelated DOM changes don't invalidate the entry.                                                         | Any structural change in the a11y tree (banner mount, modal open, streaming content) flips the hash. Docs recommend `waitForLoadState("networkidle")` before every action. |
| Heal on miss          | Re-resolves and **updates the entry in place**. Heal event on the run.                                    | Re-resolves and writes a **new entry** under the new key.                                                                                                                  |
| Storage               | Managed, git-aware.                                                                                       | Browserbase Cache (server-side, requires `env: "BROWSERBASE"`) or Local Cache (JSON in repo).                                                                              |
| Smart waiting         | Built-in: navigation, `load`, screenshots, DOM mutations, same-origin requests. 3s default, configurable. | Playwright actionability + manual `waitForLoadState` / `waitForResponse`.                                                                                                  |

## How the multi-modal cache works

A cached step stores more than one way to find the target: where it sits on
screen, what it looks like, what text it contains, and the structural and
accessibility attributes around it. Which of those signals matters for a given
step is inferred from the natural-language description. "The red Cancel button
below the Order Summary header" leans on visual and positional signals; "the
Submit button in the form" leans on structure and role. When a step replays, the
runner checks the stored signals against the live page and runs the action
without invoking the LLM when there's a match.
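
The replay check can be pictured with a minimal sketch. It assumes a cache entry holds a handful of named signals plus per-signal weights derived from the step description. The `CachedStep` shape, the signal names, the weights, and the 0.8 threshold are all hypothetical illustrations, not Momentic's actual schema or scoring.

```ts
// Hypothetical signal names; the real cache schema is not documented here.
type Signal = "position" | "appearance" | "text" | "structure";

interface CachedStep {
  description: string;
  // Fingerprint per signal, captured when the step was first resolved.
  stored: Record<Signal, string>;
  // Relative importance of each signal, inferred from the description.
  weights: Record<Signal, number>;
}

// Weighted fraction of stored signals that still match the live page.
function matchScore(step: CachedStep, live: Record<Signal, string>): number {
  let matched = 0;
  let total = 0;
  for (const signal of Object.keys(step.weights) as Signal[]) {
    total += step.weights[signal];
    if (step.stored[signal] === live[signal]) matched += step.weights[signal];
  }
  return total === 0 ? 0 : matched / total;
}

// Replay from cache (no LLM call) only when enough weighted signals agree.
function canReplayFromCache(
  step: CachedStep,
  live: Record<Signal, string>,
  threshold = 0.8, // assumed cutoff, for illustration only
): boolean {
  return matchScore(step, live) >= threshold;
}

const cancelButton: CachedStep = {
  description: "The red Cancel button below the Order Summary header",
  stored: {
    position: "below:order-summary",
    appearance: "red-button",
    text: "Cancel",
    structure: "div>button.btn-1",
  },
  // Visual and positional signals dominate for this description.
  weights: { position: 3, appearance: 3, text: 2, structure: 1 },
};

// A refactor changed the markup, but the button looks and sits the same.
const livePage = { ...cancelButton.stored, structure: "section>button.btn-2" };
console.log(canReplayFromCache(cancelButton, livePage)); // true (score 8/9)
```

In this sketch, only the low-weight structural signal changed, so the step still replays from cache.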

On a miss, the locator agent ([auto-heal](/reliability/auto-heal)) re-resolves
the original description against the live page, updates the cache entry in
place, and the run continues. A heal event is recorded against the run.
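
The miss path can be sketched the same way. This is an illustration only: `CacheEntry`, `HealEvent`, `replayStep`, and the `resolveWithAgent` callback are hypothetical names, and the callback merely stands in for the locator agent.

```ts
interface CacheEntry {
  description: string;
  locator: string;
}

interface HealEvent {
  stepId: string;
  oldLocator: string;
  newLocator: string;
}

// Runs a step from cache; on a miss, re-resolves and overwrites the same entry.
function replayStep(
  cache: Map<string, CacheEntry>,
  stepId: string,
  existsOnPage: (locator: string) => boolean,
  resolveWithAgent: (description: string) => string, // stand-in for the locator agent
  healEvents: HealEvent[],
): string {
  const entry = cache.get(stepId);
  if (!entry) throw new Error(`no cache entry for step ${stepId}`);
  if (existsOnPage(entry.locator)) return entry.locator; // hit: no agent call

  const newLocator = resolveWithAgent(entry.description);
  healEvents.push({ stepId, oldLocator: entry.locator, newLocator });
  entry.locator = newLocator; // same key, updated in place
  return newLocator;
}

const cache = new Map<string, CacheEntry>([
  ["step-1", { description: "the Submit button in the form", locator: "#submit-old" }],
]);
const events: HealEvent[] = [];
const used = replayStep(
  cache,
  "step-1",
  (locator) => locator === "#submit-new", // the old locator no longer exists
  () => "#submit-new",
  events,
);
console.log(used, cache.size, events.length); // #submit-new 1 1
```

The cache still has one entry after the heal, which is the point of updating in place: later runs hit the refreshed entry instead of accumulating stale ones.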

Stagehand's Browserbase Cache keys on the page's accessibility tree, so any
background change (a transient banner, a streaming widget, an A/B variant) flips
the key even when the target itself hasn't changed. The
[Stagehand caching docs](https://docs.stagehand.dev/best-practices/caching)
recommend `page.waitForLoadState("networkidle")` before every action to keep the
tree stable enough to hit. On miss, Stagehand re-runs the LLM and writes a
**new** entry under the new key; the old entry isn't reused. Local Cache writes
a JSON file under `cacheDir`; the team owns expiry, ignore rules, and lock
contention.
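
To see why a whole-tree key is sensitive to unrelated changes, consider the sketch below. It assumes the key is a hash over the instruction plus a serialized accessibility tree. The node shape, the `cacheKey` function, and the FNV-1a hash are stand-ins chosen for brevity, not Stagehand's actual key derivation.

```ts
interface A11yNode {
  role: string;
  name: string;
  children?: A11yNode[];
}

// 32-bit FNV-1a, used here only as a small dependency-free hash.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical cache key: instruction plus a hash of the entire tree.
function cacheKey(instruction: string, tree: A11yNode): string {
  return fnv1a(instruction + "\n" + JSON.stringify(tree));
}

const submit: A11yNode = { role: "button", name: "Submit" };
const page: A11yNode = { role: "main", name: "Checkout", children: [submit] };
const pageWithBanner: A11yNode = {
  role: "main",
  name: "Checkout",
  children: [{ role: "status", name: "Free shipping today" }, submit],
};

// The target button is identical in both trees, but the serialized trees
// differ, so the keys differ as well (barring a hash collision).
const keysMatch =
  cacheKey("click Submit", page) === cacheKey("click Submit", pageWithBanner);
console.log(keysMatch);
```

Under this keying scheme, the banner alone is enough to turn a would-be hit into a miss, even though nothing about the Submit button changed.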

## AI primitives and assertions

|                   | Momentic                                                                                                                                  | Stagehand                                                                                                                                                                    |
| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Primitives        | 40+ step types: `act`, `assert`, `extract`, `assertVisually`, drag-and-drop, file upload, hover, `<select>`, native dialogs, scroll, etc. | Four primitives: `act`, `observe`, `extract`, `agent`.                                                                                                                       |
| Asserts           | `assert` is a first-class step type that fails by default.                                                                                | No native assert. Drops to Playwright `expect`, or builds from `observe()` / `extract()` + custom code.                                                                      |
| Visual regression | `assertVisually`, agent-scored against a golden.                                                                                          | Drops to Playwright `toHaveScreenshot()` (pixel / hash diff).                                                                                                                |
| Managed AI        | Cross-provider failover handled by the platform.                                                                                          | Browserbase Model Gateway: one key routes to OpenAI / Anthropic / Google with retries and backoff. Customer picks the model per call; no advertised cross-provider failover. |
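
The difference between same-provider retries and cross-provider failover can be sketched as follows. This is a generic illustration, not Momentic's or Browserbase's routing code; the `Provider` shape and `completeWithFailover` are hypothetical.

```ts
interface Provider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

// Tries each provider in order and returns the first successful completion.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      return { provider: provider.name, text: await provider.complete(prompt) };
    } catch (err) {
      // Record the failure and fall through to the next provider.
      errors.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}

// Fake providers for the example: the first is down, the second answers.
const flaky: Provider = {
  name: "primary",
  complete: async () => {
    throw new Error("503");
  },
};
const backup: Provider = {
  name: "secondary",
  complete: async (prompt) => `echo: ${prompt}`,
};

completeWithFailover([flaky, backup], "hello").then((result) => {
  console.log(result.provider); // secondary
});
```

Retries with backoff would wrap a single `provider.complete` call; failover changes which provider receives the request.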

<Accordion title="Technical details">
  **Momentic step types** (see [test format](/core-concepts/test-format))

  * Action: `act`, `click`, `type`, `hover`, `scroll`, `navigate`, `dragAndDrop`,
    `fileUpload`, `select`, `dialog`, `refresh`
  * Assert: `assert`, `assertVisually`, `checkPageContains`, `checkElement<...>`
  * Extract: `extract` (typed via JSON schema)
  * Control flow: `if/then`, modules, parameter inputs

  **Stagehand assertion pattern, for contrast**

  ```ts
  const { ok } = await page.extract({
    instruction: "Is the dashboard chart visible and not cut off?",
    schema: z.object({ ok: z.boolean() }),
  });
  expect(ok).toBe(true);
  ```

  There's no failure mode beyond a boolean. The team owns thresholds, schema
  design, and reporter integration.
</Accordion>
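
As an illustration of what owning thresholds and schema design can mean for the Stagehand pattern above, a team might extract a richer verdict and convert it into a thrown error. The `Verdict` fields, the `assertVerdict` helper, and the 0.7 confidence floor are hypothetical, not part of Stagehand.

```ts
// Shape a team might ask `extract` to return instead of a bare boolean.
interface Verdict {
  ok: boolean;
  reason: string;
  confidence: number; // 0..1, as reported by the model
}

// Throws with the model's stated reason so the host runner's reporter shows it.
function assertVerdict(verdict: Verdict, minConfidence = 0.7): void {
  if (verdict.confidence < minConfidence) {
    throw new Error(`Low-confidence verdict (${verdict.confidence}): ${verdict.reason}`);
  }
  if (!verdict.ok) {
    throw new Error(`AI assertion failed: ${verdict.reason}`);
  }
}

try {
  assertVerdict({ ok: false, reason: "chart is clipped on the right edge", confidence: 0.9 });
} catch (err) {
  console.log((err as Error).message); // AI assertion failed: chart is clipped on the right edge
}
```

The threshold and the error format become code the team maintains alongside its tests.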

## CI, recovery, and observability

|            | Momentic                                                                                                                                   | Stagehand                                        |
| ---------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------ |
| Runner     | Built-in CLI (`momentic run`).                                                                                                             | Bring-your-own (Vitest, Mocha, Playwright Test). |
| Reporters  | `junit`, `allure`, `playwright-json`, `buildkite-json`.                                                                                    | Whatever the host runner emits.                  |
| Sharding   | `--shard-index <i>` / `--shard-count <n>` (1-indexed). Deterministic alphabetical partition.                                               | Owned by the host runner.                        |
| Quarantine | First-class: tests run, results report, exit code unaffected unless `--only-quarantined`.                                                  | Owned by the team.                               |
| Recovery   | Post-run heal agent (`momentic ai heal`) rewrites failing tests and opens a PR or patch in CI; `momentic ai classify` triages the failure. | None.                                            |
| Dashboard  | Run videos, traces, heal events, AI reasoning, screenshots, network.                                                                       | Browserbase session replays.                     |
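
The sharding row can be illustrated with a short sketch. It assumes "deterministic alphabetical partition" means sorting test paths and dealing them round-robin across shards. The actual assignment rule (round-robin versus contiguous chunks) is not specified here, so treat `testsForShard` as hypothetical.

```ts
// Returns the tests a given shard would run. shardIndex is 1-indexed.
function testsForShard(paths: string[], shardIndex: number, shardCount: number): string[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  if (!Number.isInteger(shardIndex) || shardIndex < 1 || shardIndex > shardCount) {
    throw new RangeError("shardIndex must be between 1 and shardCount");
  }
  // Default sort compares UTF-16 code units, so the order is stable across machines.
  const sorted = [...paths].sort();
  // Assumed rule: deal sorted tests round-robin across shards.
  return sorted.filter((_, i) => i % shardCount === shardIndex - 1);
}

const all = ["login.test.yaml", "checkout.test.yaml", "search.test.yaml", "cart.test.yaml"];
console.log(testsForShard(all, 1, 2)); // [ 'cart.test.yaml', 'login.test.yaml' ]
console.log(testsForShard(all, 2, 2)); // [ 'checkout.test.yaml', 'search.test.yaml' ]
```

Because the partition depends only on the sorted paths and the two flags, every CI worker computes the same split without coordinating.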

<Accordion title="Quarantine semantics">
  * Default: quarantined tests run, results report, statuses **do not** affect
    exit code.
  * `--skip-quarantined`: quarantined tests are skipped entirely.
  * `--only-quarantined`: only quarantined tests run; statuses **do** affect exit
    code.
</Accordion>
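
The quarantine semantics above reduce to a small decision about which results count toward the exit code. The sketch below models that decision; the type names and the 0/1 exit codes are assumptions for illustration, not the CLI's implementation.

```ts
type QuarantineMode = "default" | "skip-quarantined" | "only-quarantined";

interface TestResult {
  id: string;
  status: "passed" | "failed";
  quarantined: boolean;
}

// Picks the results that are allowed to fail the run, then checks them.
function exitCode(results: TestResult[], mode: QuarantineMode): 0 | 1 {
  const counted = results.filter((r) =>
    // only-quarantined: quarantined tests are the whole run, so they count.
    // default / skip-quarantined: only non-quarantined tests count.
    mode === "only-quarantined" ? r.quarantined : !r.quarantined,
  );
  return counted.some((r) => r.status === "failed") ? 1 : 0;
}

const results: TestResult[] = [
  { id: "checkout", status: "passed", quarantined: false },
  { id: "flaky-search", status: "failed", quarantined: true },
];

console.log(exitCode(results, "default")); // 0: the quarantined failure is reported but ignored
console.log(exitCode(results, "only-quarantined")); // 1: the quarantined failure now counts
```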

## Authoring side-by-side

Same flow, both tools:

```ts
// Stagehand
import { Stagehand } from "@browserbasehq/stagehand";
import { expect } from "@playwright/test"; // or the host runner's own `expect`
import { z } from "zod";

const stagehand = new Stagehand({ env: "BROWSERBASE" });
await stagehand.init();
const { page } = stagehand;

await page.goto("https://app.example.com");
await page.act("Sign in with ada@example.com / secret");
// No native assert: extract a boolean, then assert on it with the host runner.
const { ok } = await page.extract({
  instruction: "Is the dashboard chart visible and not cut off?",
  schema: z.object({ ok: z.boolean() }),
});
expect(ok).toBe(true);

await stagehand.close();
```

Momentic, agentic simplified format:

```yaml
fileType: momentic/test/v2
id: sign-in-and-verify
url: https://app.example.com
steps:
  - act: Sign in with ada@example.com / secret
  - assert: The dashboard chart is visible and not cut off
```

Momentic, explicit simplified format (same flow, step-by-step):

```yaml
fileType: momentic/test/v2
id: sign-in-and-verify
url: https://app.example.com
steps:
  - type:
      text: ada@example.com
      into: Email
  - type:
      text: secret
      into: Password
  - click: Sign in
  - assert: The dashboard chart is visible and not cut off
```

## A more realistic test

The hello-world above doesn't exercise the full simplified format surface. A
representative checkout regression with module reuse, parameter inputs, typed
extraction, and a conditional looks like this:

```yaml checkout.test.yaml
fileType: momentic/test/v2
id: checkout-with-promo
url: https://shop.example.com
steps:
  - module:
      path: ../modules/sign-in.module.yaml
      inputs:
        EMAIL: env.QA_EMAIL
        PASSWORD: env.QA_PASSWORD
  - act: Add the Tetris Eye Sweatshirt (size M) to the cart
  - navigate: https://shop.example.com/checkout
  - type:
      text: "{{ env.PROMO_CODE }}"
      into: Promo code field
  - click: Apply
  - if:
      assert: A success banner saying the promo was applied is visible
      then:
        - extract:
            goal: The discounted subtotal in the order summary
            schema:
              type: object
              properties:
                amount:
                  type: number
              required: [amount]
  - if:
      assert: An invalid-promo error is visible
      then:
        - assert: The subtotal is unchanged
  - assertVisually: The order summary section is fully visible and not cut off
```

The matching module:

```yaml ../modules/sign-in.module.yaml
fileType: momentic/module/v2
id: sign-in
name: Sign in
parameters:
  - name: EMAIL
  - name: PASSWORD
steps:
  - type:
      text: "{{ env.EMAIL }}"
      into: Email
  - type:
      text: "{{ env.PASSWORD }}"
      into: Password
  - click: Sign in
  - assert: The dashboard chart is visible and not cut off
```

<Accordion title="Technical details, MCP integration">
  Both ship MCP servers for coding agents. The integration shape is different.

  Momentic's MCP server drives the agent through a preview-then-commit loop: the
  agent calls `momentic_preview_step` to verify a candidate step against the live
  page, then `momentic_run_step` to commit it. Authoring tools let the agent edit
  a slice of a saved test rather than rewriting the whole file. See the
  [MCP server docs](/integrations/mcp-server) for the full tool surface.

  Stagehand MCP exposes `act`, `observe`, `extract`, `agent` against a live page.
  The agent runs the flow once from memory and either keeps the snippet or
  rewrites it. There's no per-step preview / commit loop.
</Accordion>
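
The preview-then-commit loop described in the MCP details can be sketched from the agent's side. Only the tool names `momentic_preview_step` and `momentic_run_step` come from the text above. The `CallTool` signature, the `{ step }` argument, and the `{ ok }` result are assumptions made for this example; the real tool schemas are in the MCP server docs.

```ts
// Generic tool caller; a real MCP client would send these calls over the protocol.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<{ ok: boolean }>;

// Previews each candidate step and commits the first one that verifies.
async function previewThenCommit(
  callTool: CallTool,
  candidates: string[],
): Promise<string | null> {
  for (const step of candidates) {
    const preview = await callTool("momentic_preview_step", { step });
    if (!preview.ok) continue; // candidate did not verify against the live page
    const committed = await callTool("momentic_run_step", { step });
    if (committed.ok) return step;
  }
  return null; // nothing verified; the agent should propose new candidates
}

// Fake tool backend for the example: only "click: Sign in" previews successfully.
const fakeCallTool: CallTool = async (name, args) => ({
  ok: name === "momentic_run_step" || args.step === "click: Sign in",
});

previewThenCommit(fakeCallTool, ["click: Log in", "click: Sign in"]).then((step) => {
  console.log(step); // click: Sign in
});
```

The failed preview costs one tool call and leaves the saved test untouched, which is the practical benefit of separating preview from commit.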

## When to pick which

**Stagehand is the right call if** you already have a TypeScript Playwright test
suite you want to keep, you want programmatic control of every step, you're
already on Browserbase, and your AI use cases are narrow (one or two `act`s
sprinkled into mostly-deterministic Playwright code).

**Momentic is the right call if** wall-clock run time matters at scale, you need
AI assertions and visual diffs as first-class primitives, selector maintenance
is a real recurring cost, and you expect healing, recovery, quarantine, and a
managed dashboard built in rather than as bring-your-own components.

For the build-it-yourself version of this decision, see
[Build vs. buy](/comparisons/build-vs-buy).
