Agents That Film Their Own Work: The Security Read on shot-scraper video
Simon Willison's shot-scraper 1.10 lets coding agents record video "proof" of browser-driven work using Playwright's new screencast API — a convenience that quietly expands the credential and trust surface security teams need to govern.
Key Takeaways
- shot-scraper 1.10's new `video` command takes a `storyboard.yml` routine and uses Playwright's screencast API to record an agent's browser session as evidence of completed work.
- The underlying capability comes from Playwright 1.59's "agentic video receipts" feature, built explicitly so coding agents can produce video walkthroughs for human review after finishing a task.
- shot-scraper's authentication model — JSON cookie files fed into an automated routine — means agents are increasingly handling live session credentials, not just static code.
- Video output is treated as evidence, but it is itself an artifact generated by the same agent it's meant to verify, which is a governance gap worth designing around rather than assuming away.
Simon Willison's shot-scraper is a small, well-known Playwright wrapper for taking automated screenshots of web pages. In the 1.10 release he added a `video` command: feed it a `storyboard.yml` file describing a sequence of scenes (clicks, form fills, waits, viewport settings, even server startup) and it drives a real browser session and records the routine as a WebM or MP4 file. The schema is validated with Pydantic, and the demo storyboard he used to show it off was, notably, written entirely by GPT-5.5 xhigh running inside Codex Desktop.
Why this isn't just a dev-tooling curiosity
The capability shot-scraper builds on is Playwright 1.59's screencast API, which Microsoft shipped this April with a section explicitly titled agentic video receipts: "Coding agents can produce video evidence of their work. After completing a task, an agent can record a walkthrough video with rich annotations for human review." That framing matters. Playwright 1.59 is, by its own release notes, the first version designed around AI agents driving the browser end to end — not a human running tests, but an autonomous process operating a real browser against a real application and then narrating its own actions back to a human reviewer.
The credential and trust surface that comes with it
Two things stand out from a security perspective. First, shot-scraper video supports authenticating its routines via JSON cookie files, so an agent recording a demo of a login-gated feature needs a live session credential available to the automation. That's a small but real expansion of where session material lives and how it flows — particularly if storyboards and cookie files end up checked into a repo, attached to a CI job, or handed to an agent with broader filesystem or network access than the task strictly requires.
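One way to make "where session material lives" auditable is to lint cookie files before an agent is allowed to use them. The sketch below is illustrative only: it assumes a Playwright-style export in which each cookie is a dict with `name`, `expires` (Unix seconds, with `-1` for session cookies), `secure` and `httpOnly` fields, stored either as a bare list or under a top-level `cookies` key. The function names and the one-hour threshold are hypothetical, not part of shot-scraper.

```python
import json
import time


def load_cookies(path):
    """Load cookies from a JSON file (assumed shape: list, or {"cookies": [...]})."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return data["cookies"] if isinstance(data, dict) else data


def audit_cookies(cookies, now=None, max_lifetime_seconds=3600):
    """Return (cookie_name, problem) tuples for cookies that look risky."""
    now = time.time() if now is None else now
    findings = []
    for cookie in cookies:
        name = cookie.get("name", "<unnamed>")
        expires = cookie.get("expires", -1)
        # Assumption: -1 (or a missing field) marks a session cookie with no fixed expiry.
        if expires is not None and expires > 0:
            if expires - now > max_lifetime_seconds:
                findings.append((name, "long-lived"))
        if not cookie.get("secure", False):
            findings.append((name, "not marked Secure"))
        if not cookie.get("httpOnly", False):
            findings.append((name, "not marked HttpOnly"))
    return findings
```

A check like this could run in CI or a pre-commit hook and fail the job when `audit_cookies(load_cookies(path))` returns anything, which keeps long-lived session material from quietly travelling with a storyboard.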
Second, and less obvious: the video itself is presented as evidence, but it's produced by the same agent whose work it's meant to verify. An agent that misrepresents what it built, glosses over a broken edge case, or — adversarially — is manipulated by a prompt injection in the page content it's recording, can produce a polished, convincing walkthrough that still doesn't reflect reality. That's not a flaw specific to shot-scraper; it's the general shape of the AI agent verification problem, and it's worth naming explicitly rather than letting a slick video substitute for actual review.
In practice, that points to a few controls:
- Treat agent-held session cookies like any other credential — scoped, short-lived, and excluded from anything an agent might commit or upload alongside a video artifact.
- Don't let a generated video replace independent verification — it's a useful artifact for human review, not a substitute for tests, diffing, or a second reviewer checking the actual change.
- Watch what's on screen — an automated walkthrough of an authenticated app can capture PII, internal data, or tokens visible in the UI, which then sits in a video file with its own retention and access questions.
- Scope what the recording agent can reach — a process driving a real browser with real cookies is a process that can also be steered by anything injected into the pages it visits.
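The last point, scoping what the recording agent can reach, can be approximated with a host allowlist applied to the URLs a routine intends to visit. This is a minimal sketch under stated assumptions: it takes a plain list of URLs (how those are extracted from a storyboard is left open), and the function names and example hosts are hypothetical. It is a pre-flight check, not a replacement for network-level isolation.

```python
from urllib.parse import urlsplit


def is_allowed(url, allowed_hosts):
    """True only for http(s) URLs whose host is in the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return host in allowed_hosts


def disallowed_urls(urls, allowed_hosts):
    """Return the URLs a routine should not be permitted to visit."""
    allowed = {host.lower() for host in allowed_hosts}
    return [url for url in urls if not is_allowed(url, allowed)]
```

For example, `disallowed_urls(planned_urls, {"localhost", "staging.example.com"})` returns every planned URL that falls outside those two hosts, and an empty result means the routine stays in scope.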
The bigger pattern
This sits squarely inside the trend organisations are already grappling with under frameworks like ISO 42001: agentic tooling is moving from generating text to taking real, credentialed actions against real systems, and producing its own evidence trail along the way. Tools like shot-scraper video are a genuine productivity win — visual proof that a feature works is far better than a changelog line — but the governance question is whether agent-produced evidence is being trusted at face value or checked the way any other unverified input would be.
Frequently Asked Questions
What does shot-scraper's new `video` command actually do?
It reads a `storyboard.yml` file describing a sequence of browser actions (clicks, fills, waits) and uses Playwright's screencast API to record that routine as a WebM or MP4 video, giving coding agents a way to produce visual demos of completed work.
Does this introduce a new security risk on its own?
Not a vulnerability in the traditional sense, but it does expand where session credentials live — shot-scraper supports authenticating recorded routines with JSON cookie files — and it normalises trusting agent-generated video as proof of work without independent verification.
Is this related to Playwright's own agent features?
Yes. The recording capability shot-scraper uses comes from Playwright 1.59's screencast API, which Microsoft describes as enabling "agentic video receipts" — coding agents recording walkthrough videos of their own work for human review.
Sources
- Have your agent record video demos of its work with shot-scraper video (Simon Willison)
- Release v1.59.0 (Microsoft / Playwright)
- Release notes (Playwright)