Project № III · Case study

Zurf — a web for agents, not for humans

A CLI toolkit that gives AI agents a single, legible interface to the web: search, fetch, browse, ask, transcribe. Built because the tools humans use to read the internet are the wrong shape for something that reads a hundred pages in a minute.

I. Problem

Agents keep being handed a browser designed for a person.

The web an agent sees through a naive fetch is a different web from the one a person sees through Chrome. JavaScript-heavy pages return empty shells; search engines return HTML meant for a human's eye; video platforms return player chrome and nothing else. The workaround, in most agent codebases, is a small pile of bespoke scrapers that each fail in their own interesting way.

I wanted one surface. A single CLI an agent could reach for whenever it needed to know something about the outside world — and a single JSON shape coming back, regardless of whether the answer was found in a search result, a rendered page, or a video transcript.
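A minimal sketch of what "a single JSON shape" could look like, in Python. The class name `WebResult` and every field name here are hypothetical, chosen for illustration; the source does not describe Zurf's actual schema.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class WebResult:
    """Hypothetical normalized result; the field names are illustrative only."""

    verb: str    # which command produced it, e.g. "search" or "browse"
    source: str  # the query or URL the text came from
    text: str    # the content itself, as markdown
    citations: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the result as plain markdown for a model to read."""
        lines = [f"# {self.source}", "", self.text]
        if self.citations:
            lines += ["", "Sources:"] + [f"- {c}" for c in self.citations]
        return "\n".join(lines)

    def to_json(self) -> str:
        """Render the same result as JSON for programmatic use."""
        return json.dumps(asdict(self), ensure_ascii=False)


# Two very different origins, one shape coming back.
results = [
    WebResult("search", "agent tooling", "1. Example result"),
    WebResult("transcribe", "https://example.com/video", "Hello and welcome."),
]
for result in results:
    print(result.to_json())
```

Because every result carries the same keys, the code (or prompt) that consumes one kind of result can consume the others unchanged.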

The goal was not to build a better browser. It was to stop asking agents to pretend they were people.

II. Approach

A handful of verbs, markdown by default, three providers behind the curtain.

Zurf exposes a small vocabulary — search, fetch, browse, ask, transcribe — and hides the provider choice behind each. The default output is markdown, because markdown is what a language model reads cheaply; --json is there when an agent wants to consume the result programmatically.

  • Search returns URLs, titles, and snippets in a normalized list, with quick and deep modes mapped onto Perplexity's Sonar tiers.
  • Fetch is the lightweight path for static pages — capped at a megabyte, no browser, no rendering tax.
  • Browse is the heavy path: it renders JavaScript-heavy pages through a Browserbase session and flattens the result.
  • Ask routes through Sonar for questions that want citations rather than raw pages.
  • Transcribe pulls transcripts from YouTube, TikTok, Instagram, X, and Facebook through Supadata — native captions first, AI fallback second.
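Two of the behaviours in the list above are small mechanisms that can be sketched directly: the size cap on fetch and the "native captions first, AI fallback second" ordering in transcribe. The sketch below is in Python and makes several assumptions: the function names are hypothetical, the cap is read as 1,000,000 bytes, and an oversized body is truncated rather than rejected (the source does not say which Zurf does). The two transcript sources are stand-ins, not real provider calls.

```python
from __future__ import annotations

import io
from typing import BinaryIO, Callable, Optional

MAX_BYTES = 1_000_000  # assumed reading of "a megabyte"; 1024 * 1024 is equally plausible


def read_capped(stream: BinaryIO, limit: int = MAX_BYTES) -> tuple[bytes, bool]:
    """Read at most `limit` bytes and report whether the body was truncated."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized body
    return data[:limit], len(data) > limit


def first_transcript(
    url: str, sources: list[Callable[[str], Optional[str]]]
) -> Optional[str]:
    """Try each transcript source in order; return the first non-empty result."""
    for source in sources:
        text = source(url)
        if text:
            return text
    return None


# Stand-in sources for illustration only.
def native_captions(url: str) -> Optional[str]:
    return None  # pretend this video has no published captions


def ai_transcription(url: str) -> Optional[str]:
    return "transcript generated by a speech model"


body, truncated = read_capped(io.BytesIO(b"x" * 10), limit=4)
print(body, truncated)
print(first_transcript("https://example.com/v", [native_captions, ai_transcription]))
```

Passing the sources as an ordered list keeps the fallback policy in one place, so the preference order can change without touching the callers.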

A small skill layer teaches coding agents how to use the commands progressively, so an agent does not have to hold the full CLI surface in its context to do useful work. The hardest part was not the integrations; it was deciding what not to ship — resisting every provider flag and every optional header. The useful surface turned out to be small.

III. Outcome

A tool I reach for before I reach for curl.

5 · Verbs, one output shape
5 · Video platforms transcribable
1 · CLI replacing a folder of scrapers

Zurf is now the first thing the agents I work with use when they need to touch the web. The uniformity of the output means the same prompt handles a search result, a rendered page, and a video transcript; the agent stops caring where the text came from and starts caring what it says.

The side effect I did not expect was on my own workflow. Having a single command for "read the web" turned out to be useful for humans too, and a non-trivial amount of my own research now goes through it.

IV. Retrospective

The right abstraction is the boring one.

The first drafts of Zurf tried to be clever — a unified query command that guessed whether the user wanted search, a page, or an answer. It was a bad idea dressed as a good one. Agents do better with verbs they can choose deliberately than with a single magic endpoint that sometimes does the wrong thing for reasons they cannot inspect.
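The difference between explicit verbs and a guessing endpoint can be sketched with Python's `argparse` subcommands. This is an illustration, not Zurf's implementation: the program name `zurf-sketch` and the positional argument `target` are hypothetical, and only the five verbs and the `--json` flag come from the description of the tool.

```python
import argparse

VERBS = ("search", "fetch", "browse", "ask", "transcribe")


def build_parser() -> argparse.ArgumentParser:
    """One subcommand per verb; nothing tries to infer intent from the input."""
    parser = argparse.ArgumentParser(prog="zurf-sketch")
    sub = parser.add_subparsers(dest="verb", required=True)
    for verb in VERBS:
        cmd = sub.add_parser(verb)
        cmd.add_argument("target", help="query, question, or URL")
        cmd.add_argument(
            "--json", action="store_true", help="emit JSON instead of markdown"
        )
    return parser


args = build_parser().parse_args(["browse", "https://example.com", "--json"])
print(args.verb, args.target, args.json)  # prints: browse https://example.com True
```

With this layout a caller has to name the verb, and an unknown verb fails loudly at parse time instead of being silently routed somewhere unexpected.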

The second lesson was about output. Returning markdown for browse felt unambitious next to returning a structured DOM; in practice, markdown is what the model wants, and the structured alternatives were a cost the agent paid for no benefit. Meet the consumer where it lives.

Still adding providers. Still cutting flags.