Running
This page covers running The Test Cabinet locally — on your own machine, for development or to exercise the whole flow end to end. Two shapes of “running” are worth separating, because they need very different amounts of setup:
- A single run, driven by the CLI (tcab) or the Tauri desktop app. Both embed the core runner directly, so they need no backend or worker process, just a container runtime and a harness API key. This is the fastest way to launch one run; the quickstarts walk through it and Building covers producing the binaries.
- The full service-driven flow: the backend, a worker, and the web console running as their own processes, exactly as a deployed environment runs them, just all on localhost. This is the environment to reach for when developing or debugging the services themselves, and it is what the rest of this page sets up.
Running the services on one machine is the local mirror of a real
deployment: the same binaries and the same configuration,
only bound to localhost. When you are ready to put them on real hosts — staging
and prod — see Deployment.
Prerequisites
- A container runtime (Docker or Podman) on the host; the worker needs it to execute runs. See Execution and first-time setup.
- The harness container images built or pullable for whichever harness you intend to run.
- The two service binaries, built per Building: cargo build -p test-cabinet-backend and cargo build -p test-cabinet-worker (or the build-portable-* aliases for a static binary). The web console is a Vite app under apps/web.
- A harness API key for the harness you will run (for example ANTHROPIC_API_KEY for claude).
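The checklist above can be turned into a quick preflight. The following is a minimal sketch, not part of the repository: the function names and the BACKEND_BIN / WORKER_BIN override variables are hypothetical, and the default binary paths are simply the debug paths used later on this page.

```shell
#!/usr/bin/env sh
# Print the first container runtime found on PATH (docker first, then podman).
# Returns non-zero when neither is installed.
find_runtime() {
  for candidate in docker podman; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Report missing prerequisites on stderr; the return status is the number
# of problems found (0 means everything checked here is present).
# BACKEND_BIN and WORKER_BIN are hypothetical overrides for this sketch.
preflight() {
  problems=0
  if runtime=$(find_runtime); then
    echo "runtime: $runtime"
  else
    echo "missing: a container runtime (docker or podman)" >&2
    problems=$((problems + 1))
  fi
  for bin in "${BACKEND_BIN:-./target/debug/tcab-backend}" \
             "${WORKER_BIN:-./target/debug/tcab-worker}"; do
    if [ ! -x "$bin" ]; then
      echo "missing: $bin (see Building)" >&2
      problems=$((problems + 1))
    fi
  done
  # Only the claude harness key is checked; swap in the key for your harness.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "missing: ANTHROPIC_API_KEY" >&2
    problems=$((problems + 1))
  fi
  return "$problems"
}
```

Calling preflight from the repository root before starting the services gives one line per missing item. It does not check that the harness images are built or pullable.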
Why the worker runs on the host
A natural instinct is to put everything in one docker compose stack. The
backend is happy in a container, but the worker starts a container per run,
so containerizing it means giving it access to the host’s container runtime
(bind-mounting the Docker socket) and ensuring the run’s
work directory is a path the host shares — the
nested run containers are started by the host’s daemon, so TCAB_WORK_DIR must
resolve to the same path on the host, not just inside the worker container.
That is the same caveat the
.env.worker.example
flags for macOS/Windows.
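To make the two requirements concrete, here is a hedged sketch of what a containerized worker would need. It only prints a docker run command rather than executing it. The image name tcab-worker:local is hypothetical, /var/run/docker.sock is Docker's default socket path on Linux, and networking is left out entirely.

```shell
# Print (do not run) a docker command for a containerized worker.
# usage: worker_container_cmd WORK_DIR [IMAGE]
worker_container_cmd() {
  work_dir=$1
  image=${2:-tcab-worker:local}   # hypothetical image name, for illustration
  printf 'docker run --rm'
  # Give the worker access to the host's container runtime.
  printf ' -v /var/run/docker.sock:/var/run/docker.sock'
  # Mount the work directory at the SAME path inside and outside, because
  # the host daemon resolves run-container mounts against host paths.
  printf ' -v %s:%s' "$work_dir" "$work_dir"
  printf ' -e TCAB_WORK_DIR=%s' "$work_dir"
  printf ' --env-file .env.worker'
  printf ' %s\n' "$image"
}
```

The identical source and destination in the second mount is the whole point: a work directory that exists only inside the worker container would be invisible to the daemon that starts the run containers.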
To keep the moving parts obvious, run the worker directly on the host and
only optionally containerize the backend. The
deployments/local/compose.yml
template brings the backend up in a container with a local volume for its state;
the worker stays a host process throughout.
1. Configure the services
Copy the repo-root example env files and fill them in. These remain the authoritative list of every variable each service reads.
    cp .env.backend.example .env.backend
    cp .env.worker.example .env.worker
In .env.backend, the only required value is the checkout the backend ingests definitions from. Point it at this repository:
    TCAB_BACKEND_CHECKOUT=/absolute/path/to/the-test-cabinet
    # Leave TCAB_BACKEND_BIND at its default 127.0.0.1:8787 for local use.
    # Leave TCAB_BACKEND_DATABASE_URL unset to use the default local SQLite file.
    # R2 + deploy-hook variables can stay blank: with them unset the backend still
    # records to its database and regenerates the snapshot on disk (a dev-only mode).
In .env.worker, point the worker at the local backend and provide the harness key for whatever you will run:
    TCAB_BACKEND_URL=http://127.0.0.1:8787
    # Leave TCAB_WORKER_BIND at its default 127.0.0.1:8788.
    ANTHROPIC_API_KEY=sk-ant-...
2. Start the backend
Either run the binary directly from a directory containing .env.backend:
    ./target/debug/tcab-backend
or bring it up with the compose template, which mounts a local volume for the default SQLite database and the definition store so they survive a restart:
    docker compose -f deployments/local/compose.yml up backend
Once it is up, ingest the repository so the catalog is populated:
    curl -X POST http://127.0.0.1:8787/ingest
Confirm it is serving with curl http://127.0.0.1:8787/healthz and curl http://127.0.0.1:8787/test-cases.
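When scripting this step, the ingest call can race the backend's startup. A minimal sketch of a wait-then-ingest helper follows; the function names, attempt count, and delay are assumptions of this example, while the URLs are the ones used above.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses.
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Wait for the backend's health endpoint, then trigger ingestion.
ingest_when_ready() {
  backend=${1:-http://127.0.0.1:8787}
  if ! wait_for_http "$backend/healthz"; then
    echo "backend did not become healthy at $backend" >&2
    return 1
  fi
  curl -fsS -X POST "$backend/ingest"
}
```

Run ingest_when_ready right after starting the backend in the background or in another terminal. With the defaults it gives up after roughly 30 seconds.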
3. Start the worker
From a directory containing .env.worker, on the host:
    ./target/debug/tcab-worker
It reads TCAB_BACKEND_URL, resolves definitions from the backend you just started, and binds 127.0.0.1:8788. Check curl http://127.0.0.1:8788/healthz: the response reports the worker’s identity and the backend it is bound to, which is a quick way to confirm the two agree.
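That agreement check can be scripted. The exact shape of the healthz response is not described here, so this sketch makes only one assumption, which may not hold: that the backend URL appears verbatim somewhere in the response body. The function names are hypothetical.

```shell
# Succeed when BODY contains BACKEND_URL as a fixed string.
# usage: mentions_backend BODY BACKEND_URL
mentions_backend() {
  printf '%s' "$1" | grep -qF -- "$2"
}

# Fetch the worker's healthz response and check it names the expected backend.
# Assumes the URL is reported verbatim; adjust if the real format differs.
check_worker_binding() {
  worker=${1:-http://127.0.0.1:8788}
  backend=${2:-http://127.0.0.1:8787}
  response=$(curl -fsS "$worker/healthz") || return 1
  mentions_backend "$response" "$backend"
}
```

A non-zero status from check_worker_binding means either the worker is not reachable or its response does not mention the backend URL. Inspect the raw response before concluding the two disagree.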
4. Start the web console
Run the console’s dev server and open it in a browser:
    npm run dev -w @test-cabinet/web
In the UI, set the backend to http://127.0.0.1:8787 and add the worker at
http://127.0.0.1:8788. The console verifies the worker is bound to the same
backend before it will launch runs on it. From there you can launch a run on the
local worker, watch its event stream live, and review
the result exactly as you would against a remote environment.
Telemetry (optional)
To watch traces across tcab-backend → tcab-worker locally, enable the
bundled Grafana LGTM stack and point each process at it. That is fully described
under Observability — note the
endpoint-duality rule:
a backend running inside the devcontainer uses http://lgtm:4318, while a worker
on the host uses http://localhost:4318. Leaving OTEL_EXPORTER_OTLP_ENDPOINT
unset keeps both on plain stdout logging.
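The endpoint choice depends only on where the process runs, which a small helper can capture. This is an illustrative sketch: the function name and the location labels devcontainer and host are made up for this example, and the two URLs are the ones quoted above.

```shell
# Map where a process runs to the OTLP endpoint it should export to.
# usage: otlp_endpoint_for devcontainer|host
otlp_endpoint_for() {
  case "$1" in
    devcontainer) echo "http://lgtm:4318" ;;       # reaches the stack by service name
    host)         echo "http://localhost:4318" ;;  # reaches it via the published port
    *)            echo "unknown location: $1" >&2; return 1 ;;
  esac
}
```

For a host worker, one way to use it is OTEL_EXPORTER_OTLP_ENDPOINT=$(otlp_endpoint_for host) ./target/debug/tcab-worker, which sets the variable for that process only.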
When this works end to end, the same two binaries deploy unchanged to staging and prod on Azure — what changes is where they bind and how they are supervised, not how they are configured. See Deployment for the remote build.