archivebox.cli.archivebox_run

archivebox run [–daemon] [–crawl-id=…] [–snapshot-id=…] [–binary-id=…]

Unified command for processing queued work on the shared abx-dl bus.

Modes: - With stdin JSONL: Process piped records, exit when complete - Without stdin (TTY): Run the background runner in foreground until killed - –crawl-id: Run the crawl runner for a specific crawl only - –snapshot-id: Run a specific snapshot through its parent crawl - –binary-id: Emit a BinaryRequestEvent for a specific Binary row

Examples: # Run the background runner in foreground archivebox run

# Run as daemon (don't exit on idle)
archivebox run --daemon

# Process specific records (pipe any JSONL type, exits when done)
archivebox snapshot list --status=queued | archivebox run
archivebox archiveresult list --status=failed | archivebox run
archivebox crawl list --status=queued | archivebox run

# Mixed types work too
cat mixed_records.jsonl | archivebox run

# Run the crawl runner for a specific crawl
archivebox run --crawl-id=019b7e90-04d0-73ed-adec-aad9cfcd863e

# Run one snapshot from an existing crawl
archivebox run --snapshot-id=019b7e90-5a8e-712c-9877-2c70eebe80ad

# Run one queued binary install directly on the bus
archivebox run --binary-id=019b7e90-5a8e-712c-9877-2c70eebe80ad

Module Contents

Functions

process_stdin_records

Process JSONL records from stdin.

run_runner

Run the background runner loop.

main

Process queued work.

run_snapshot_worker

Data

__command__

API

archivebox.cli.archivebox_run.__command__[source]

‘archivebox run’

archivebox.cli.archivebox_run.process_stdin_records() int[source]

Process JSONL records from stdin.

Create-or-update behavior:

  • Records WITHOUT id: Create via Model.from_json(), then queue

  • Records WITH id: Lookup existing, re-queue for processing

Outputs JSONL of all processed records (for chaining).

Handles any record type: Crawl, Snapshot, ArchiveResult. Auto-cascades: Crawl → Snapshots → ArchiveResults.

Returns exit code (0 = success, 1 = error).

archivebox.cli.archivebox_run.run_runner(daemon: bool = False) int[source]

Run the background runner loop.

Args: daemon: Run forever (don’t exit when idle)

Returns exit code (0 = success, 1 = error).

archivebox.cli.archivebox_run.main(daemon: bool, crawl_id: str, snapshot_id: str, binary_id: str)[source]

Process queued work.

Modes:

  • No args + stdin piped: Process piped JSONL records

  • No args + TTY: Run the crawl runner for all work

  • –crawl-id: Run the crawl runner for that crawl only

  • –snapshot-id: Run one snapshot through its crawl only

  • –binary-id: Run one queued binary install directly on the bus

archivebox.cli.archivebox_run.run_snapshot_worker(snapshot_id: str) int[source]