archivebox.cli.archivebox_run
archivebox run [–daemon] [–crawl-id=…] [–snapshot-id=…] [–binary-id=…]
Unified command for processing queued work on the shared abx-dl bus.
Modes: - With stdin JSONL: Process piped records, exit when complete - Without stdin (TTY): Run the background runner in foreground until killed - –crawl-id: Run the crawl runner for a specific crawl only - –snapshot-id: Run a specific snapshot through its parent crawl - –binary-id: Emit a BinaryRequestEvent for a specific Binary row
Examples: # Run the background runner in foreground archivebox run
# Run as daemon (don't exit on idle)
archivebox run --daemon
# Process specific records (pipe any JSONL type, exits when done)
archivebox snapshot list --status=queued | archivebox run
archivebox archiveresult list --status=failed | archivebox run
archivebox crawl list --status=queued | archivebox run
# Mixed types work too
cat mixed_records.jsonl | archivebox run
# Run the crawl runner for a specific crawl
archivebox run --crawl-id=019b7e90-04d0-73ed-adec-aad9cfcd863e
# Run one snapshot from an existing crawl
archivebox run --snapshot-id=019b7e90-5a8e-712c-9877-2c70eebe80ad
# Run one queued binary install directly on the bus
archivebox run --binary-id=019b7e90-5a8e-712c-9877-2c70eebe80ad
Module Contents
Functions
Process JSONL records from stdin. |
|
Run the background runner loop. |
|
Process queued work. |
|
Data
API
- archivebox.cli.archivebox_run.process_stdin_records() int[source]
Process JSONL records from stdin.
Create-or-update behavior:
Records WITHOUT id: Create via Model.from_json(), then queue
Records WITH id: Lookup existing, re-queue for processing
Outputs JSONL of all processed records (for chaining).
Handles any record type: Crawl, Snapshot, ArchiveResult. Auto-cascades: Crawl → Snapshots → ArchiveResults.
Returns exit code (0 = success, 1 = error).
- archivebox.cli.archivebox_run.run_runner(daemon: bool = False) int[source]
Run the background runner loop.
Args: daemon: Run forever (don’t exit when idle)
Returns exit code (0 = success, 1 = error).
- archivebox.cli.archivebox_run.main(daemon: bool, crawl_id: str, snapshot_id: str, binary_id: str)[source]
Process queued work.
Modes:
No args + stdin piped: Process piped JSONL records
No args + TTY: Run the crawl runner for all work
–crawl-id: Run the crawl runner for that crawl only
–snapshot-id: Run one snapshot through its crawl only
–binary-id: Run one queued binary install directly on the bus