archivebox.cli.archivebox_snapshot

archivebox snapshot [args...] [--filters]

Manage Snapshot records.

Actions:
    create - Create Snapshots from URLs or Crawl JSONL
    list - List Snapshots as JSONL (with optional filters)
    update - Update Snapshots from stdin JSONL
    delete - Delete Snapshots from stdin JSONL

Examples:

# Create
archivebox snapshot create https://example.com --tag=news
archivebox crawl create https://example.com | archivebox snapshot create

# List with filters
archivebox snapshot list --status=queued
archivebox snapshot list --url__icontains=example.com

# Update
archivebox snapshot list --tag=old | archivebox snapshot update --tag=new

# Delete
archivebox snapshot list --url__icontains=spam.com | archivebox snapshot delete --yes

Module Contents

Functions

create_snapshots

Create Snapshots from URLs or stdin JSONL (Crawl or Snapshot records). Pass-through: Records that are not Crawl/Snapshot/URL are output unchanged.

snapshot_filter_options

snapshot_output_options

build_snapshot_queryset

list_snapshots

List Snapshots as JSONL with optional filters.

update_snapshots

Update Snapshots from stdin JSONL.

delete_snapshots

Delete Snapshots from stdin JSONL.

main

Manage Snapshot records.

create_cmd

Create Snapshots from URLs or stdin JSONL.

list_cmd

List Snapshots as JSONL.

update_cmd

Update Snapshots from stdin JSONL.

delete_cmd

Delete Snapshots from stdin JSONL.

Data

__command__

SNAPSHOT_FILTER_TYPE_CHOICES

SNAPSHOT_LIST_CHUNK_SIZE

API

archivebox.cli.archivebox_snapshot.__command__

'archivebox snapshot'

archivebox.cli.archivebox_snapshot.SNAPSHOT_FILTER_TYPE_CHOICES

('exact', 'substring', 'regex', 'domain', 'tag', 'timestamp')

archivebox.cli.archivebox_snapshot.SNAPSHOT_LIST_CHUNK_SIZE

100

archivebox.cli.archivebox_snapshot.create_snapshots(urls: collections.abc.Iterable[str], tag: str = '', status: str = 'queued', depth: int = 0, created_by_id: int | None = None) -> int

Create Snapshots from URLs or stdin JSONL (Crawl or Snapshot records). Pass-through: Records that are not Crawl/Snapshot/URL are output unchanged.

Exit codes:
    0: Success
    1: Failure
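The pass-through behavior described above can be illustrated with a minimal sketch. This is not ArchiveBox's implementation: the function name `split_records`, the `"type"` key, and the exact type names are assumptions made for illustration, since this reference does not specify the record schema.

```python
import json
from typing import Iterable, Iterator

# Assumed record type names; the real JSONL schema is not given in this reference.
HANDLED_TYPES = {"Crawl", "Snapshot"}


def split_records(lines: Iterable[str]) -> Iterator[tuple[str, object]]:
    """Route each input line to "handle" or "passthrough" (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(("http://", "https://")):
            # A bare URL is something a Snapshot could be created from.
            yield "handle", {"url": line}
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Not a URL and not JSON: emit it unchanged.
            yield "passthrough", line
            continue
        if isinstance(record, dict) and record.get("type") in HANDLED_TYPES:
            yield "handle", record
        else:
            # Any other record type is emitted unchanged.
            yield "passthrough", record


demo = [
    "https://example.com",
    '{"type": "Snapshot", "url": "https://example.org"}',
    '{"type": "Tag", "name": "news"}',
]
routed = list(split_records(demo))
```

Keeping unrecognized records in the output stream is what lets several `archivebox` commands be chained with pipes without one stage discarding data meant for a later stage.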

archivebox.cli.archivebox_snapshot.snapshot_filter_options(*, default_filter_type: str)

archivebox.cli.archivebox_snapshot.snapshot_output_options(func)

archivebox.cli.archivebox_snapshot.build_snapshot_queryset(**kwargs) -> django.db.models.QuerySet

archivebox.cli.archivebox_snapshot.list_snapshots(csv: str | None = None, as_json: bool = False, as_html: bool = False, with_headers: bool = False, **kwargs) -> int

List Snapshots as JSONL with optional filters.

Exit codes:
    0: Success (even if no results)

archivebox.cli.archivebox_snapshot.update_snapshots(status: str | None = None, tag: str | None = None) -> int

Update Snapshots from stdin JSONL.

Reads Snapshot records from stdin and applies updates. Uses PATCH semantics: only the fields you specify are updated, and all other fields are left unchanged.

Exit codes:
    0: Success
    1: No input or error
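PATCH semantics can be sketched in a few lines. The helper `apply_patch` below is hypothetical and operates on plain dictionaries rather than real Snapshot objects. It assumes that `None` means "field not specified", which matches the `None` defaults in the signature above but is not confirmed by this reference.

```python
from typing import Any


def apply_patch(record: dict[str, Any], **updates: Any) -> dict[str, Any]:
    """Return a copy of `record` with only the non-None updates applied."""
    # Assumption: a value of None means the caller did not specify that field.
    changes = {key: value for key, value in updates.items() if value is not None}
    return {**record, **changes}


snapshot = {"url": "https://example.com", "status": "queued", "tag": "old"}

# Only `tag` is specified, so `status` and `url` keep their current values.
patched = apply_patch(snapshot, status=None, tag="new")
```

Under this sketch, an invocation such as `archivebox snapshot update --tag=new` would correspond to calling the helper with `tag="new"` and `status=None`.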

archivebox.cli.archivebox_snapshot.delete_snapshots(yes: bool = False, dry_run: bool = False) -> int

Delete Snapshots from stdin JSONL.

Requires the --yes flag to confirm deletion.

Exit codes:
    0: Success
    1: No input or missing --yes flag
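The confirmation guard and exit codes above can be modeled as a small decision function. `plan_delete` is a hypothetical name, and two details are assumptions rather than documented behavior: that a dry run is checked before the --yes requirement, and that a dry run exits with 0.

```python
def plan_delete(record_count: int, yes: bool = False, dry_run: bool = False) -> tuple[int, str]:
    """Return (exit_code, action) for a delete request without touching any data."""
    if record_count == 0:
        # Nothing arrived on stdin.
        return 1, "no input"
    if dry_run:
        # Assumption: a dry run only reports and does not need confirmation.
        return 0, f"would delete {record_count} snapshot(s)"
    if not yes:
        # Deletion is destructive, so it is refused without explicit confirmation.
        return 1, "refusing to delete without --yes"
    return 0, f"delete {record_count} snapshot(s)"


unconfirmed = plan_delete(3)
confirmed = plan_delete(3, yes=True)
preview = plan_delete(3, dry_run=True)
```

Separating the decision from the deletion itself keeps the destructive step behind a single, easily tested check.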

archivebox.cli.archivebox_snapshot.main()

Manage Snapshot records.

archivebox.cli.archivebox_snapshot.create_cmd(urls: tuple, tag: str, status: str, depth: int)

Create Snapshots from URLs or stdin JSONL.

archivebox.cli.archivebox_snapshot.list_cmd(**kwargs)

List Snapshots as JSONL.

archivebox.cli.archivebox_snapshot.update_cmd(status: str | None, tag: str | None)

Update Snapshots from stdin JSONL.

archivebox.cli.archivebox_snapshot.delete_cmd(yes: bool, dry_run: bool)

Delete Snapshots from stdin JSONL.