archivebox.services.runner
Module Contents
Classes
Functions
Data
API
- archivebox.services.runner._runner_console_line(*, crawl=None, crawl_id=None, snapshot=None, status: str = 'STARTED') -> None
- archivebox.services.runner._count_selected_hooks(plugins: dict[str, abx_dl.models.Plugin], selected_plugins: list[str] | None) -> int
- archivebox.services.runner._discover_archivebox_plugins() -> dict[str, abx_dl.models.Plugin]
- archivebox.services.runner._is_external_task_cancelled(error: asyncio.CancelledError) -> bool
- async archivebox.services.runner._emit_machine_config(bus, *, config: dict[str, Any], derived_config: dict[str, Any], parent_event=None) -> None
- archivebox.services.runner.ensure_background_runner(*, allow_under_pytest: bool = False) -> bool
- class archivebox.services.runner.CrawlRunner(crawl, *, snapshot_ids: list[str] | None = None, selected_plugins: list[str] | None = None, process_discovered_snapshots_inline: bool = True, show_progress: bool = True, interactive_interrupts: bool = False, config_overrides: dict[str, Any] | None = None)
Initialization
- async watch_for_cancelled_crawl(parent_event: abxbus.BaseEvent, *, poll_interval: float = 1.0) -> None
- property allow_maintenance_on_inactive_crawl: bool
Run the requested hooks on a snapshot whose parent crawl is paused or sealed.
Maintenance entry paths (direct snapshot_ids + selected_plugins invocations for search backend backfill, fs migration, and plugin-targeted updates) are legitimately allowed to operate on finished or paused crawls. Without this gate, crawl_is_cancelled would treat a SEALED parent as a cancellation signal and short-circuit every guard before any hook ran, leaving the queued ArchiveResult rows stuck and the orchestrator looping on them.
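The gate described above can be illustrated with a minimal, self-contained sketch. This is not the CrawlRunner implementation: the MaintenanceGate class, the CrawlStatus enum, and the rule that a maintenance run is one that passes both snapshot_ids and selected_plugins are assumptions made for illustration. Only the names allow_maintenance_on_inactive_crawl and crawl_is_cancelled come from the documentation.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class CrawlStatus(Enum):
    # Hypothetical status values; the docs only mention paused and SEALED crawls.
    STARTED = "started"
    PAUSED = "paused"
    SEALED = "sealed"


@dataclass
class MaintenanceGate:
    """Illustrative stand-in for the runner's cancellation check (hypothetical)."""

    crawl_status: CrawlStatus
    snapshot_ids: list[str] | None = None
    selected_plugins: list[str] | None = None

    @property
    def allow_maintenance_on_inactive_crawl(self) -> bool:
        # Assumption: a maintenance entry path names both the snapshots
        # and the plugins it wants to run.
        return bool(self.snapshot_ids) and bool(self.selected_plugins)

    def crawl_is_cancelled(self) -> bool:
        inactive = self.crawl_status in (CrawlStatus.PAUSED, CrawlStatus.SEALED)
        # With the gate open, an inactive parent crawl is not a cancellation signal,
        # so the requested hooks still get a chance to run.
        if inactive and self.allow_maintenance_on_inactive_crawl:
            return False
        return inactive


# A regular run against a sealed crawl is treated as cancelled...
regular_run = MaintenanceGate(CrawlStatus.SEALED)
# ...while a targeted maintenance run on the same crawl is allowed through.
maintenance_run = MaintenanceGate(
    CrawlStatus.SEALED,
    snapshot_ids=["snap-1"],
    selected_plugins=["search_backend"],
)
print(regular_run.crawl_is_cancelled(), maintenance_run.crawl_is_cancelled())
```

In this sketch the property acts as a single switch consulted by the cancellation check, which is one plausible way to keep every downstream guard from short-circuiting on a sealed parent.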
- async enqueue_snapshot(snapshot_id: str, crawl_start_event: abx_dl.events.CrawlStartEvent | None = None) -> None
- archivebox.services.runner.run_crawl(crawl_id: str, *, snapshot_ids: list[str] | None = None, selected_plugins: list[str] | None = None, process_discovered_snapshots_inline: bool = True, show_progress: bool = True, interactive_interrupts: bool = False, config_overrides: dict[str, Any] | None = None) -> None
- archivebox.services.runner.queued_plugins_for_snapshot(snapshot_id: str) -> list[str] | None
- archivebox.services.runner.run_snapshot_maintenance(snapshot_id: str, *, output_dir: pathlib.Path | None = None) -> bool
- archivebox.services.runner.run_due_crawl(crawl, *, lock_seconds: int, interactive_interrupts: bool = False) -> bool
- archivebox.services.runner.run_due_snapshot(snapshot, *, lock_seconds: int, interactive_interrupts: bool = False, runtime_config=None) -> bool
- async archivebox.services.runner._run_install(plugin_names: list[str] | None = None) -> None
- archivebox.services.runner._run_due_crawl_status(status: str, *, crawl_id: str | None, lock_seconds: int, interactive_interrupts: bool) -> bool
- archivebox.services.runner._run_due_snapshot_query(queryset, *, lock_seconds: int, interactive_interrupts: bool, runtime_config) -> bool
- archivebox.services.runner._run_due_snapshot_id(snapshot_id, *, lock_seconds: int, interactive_interrupts: bool, runtime_config) -> bool
- archivebox.services.runner._run_due_queued_plugin_result(plugin_names: frozenset[str], *, crawl_id: str | None, lock_seconds: int, interactive_interrupts: bool, runtime_config) -> bool