archivebox.misc.logging_util

Module Contents

Classes

TimedProgress

Show a progress bar and measure elapsed time until .end() is called

Functions

progress_bar

Show a timer as a progress bar, with the percentage complete and seconds remaining.

log_cli_command

log_list_started

log_list_finished

log_removal_started

log_removal_finished

pretty_path

Convert paths like …/ArchiveBox/archivebox/../output/abc into output/abc.

printable_filesize

format_duration

Format duration in human-readable form.

truncate_url

Truncate URL to max_length, keeping domain and adding ellipsis.

log_worker_event

Log a worker event with structured metadata and indentation.

printable_folders

printable_config

printable_folder_status

printable_dependency_version

API

class archivebox.misc.logging_util.TimedProgress(seconds, prefix='', config=None, **config_kwargs)

Show a progress bar and measure elapsed time until .end() is called

Initialization

end()

Immediately end the progress display, clear the progress bar line, and save end_ts.
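The start/end timing pattern described here can be sketched in isolation. The class below is a hypothetical stand-in named `SimpleTimer`, not ArchiveBox's `TimedProgress`: it leaves out the progress bar entirely and only shows how an end timestamp might be saved when `.end()` is called. The `elapsed` property is an assumption added for illustration.

```python
import time


class SimpleTimer:
    """Hypothetical stand-in: measures elapsed time until .end() is called."""

    def __init__(self) -> None:
        self.start_ts = time.monotonic()
        self.end_ts = None

    def end(self) -> None:
        # Save the end timestamp once; repeated calls keep the first value.
        if self.end_ts is None:
            self.end_ts = time.monotonic()

    @property
    def elapsed(self) -> float:
        # Use the saved end timestamp if there is one, otherwise "now".
        end = self.end_ts if self.end_ts is not None else time.monotonic()
        return end - self.start_ts


timer = SimpleTimer()
time.sleep(0.01)  # stand-in for the work being timed
timer.end()
print(f"took {timer.elapsed:.3f}s")
```

Calling `end()` in a `finally` block is a common way to make sure the timer stops even when the timed work raises.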

archivebox.misc.logging_util.progress_bar(seconds: int, prefix: str = '', ANSI: dict[str, str] = ANSI, config=None, **config_kwargs) -> None

Show a timer as a progress bar, with the percentage complete and seconds remaining.
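As a rough illustration of what one frame of such a bar could contain, the sketch below renders a single line from an elapsed time and a total. `render_bar`, its `width` parameter, and the exact output layout are assumptions for this example; ArchiveBox's actual rendering, colors, and terminal handling are not shown here.

```python
def render_bar(elapsed: float, total: float, width: int = 20) -> str:
    """Hypothetical helper: one text frame of a timer-style progress bar.

    Assumes total > 0.
    """
    # Clamp the completed fraction to the range [0, 1].
    fraction = min(max(elapsed / total, 0.0), 1.0)
    filled = int(fraction * width)
    remaining = max(total - elapsed, 0.0)
    bar = "#" * filled + " " * (width - filled)
    return f"[{bar}] {fraction * 100:3.0f}% ({remaining:.0f}s left)"


print(render_bar(15, 60))
```

A real progress bar would redraw this line repeatedly (for example with a carriage return) until the time runs out or the caller ends it.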

archivebox.misc.logging_util.log_cli_command(subcommand: str, subcommand_args: collections.abc.Iterable[str] = (), stdin: str | IO | None = None, pwd: str = '.')

archivebox.misc.logging_util.log_list_started(filter_patterns: list[str] | None, filter_type: str)

archivebox.misc.logging_util.log_list_finished(snapshots)

archivebox.misc.logging_util.log_removal_started(snapshots, yes: bool)

archivebox.misc.logging_util.log_removal_finished(remaining_links: int, removed_links: int)

archivebox.misc.logging_util.pretty_path(path: pathlib.Path | str, pwd: pathlib.Path | str = CONSTANTS.DATA_DIR, color: bool = True) -> str

Convert paths like …/ArchiveBox/archivebox/../output/abc into output/abc.
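The general idea of collapsing `..` segments and showing a path relative to a base directory can be sketched with `pathlib`. The helper name `relative_display_path` and the `/tmp/ArchiveBox` base directory are hypothetical, and the `color` option is not modeled; this is not the ArchiveBox implementation.

```python
from pathlib import Path


def relative_display_path(path, pwd) -> str:
    """Hypothetical helper: show `path` relative to `pwd` when it lies inside it."""
    resolved = Path(path).resolve()
    base = Path(pwd).resolve()
    try:
        return resolved.relative_to(base).as_posix()
    except ValueError:
        # The path is outside pwd, so fall back to the absolute form.
        return resolved.as_posix()


base = Path("/tmp/ArchiveBox")
print(relative_display_path(base / "archivebox" / ".." / "output" / "abc", base))
# prints: output/abc
```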

archivebox.misc.logging_util.printable_filesize(num_bytes: int | float) -> str

archivebox.misc.logging_util.format_duration(seconds: float) -> str

Format duration in human-readable form.
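The exact output format is not documented here, so the following is only a sketch of one plausible human-readable scheme. `human_duration` and its thresholds (milliseconds below one second, then seconds, minutes, and hours) are assumptions, not the behavior of `format_duration`.

```python
def human_duration(seconds: float) -> str:
    """Hypothetical formatter; the output format is an assumption."""
    if seconds < 1:
        return f"{seconds * 1000:.0f}ms"
    if seconds < 60:
        return f"{seconds:.1f}s"
    # From one minute up, drop fractional seconds.
    minutes, secs = divmod(int(seconds), 60)
    if minutes < 60:
        return f"{minutes}m {secs}s"
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes}m"


for value in (0.25, 12.34, 125, 3725):
    print(value, "->", human_duration(value))
```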

archivebox.misc.logging_util.truncate_url(url: str, max_length: int = 60) -> str

Truncate URL to max_length, keeping domain and adding ellipsis.
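One way to truncate while keeping the domain is sketched below. `shorten_url` is a hypothetical helper, and two of its choices are assumptions rather than documented behavior: it uses a three-character `...` ellipsis, and it lets the result exceed `max_length` when the domain alone would not fit.

```python
from urllib.parse import urlsplit

ELLIPSIS = "..."


def shorten_url(url: str, max_length: int = 60) -> str:
    """Hypothetical helper: cut a URL to max_length but never cut into the domain."""
    if len(url) <= max_length:
        return url
    netloc = urlsplit(url).netloc
    # Index just past the domain in the original string (0 if there is none).
    domain_end = url.find(netloc) + len(netloc) if netloc else 0
    # Keep at least the domain, even if that exceeds max_length.
    keep = max(domain_end, max_length - len(ELLIPSIS))
    return url[:keep] + ELLIPSIS


long_url = "https://example.com/" + "a" * 100
print(shorten_url(long_url, 40))
```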

archivebox.misc.logging_util.log_worker_event(worker_type: str, event: str, indent_level: int = 0, pid: int | None = None, worker_id: str | None = None, url: str | None = None, plugin: str | None = None, metadata: dict[str, Any] | None = None, error: Exception | None = None) -> None

Log a worker event with structured metadata and indentation.

Args:
    worker_type: Type of worker (Orchestrator, CrawlWorker, SnapshotWorker)
    event: Event name (Starting, Completed, Failed, etc.)
    indent_level: Indentation level (0=Orchestrator, 1=CrawlWorker, 2=SnapshotWorker)
    pid: Process ID
    worker_id: Worker ID (UUID for workers)
    url: URL being processed (for SnapshotWorker)
    plugin: Plugin name (for hook processes)
    metadata: Dict of metadata to show in curly braces
    error: Exception if event is an error
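To make the roles of `indent_level`, `metadata`, and `error` concrete, here is a sketch that builds one log line as a string. `format_worker_event` is a hypothetical helper covering only a subset of the arguments, and the line layout (two spaces per indent level, `Type[pid]` label, `key=value` pairs in curly braces) is an assumption, not the format ArchiveBox emits.

```python
from typing import Any, Optional


def format_worker_event(
    worker_type: str,
    event: str,
    indent_level: int = 0,
    pid: Optional[int] = None,
    url: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
    error: Optional[Exception] = None,
) -> str:
    """Hypothetical formatter: build one indented, structured log line."""
    indent = "  " * indent_level  # assumed: two spaces per nesting level
    label = f"{worker_type}[{pid}]" if pid is not None else worker_type
    parts = [f"{indent}{label} {event}"]
    if url:
        parts.append(url)
    if metadata:
        pairs = ", ".join(f"{key}={value}" for key, value in metadata.items())
        parts.append("{" + pairs + "}")
    if error is not None:
        parts.append(f"{type(error).__name__}: {error}")
    return " ".join(parts)


line = format_worker_event(
    "SnapshotWorker", "Failed", indent_level=2, pid=1234,
    url="https://example.com", metadata={"retries": 1},
    error=TimeoutError("timed out"),
)
print(line)
```

Returning a string instead of printing directly keeps the formatting step easy to test; the real function returns None and does its own output.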

archivebox.misc.logging_util.printable_folders(folders: dict[str, Optional[archivebox.core.models.Snapshot]], with_headers: bool = False) -> str

archivebox.misc.logging_util.printable_config(config: dict, prefix: str = '') -> str

archivebox.misc.logging_util.printable_folder_status(name: str, folder: dict) -> str

archivebox.misc.logging_util.printable_dependency_version(name: str, dependency: dict) -> str