archivebox.plugins.discovery

Module Contents

Classes

ConfigLookup

PluginSpecialConfig

Functions

iter_plugin_dirs

Iterate over all built-in and user plugin directories.

get_plugins

Get list of available plugins by discovering plugin directories.

get_plugin_name

Get the base plugin name without numeric prefix.

get_enabled_plugins

Get the list of enabled plugins based on config and available hooks.

discover_plugins_that_provide_interface

Discover plugins that provide a specific Python module with required interface.

get_search_backends

Discover all available search backend plugins.

discover_plugin_configs

Discover all plugin config.json schemas.

get_plugin_special_config

Extract special config keys for a plugin following naming conventions.

get_plugin_template

Get a plugin template by plugin name and template type.

get_plugin_icon

Get the icon for a plugin from its icon.html template.

Data

BUILTIN_PLUGINS_DIR

USER_PLUGINS_DIR

DEFAULT_TEMPLATES

API

class archivebox.plugins.discovery.ConfigLookup

Bases: typing.Protocol

get(key: str, default: Any = None) -> Any

items() -> collections.abc.Iterable[tuple[str, Any]]
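
Because ConfigLookup is a typing.Protocol, any object with matching get and items methods should satisfy it structurally, including a plain dict. The following is a minimal sketch of that idea, not the module's own code: ConfigLookupSketch and EnvOverlay are hypothetical names, and the runtime_checkable decorator is an assumption added only so that isinstance() can be used for the demonstration.

```python
from collections.abc import Iterable
from typing import Any, Protocol, runtime_checkable


@runtime_checkable  # assumption: added so isinstance() works in this sketch
class ConfigLookupSketch(Protocol):
    """Stand-in mirroring the two methods documented for ConfigLookup."""

    def get(self, key: str, default: Any = None) -> Any: ...

    def items(self) -> Iterable[tuple[str, Any]]: ...


class EnvOverlay:
    """Hypothetical config source: overrides layered on top of defaults."""

    def __init__(self, defaults: dict[str, Any], overrides: dict[str, Any]) -> None:
        self._merged = {**defaults, **overrides}

    def get(self, key: str, default: Any = None) -> Any:
        return self._merged.get(key, default)

    def items(self) -> Iterable[tuple[str, Any]]:
        return self._merged.items()


# A plain dict already has compatible get() and items() methods.
plain_dict_ok = isinstance({"TIMEOUT": 60}, ConfigLookupSketch)

# A custom class qualifies without inheriting from the protocol.
overlay = EnvOverlay({"TIMEOUT": 300}, {"TIMEOUT": 60})
overlay_ok = isinstance(overlay, ConfigLookupSketch)
```

Typing against a protocol rather than a concrete class lets callers pass whichever config container they already have.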

class archivebox.plugins.discovery.PluginSpecialConfig

Bases: typing.TypedDict

enabled: bool

None

timeout: int

None

binary: str

None

archivebox.plugins.discovery.BUILTIN_PLUGINS_DIR

‘resolve(…)’

archivebox.plugins.discovery.USER_PLUGINS_DIR

None

archivebox.plugins.discovery.iter_plugin_dirs() -> list[pathlib.Path]

Iterate over all built-in and user plugin directories.

archivebox.plugins.discovery.get_plugins() -> list[str]

Get list of available plugins by discovering plugin directories.

Returns plugin directory names for any plugin that exposes hooks, config.json, or a standardized templates/icon.html asset. This includes non-extractor plugins such as binary providers and shared base plugins.

archivebox.plugins.discovery.get_plugin_name(plugin: str) -> str

Get the base plugin name without numeric prefix.

Examples:

'10_title' -> 'title'
'26_readability' -> 'readability'
'50_parse_html_urls' -> 'parse_html_urls'
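
The mapping in these examples can be reproduced with a single regular expression. This is a minimal sketch under the assumption that the prefix is always one or more digits followed by an underscore; strip_numeric_prefix is a hypothetical stand-in, not the actual implementation of get_plugin_name.

```python
import re


def strip_numeric_prefix(plugin: str) -> str:
    """Hypothetical stand-in: drop a leading '<digits>_' ordering prefix."""
    # Assumption: only a prefix at the very start of the name is removed.
    return re.sub(r"^\d+_", "", plugin)


title = strip_numeric_prefix("10_title")
parse_html_urls = strip_numeric_prefix("50_parse_html_urls")
unprefixed = strip_numeric_prefix("screenshot")
```

Names without a numeric prefix pass through unchanged, and underscores inside the base name are preserved.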

archivebox.plugins.discovery.get_enabled_plugins(config: archivebox.plugins.discovery.ConfigLookup | None = None, **config_kwargs: Any) -> list[str]

Get the list of enabled plugins based on config and available hooks.

Filters plugins by USE_/SAVE_ flags. Only returns plugins that are enabled.
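
The exact flag semantics are not spelled out here, so the following is only a rough sketch of what flag-based filtering could look like. It assumes that flags are named USE_{NAME} and SAVE_{NAME} after the upper-cased base plugin name, that a plugin counts as enabled unless one of those flags is explicitly falsy, and that directory names are returned with their numeric prefix intact. filter_enabled is a hypothetical helper, not the real get_enabled_plugins.

```python
import re
from typing import Any, Mapping


def filter_enabled(plugins: list[str], config: Mapping[str, Any]) -> list[str]:
    """Hypothetical filter: keep plugins whose USE_/SAVE_ flags are not falsy."""
    enabled = []
    for plugin in plugins:
        # Assumption: flag names use the upper-cased name without numeric prefix.
        name = re.sub(r"^\d+_", "", plugin).upper()
        # Assumption: a missing flag means the plugin stays enabled.
        use_flag = config.get(f"USE_{name}", True)
        save_flag = config.get(f"SAVE_{name}", True)
        if use_flag and save_flag:
            enabled.append(plugin)
    return enabled


available = ["10_title", "26_readability", "50_wget"]
result = filter_enabled(available, {"SAVE_READABILITY": False, "USE_WGET": True})
```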

archivebox.plugins.discovery.discover_plugins_that_provide_interface(module_name: str, required_attrs: list[str], plugin_prefix: str | None = None) -> dict[str, Any]

Discover plugins that provide a specific Python module with required interface.

This enables dynamic plugin discovery for features like search backends, storage backends, etc. without hardcoding imports.
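
One way to picture this kind of discovery is to load a named module file from each plugin directory and keep only the modules that expose every required attribute. The sketch below does that with importlib. It is an illustration under assumptions: discover_interface is a hypothetical helper, results are keyed by directory name, the plugin_prefix parameter is left out, and the temporary directories exist only to make the example self-contained.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Any


def discover_interface(
    plugin_dirs: list[Path], module_name: str, required_attrs: list[str]
) -> dict[str, Any]:
    """Hypothetical helper: map directory name -> loaded module with all attrs."""
    found: dict[str, Any] = {}
    for plugin_dir in plugin_dirs:
        module_path = plugin_dir / f"{module_name}.py"
        if not module_path.is_file():
            continue
        spec = importlib.util.spec_from_file_location(
            f"sketch_{plugin_dir.name}_{module_name}", str(module_path)
        )
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin module's top-level code
        if all(hasattr(module, attr) for attr in required_attrs):
            found[plugin_dir.name] = module
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    complete = root / "backend_complete"
    partial = root / "backend_partial"
    complete.mkdir()
    partial.mkdir()
    (complete / "search.py").write_text(
        "def search(query):\n    return []\n\ndef flush(snapshot_ids):\n    return None\n"
    )
    (partial / "search.py").write_text("def search(query):\n    return []\n")

    found = discover_interface([complete, partial], "search", ["search", "flush"])
    found_names = sorted(found)
```

Only backend_complete is kept, because backend_partial lacks flush.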

archivebox.plugins.discovery.get_search_backends() -> dict[str, Any]

Discover all available search backend plugins.

Search backends must provide a search.py module with:

- search(query: str) -> List[str] (returns snapshot IDs)
- flush(snapshot_ids: Iterable[str]) -> None
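
A module satisfying this interface could be as small as the sketch below. The in-memory _INDEX store and the case-insensitive substring match are illustrative assumptions, as is the reading of flush as "remove these snapshots from the index"; a real backend would delegate to an actual search engine.

```python
from collections.abc import Iterable

# Hypothetical in-memory store: snapshot ID -> indexed text.
_INDEX: dict[str, str] = {}


def search(query: str) -> list[str]:
    """Return the IDs of snapshots whose indexed text contains the query."""
    needle = query.lower()
    return [snapshot_id for snapshot_id, text in _INDEX.items() if needle in text.lower()]


def flush(snapshot_ids: Iterable[str]) -> None:
    """Assumed meaning: drop the given snapshots from the index."""
    for snapshot_id in snapshot_ids:
        _INDEX.pop(snapshot_id, None)


_INDEX.update({"snap-1": "ArchiveBox release notes", "snap-2": "Unrelated page"})
hits_before = search("archivebox")
flush(["snap-1"])
hits_after = search("archivebox")
```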

archivebox.plugins.discovery.discover_plugin_configs() -> dict[str, dict[str, Any]]

Discover all plugin config.json schemas.

Each plugin can define a config.json file with JSONSchema defining its configuration options. This is intentionally cached because these schemas are plugin package metadata, not live user config; runtime values still come from env/db config at each callsite.
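
The caching rationale can be illustrated with functools.lru_cache: schemas are read from disk once and reused, while runtime values are looked up separately. This is a sketch under assumptions, not the module's code. load_schemas is a hypothetical helper, the one-directory-per-plugin layout is inferred from the description above, and TITLE_ENABLED is just an example key.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path
from typing import Any


@lru_cache(maxsize=None)
def load_schemas(plugins_root: Path) -> dict[str, dict[str, Any]]:
    """Hypothetical loader: map plugin directory name -> parsed config.json."""
    schemas: dict[str, dict[str, Any]] = {}
    for config_path in sorted(plugins_root.glob("*/config.json")):
        schemas[config_path.parent.name] = json.loads(config_path.read_text())
    return schemas


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "title").mkdir()
    schema = {
        "type": "object",
        "properties": {"TITLE_ENABLED": {"type": "boolean", "default": True}},
    }
    (root / "title" / "config.json").write_text(json.dumps(schema))

    first = load_schemas(root)   # reads and parses the file
    second = load_schemas(root)  # served from the cache, no disk access
```

The second call returns the same dict object, so callers should treat the cached result as read-only.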

archivebox.plugins.discovery.get_plugin_special_config(plugin_name: str, config: archivebox.plugins.discovery.ConfigLookup, _visited: set[str] | None = None) -> archivebox.plugins.discovery.PluginSpecialConfig

Extract special config keys for a plugin following naming conventions.

ArchiveBox recognizes 3 special config key patterns per plugin:

- {PLUGIN}_ENABLED: Enable/disable toggle (default True)
- {PLUGIN}_TIMEOUT: Plugin-specific timeout (fallback to TIMEOUT, default 300)
- {PLUGIN}_BINARY: Primary binary path (default to plugin_name)
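
The three lookups and their defaults can be sketched as follows. This assumes {PLUGIN} is the upper-cased plugin name and that config values are already typed (not raw strings). special_config and SpecialConfigSketch are hypothetical names, and whatever the real function does with its _visited parameter is not modeled.

```python
from typing import Any, Mapping, TypedDict


class SpecialConfigSketch(TypedDict):
    """Stand-in mirroring the documented PluginSpecialConfig fields."""

    enabled: bool
    timeout: int
    binary: str


def special_config(plugin_name: str, config: Mapping[str, Any]) -> SpecialConfigSketch:
    """Hypothetical resolver for the three documented key patterns."""
    prefix = plugin_name.upper()  # assumption: {PLUGIN} is the upper-cased name
    return {
        "enabled": bool(config.get(f"{prefix}_ENABLED", True)),
        # Plugin-specific timeout first, then the global TIMEOUT, then 300.
        "timeout": int(config.get(f"{prefix}_TIMEOUT", config.get("TIMEOUT", 300))),
        "binary": str(config.get(f"{prefix}_BINARY", plugin_name)),
    }


defaults = special_config("wget", {})
overridden = special_config("wget", {"TIMEOUT": 60, "WGET_BINARY": "/usr/bin/wget"})
```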

archivebox.plugins.discovery.DEFAULT_TEMPLATES

None

archivebox.plugins.discovery.get_plugin_template(plugin: str, template_name: str, fallback: bool = True) -> str | None

Get a plugin template by plugin name and template type.

Args:

- plugin: Plugin name (e.g., 'screenshot', '15_singlefile')
- template_name: One of 'icon', 'card', 'full'
- fallback: If True, return default template if plugin template not found
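
The effect of the fallback argument can be sketched as below. The templates/{template_name}.html layout is extrapolated from the templates/icon.html asset mentioned for get_plugins, and the FALLBACKS contents are placeholders rather than the real DEFAULT_TEMPLATES. load_template is a hypothetical helper that takes a directory path instead of a plugin name.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Placeholder defaults; the real DEFAULT_TEMPLATES values are not shown here.
FALLBACKS = {
    "icon": "<span>?</span>",
    "card": "<div>no preview</div>",
    "full": "<div>no preview</div>",
}


def load_template(plugin_dir: Path, template_name: str, fallback: bool = True) -> Optional[str]:
    """Hypothetical lookup: plugin's own template, else a default, else None."""
    # Assumption: templates live at <plugin_dir>/templates/<template_name>.html.
    path = plugin_dir / "templates" / f"{template_name}.html"
    if path.is_file():
        return path.read_text()
    return FALLBACKS.get(template_name) if fallback else None


with tempfile.TemporaryDirectory() as tmp:
    plugin_dir = Path(tmp) / "screenshot"
    (plugin_dir / "templates").mkdir(parents=True)
    (plugin_dir / "templates" / "icon.html").write_text("<span>cam</span>")

    own_icon = load_template(plugin_dir, "icon")                      # plugin's file
    default_card = load_template(plugin_dir, "card")                  # falls back
    missing_card = load_template(plugin_dir, "card", fallback=False)  # None
```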

archivebox.plugins.discovery.get_plugin_icon(plugin: str) -> str

Get the icon for a plugin from its icon.html template.