archivebox.search

Search module for ArchiveBox.

Search indexing is handled by search backend hooks in plugins: abx_plugins/plugins/search_backend_/on_Snapshot__index*.py

This module provides the query interface that dynamically discovers search backend plugins using the hooks system.

Search backends must provide a search.py module with:

- search(query: str) -> List[str] (returns snapshot IDs)
- flush(snapshot_ids: Iterable[str]) -> None
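As an illustration of the interface above, here is a minimal sketch of what a backend's search.py could look like. The in-memory `_INDEX` dict and the `index()` helper are hypothetical and exist only to make the example self-contained; a real backend would query its own engine, and only `search()` and `flush()` are required by the contract described here.

```python
from typing import Dict, Iterable, List

# Hypothetical in-memory store (snapshot ID -> lowercased text).
# This is an assumption for the sketch, not part of ArchiveBox.
_INDEX: Dict[str, str] = {}


def index(snapshot_id: str, text: str) -> None:
    """Illustrative helper for populating the sketch index (not a required function)."""
    _INDEX[snapshot_id] = text.lower()


def search(query: str) -> List[str]:
    """Return the IDs of snapshots whose indexed text contains the query."""
    needle = query.lower()
    return [sid for sid, text in _INDEX.items() if needle in text]


def flush(snapshot_ids: Iterable[str]) -> None:
    """Remove the given snapshot IDs from the index, ignoring unknown IDs."""
    for sid in snapshot_ids:
        _INDEX.pop(sid, None)
```

Ignoring unknown IDs in `flush()` is a design choice of this sketch, so that flushing a snapshot that was never indexed does not raise.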

Package Contents

Functions

get_default_search_mode

get_search_mode

prioritize_metadata_matches

get_available_backends

Discover all available search backend plugins.

get_backend

Get the configured search backend module.

query_search_index

Search for snapshots matching the query.

flush_search_index

Remove snapshots from the search index.

Data

_search_backends_cache

SEARCH_MODES

API

archivebox.search._search_backends_cache: dict | None

None

archivebox.search.SEARCH_MODES

('meta', 'contents', 'deep')

archivebox.search.get_default_search_mode() -> str

archivebox.search.get_search_mode(search_mode: str | None) -> str

archivebox.search.prioritize_metadata_matches(base_queryset: django.db.models.QuerySet, metadata_queryset: django.db.models.QuerySet, fulltext_queryset: django.db.models.QuerySet, *, deep_queryset: django.db.models.QuerySet | None = None, ordering: list[str] | tuple[str, ...] | None = None) -> django.db.models.QuerySet

archivebox.search.get_available_backends() -> dict

Discover all available search backend plugins.

Uses the hooks system to find plugins with search.py modules. Results are cached after first call.

archivebox.search.get_backend() -> Any

Get the configured search backend module.

Discovers available backends via the hooks system and returns the one matching SEARCH_BACKEND_ENGINE configuration.

Falls back to 'ripgrep' if the configured backend is not found.
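The selection-with-fallback logic can be sketched like this. `pick_backend` is a hypothetical helper, not an ArchiveBox function; it takes the discovered backends and the configured engine name as plain arguments. What happens when 'ripgrep' itself is unavailable is not stated here, so raising an error in that case is an assumption of the sketch.

```python
from typing import Any, Mapping


def pick_backend(
    backends: Mapping[str, Any],
    configured: str,
    fallback: str = "ripgrep",
) -> Any:
    """Return the configured backend, or the fallback if it is not available."""
    if configured in backends:
        return backends[configured]
    if fallback in backends:
        return backends[fallback]
    # Assumption: the documented behaviour for this case is unspecified.
    raise RuntimeError(
        f"Search backend {configured!r} not found and fallback {fallback!r} is unavailable"
    )
```

For example, with backends for 'ripgrep' and 'sonic' available, asking for 'sonic' returns the sonic backend, while asking for an unknown name returns the ripgrep one.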

archivebox.search.query_search_index(query: str, search_mode: str | None = None) -> django.db.models.QuerySet

Search for snapshots matching the query.

Returns a QuerySet of Snapshot objects matching the search.

archivebox.search.flush_search_index(snapshots: django.db.models.QuerySet) -> None

Remove snapshots from the search index.