archivebox.search.query

Module Contents

Functions

escape_like_query

Escape a string for SQLite LIKE matching.

crawl_config_values_search_wave

Build a Snapshot Q predicate matching values inside Crawl.config.

snapshot_metadata_search_waves

Build ordered metadata predicates for Snapshot search.

prioritize_metadata_matches

Rank metadata hits before backend full-text hits.

apply_snapshot_search

Apply shared CLI/API/public/admin Snapshot search semantics.

query_search_index

Return a Snapshot queryset from backend search IDs.

iter_query_search_ids

Yield snapshot IDs from configured search backend modules.

flush_search_index

Remove Snapshot IDs from the configured search backend index.

Data

MAX_SEARCH_RANK_IDS

API

archivebox.search.query.MAX_SEARCH_RANK_IDS

500

archivebox.search.query.escape_like_query(query: str) -> str

Escape a string for SQLite LIKE matching.
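As a rough illustration of what LIKE escaping involves, here is a minimal sketch using only the standard-library sqlite3 module. The names escape_like_sketch, demo_connection, and like_contains, the pages table, and the choice of a backslash as the escape character are all assumptions made for this example; the actual behaviour of escape_like_query may differ.

```python
import sqlite3


def escape_like_sketch(query: str, escape_char: str = "\\") -> str:
    """Hypothetical stand-in for illustration, not ArchiveBox's implementation."""
    # Escape the escape character first so the later replacements
    # do not get escaped a second time.
    escaped = query.replace(escape_char, escape_char * 2)
    # '%' matches any run of characters and '_' matches one character in LIKE.
    escaped = escaped.replace("%", escape_char + "%")
    escaped = escaped.replace("_", escape_char + "_")
    return escaped


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table with made-up URLs to search against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT)")
    conn.executemany(
        "INSERT INTO pages (url) VALUES (?)",
        [
            ("https://example.com/100%_done",),
            ("https://example.com/1000-done",),
            ("https://example.com/a_b",),
            ("https://example.com/axb",),
        ],
    )
    return conn


def like_contains(conn: sqlite3.Connection, needle: str) -> list[str]:
    """Return URLs containing `needle` literally, wildcards neutralised."""
    pattern = f"%{escape_like_sketch(needle)}%"
    rows = conn.execute(
        # The ESCAPE clause tells SQLite which character marks a literal.
        "SELECT url FROM pages WHERE url LIKE ? ESCAPE '\\' ORDER BY url",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]
```

Without the escaping step, a search for "a_b" would also match "axb", because an unescaped underscore matches any single character.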

archivebox.search.query.crawl_config_values_search_wave(query: str) -> django.db.models.Q | None

Build a Snapshot Q predicate matching values inside Crawl.config.

archivebox.search.query.snapshot_metadata_search_waves(query: str, *, include_id_matches: bool = False) -> list[django.db.models.Q]

Build ordered metadata predicates for Snapshot search.

archivebox.search.query.prioritize_metadata_matches(base_queryset: django.db.models.QuerySet, metadata_queryset: django.db.models.QuerySet, fulltext_queryset: django.db.models.QuerySet, *, deep_queryset: django.db.models.QuerySet | None = None, ordering: list[str] | tuple[str, ...] | None = None) -> django.db.models.QuerySet

Rank metadata hits before backend full-text hits.
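The ranking idea can be sketched on plain ID lists rather than querysets. This is only an illustration of "metadata hits first, full-text hits after, no duplicates"; the helper name rank_metadata_first is hypothetical, and the sketch ignores the deep_queryset and ordering parameters, whose exact effect is not described here.

```python
from typing import Hashable, Iterable


def rank_metadata_first(
    metadata_ids: Iterable[Hashable],
    fulltext_ids: Iterable[Hashable],
) -> list[Hashable]:
    """Hypothetical sketch: merge two hit lists, metadata hits ranked first."""
    seen: set[Hashable] = set()
    ranked: list[Hashable] = []
    # Walk the metadata hits first so they keep the top positions,
    # then append full-text hits that have not been seen yet.
    for wave in (metadata_ids, fulltext_ids):
        for snapshot_id in wave:
            if snapshot_id not in seen:
                seen.add(snapshot_id)
                ranked.append(snapshot_id)
    return ranked
```

For example, rank_metadata_first(["b", "a"], ["a", "c", "b", "d"]) keeps "b" and "a" at the front and appends only the full-text hits "c" and "d".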

archivebox.search.query.apply_snapshot_search

Apply shared CLI/API/public/admin Snapshot search semantics.

archivebox.search.query.query_search_index(query: str, search_mode: str | None = None, config: dict[str, Any] | None = None, max_results: int | None = None, **config_kwargs: Any) -> django.db.models.QuerySet

Return a Snapshot queryset from backend search IDs.

archivebox.search.query.iter_query_search_ids(query: str, search_mode: str | None = None, config: dict[str, Any] | None = None, max_results: int | None = None, **config_kwargs: Any)

Yield snapshot IDs from configured search backend modules.

archivebox.search.query.flush_search_index(snapshots: django.db.models.QuerySet, config: dict[str, Any] | None = None, **config_kwargs: Any) -> None

Remove Snapshot IDs from the configured search backend index.