archivebox.search.views

Module Contents

Functions

get_admin_search_cache_key

Build the cache key for one user and changelist URL.

get_public_search_cache_key

Build the cache key for one public search URL.

get_cached_admin_search_ids

Return streamed admin search IDs from Django cache.

get_cached_public_search_ids

Return streamed public search IDs from Django cache.

get_cached_public_search_state

Return streamed public search state from Django cache.

iter_url_search_prefixes

Yield URL prefixes that can use indexed startswith scans for common search input.

url_prefix_upper_bound

Return the exclusive upper bound for an indexed URL prefix range.

iter_url_prefix_search_ids

Yield IDs for one URL prefix using the URL index, then apply caller filters.

iter_meta_search_ids

Yield metadata search matches from a filtered Snapshot queryset.

normalize_search_result_id

Return a compact Snapshot ID string from a search provider result.

iter_filtered_search_result_ids

Yield provider IDs that still match the filtered queryset.

iter_search_result_ids

Yield filtered Snapshot IDs from the selected search provider.

snapshot_search_stream_response

Stream Snapshot search progress and cache matching IDs for a list view.

admin_snapshot_search_stream_view

Stream admin Snapshot search progress and cache matching IDs.

public_snapshot_search_stream_view

Stream public Snapshot search progress and cache matching IDs.

Data

SEARCH_RESULT_CACHE_TTL

URL_PREFIX_SEARCH_LIMIT

API

archivebox.search.views.SEARCH_RESULT_CACHE_TTL

60

archivebox.search.views.URL_PREFIX_SEARCH_LIMIT

500

archivebox.search.views.get_admin_search_cache_key(request, url: str | None = None) -> str

Build the cache key for one user and changelist URL.
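As a rough illustration of keying a cache entry by user and URL, the sketch below hashes both values into a fixed-length key. The function name, the `scope` argument, and the use of SHA-256 are assumptions made for this example; the actual key layout used by ArchiveBox is not documented here.

```python
import hashlib


def sketch_search_cache_key(scope: str, user_id: object, url: str) -> str:
    """Hypothetical helper: derive one cache key per (user, URL) pair."""
    # Hashing keeps the key short and free of characters that some
    # cache backends reject (spaces, control characters, long URLs).
    digest = hashlib.sha256(f"{user_id}:{url}".encode("utf-8")).hexdigest()
    return f"{scope}:{digest}"


admin_key = sketch_search_cache_key("admin-search", 1, "/admin/core/snapshot/?q=example")
```

Including the user in the hashed material means two users viewing the same changelist URL do not share cached results.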

archivebox.search.views.get_public_search_cache_key(request, url: str | None = None) -> str

Build the cache key for one public search URL.

archivebox.search.views.get_cached_admin_search_ids(request) -> list[str] | None

Return streamed admin search IDs from Django cache.

archivebox.search.views.get_cached_public_search_ids(request) -> list[str] | None

Return streamed public search IDs from Django cache.

archivebox.search.views.get_cached_public_search_state(request) -> dict | None

Return streamed public search state from Django cache.

archivebox.search.views.iter_url_search_prefixes(query: str)

Yield URL prefixes that can use indexed startswith scans for common search input.
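One way to picture this: a bare query such as a domain name can be expanded into a handful of full URL prefixes, each of which can be matched with an index-friendly startswith lookup. The sketch below is only a guess at the general idea; the function name, the scheme list, and the `www.` expansion are assumptions, not the rules ArchiveBox actually applies.

```python
from typing import Iterator


def sketch_url_search_prefixes(query: str) -> Iterator[str]:
    """Hypothetical helper: expand a query into candidate URL prefixes."""
    query = query.strip()
    # Free-text queries (empty or containing spaces) are not URL-like.
    if not query or " " in query:
        return
    # A query that already names a scheme is used as-is.
    if "://" in query:
        yield query
        return
    for scheme in ("https://", "http://"):
        yield scheme + query
        if not query.startswith("www."):
            yield scheme + "www." + query


prefixes = list(sketch_url_search_prefixes("example.com"))
```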

archivebox.search.views.url_prefix_upper_bound(prefix: str) -> str

Return the exclusive upper bound for an indexed URL prefix range.
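A prefix match can be rewritten as a half-open range `prefix <= url < upper`, which an ordinary B-tree index can serve. The sketch below shows one common way to compute such an upper bound by incrementing the last character of the prefix. The function name and the `None` return for an unbounded prefix are assumptions for this example; the documented function returns `str`, and its handling of edge cases is not shown here.

```python
from typing import Optional

MAX_CODE_POINT = 0x10FFFF  # highest code point a Python str can hold


def sketch_prefix_upper_bound(prefix: str) -> Optional[str]:
    """Hypothetical helper: smallest string greater than every string
    that starts with `prefix`, or None if no such bound exists."""
    # Trailing maximum code points cannot be incremented, so drop them.
    trimmed = prefix.rstrip(chr(MAX_CODE_POINT))
    if not trimmed:
        return None
    # Bump the last remaining character by one code point.
    return trimmed[:-1] + chr(ord(trimmed[-1]) + 1)


upper = sketch_prefix_upper_bound("https://a")
```

With this bound, a query of the form `url >= prefix AND url < upper` selects the same rows as a startswith match, provided the database collation orders strings by code point.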

archivebox.search.views.iter_url_prefix_search_ids(prefix: str, queryset)

Yield IDs for one URL prefix using the URL index, then apply caller filters.

archivebox.search.views.iter_meta_search_ids(query, queryset)

Yield metadata search matches from a filtered Snapshot queryset.

archivebox.search.views.normalize_search_result_id(snapshot_id) -> str | None

Return a compact Snapshot ID string from a search provider result.
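Search providers may return IDs as `UUID` objects, hyphenated strings, or mixed-case text. The sketch below normalizes all of these to one form, assuming that "compact" means a lowercase 32-character hex string without hyphens and that unparseable values map to `None`. Both points are assumptions for illustration, as is the function name.

```python
import uuid
from typing import Optional


def sketch_normalize_id(value: object) -> Optional[str]:
    """Hypothetical helper: reduce a provider result to a hex UUID string."""
    if value is None:
        return None
    if isinstance(value, uuid.UUID):
        return value.hex
    text = str(value).strip()
    if not text:
        return None
    try:
        # uuid.UUID accepts hyphenated and unhyphenated hex, any case.
        return uuid.UUID(text).hex
    except ValueError:
        return None


compact = sketch_normalize_id("01234567-89AB-CDEF-0123-456789ABCDEF")
```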

archivebox.search.views.iter_filtered_search_result_ids(iterator, queryset, *, flush_max_delay=0.05)

Yield provider IDs that still match the filtered queryset.

This is the single intersection/dedupe path used for metadata and every search backend. It flushes by elapsed time so sparse providers stream rows as soon as IDs are found instead of waiting for a fixed batch size.
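The time-based flushing described above can be sketched roughly as follows. This is a simplified stand-in, not the ArchiveBox implementation: the function name is hypothetical, and the `keep_matching` callable is an assumed substitute for the database lookup that intersects a batch of IDs with the filtered queryset.

```python
import time
from typing import Callable, Iterable, Iterator, List, Set


def sketch_iter_filtered_ids(
    provider_ids: Iterable[str],
    keep_matching: Callable[[List[str]], Set[str]],
    flush_max_delay: float = 0.05,
) -> Iterator[str]:
    """Hypothetical helper: dedupe provider IDs, then yield the ones that
    `keep_matching` confirms, flushing by elapsed time, not batch size."""
    seen: Set[str] = set()
    pending: List[str] = []
    last_flush = time.monotonic()

    def flush() -> List[str]:
        # One lookup per batch; preserve the provider's ordering.
        allowed = keep_matching(pending)
        confirmed = [item for item in pending if item in allowed]
        pending.clear()
        return confirmed

    for candidate in provider_ids:
        if candidate in seen:
            continue  # dedupe across the whole stream
        seen.add(candidate)
        pending.append(candidate)
        if time.monotonic() - last_flush >= flush_max_delay:
            yield from flush()
            last_flush = time.monotonic()
    if pending:
        yield from flush()


allowed_ids = {"a", "c"}
result = list(
    sketch_iter_filtered_ids(
        ["a", "b", "a", "c"],
        lambda batch: allowed_ids & set(batch),
        flush_max_delay=0.0,
    )
)
```

Note that this sketch only checks the clock when a new ID arrives, so a provider that blocks for a long time between results delays the pending batch until its next item.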

archivebox.search.views.iter_search_result_ids(query, base_queryset, *, search_mode, config)

Yield filtered Snapshot IDs from the selected search provider.

archivebox.search.views.snapshot_search_stream_response(query, base_queryset, *, search_mode, config, cache_key, thread_name)

Stream Snapshot search progress and cache matching IDs for a list view.
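The general pattern of streaming progress while collecting IDs for the cache might look like the sketch below. Everything in it is an assumption made for illustration: the function name, the newline-delimited JSON message format, and the plain `dict` standing in for the Django cache. The real wire format, and the role of `thread_name`, are not documented here.

```python
import json
from typing import Dict, Iterable, Iterator, List


def sketch_stream_and_cache(
    ids: Iterable[str],
    cache: Dict[str, List[str]],
    cache_key: str,
) -> Iterator[str]:
    """Hypothetical helper: emit one progress line per match, then store
    the full ID list under `cache_key` so a list view can reuse it."""
    found: List[str] = []
    for snapshot_id in ids:
        found.append(snapshot_id)
        yield json.dumps({"type": "match", "id": snapshot_id, "count": len(found)}) + "\n"
    # Cache only after the search finishes, so readers never see a partial list.
    cache[cache_key] = found
    yield json.dumps({"type": "done", "count": len(found)}) + "\n"


fake_cache: Dict[str, List[str]] = {}
lines = list(sketch_stream_and_cache(["a", "b"], fake_cache, "search:demo"))
```

In a Django view, a generator like this could be passed to `StreamingHttpResponse`, with `django.core.cache.cache.set(key, value, timeout)` in place of the dictionary assignment.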

archivebox.search.views.admin_snapshot_search_stream_view(model_admin, request)

Stream admin Snapshot search progress and cache matching IDs.

archivebox.search.views.public_snapshot_search_stream_view(request)

Stream public Snapshot search progress and cache matching IDs.