archivebox.crawls.admin

Module Contents

Classes

URLFiltersWidget

URLFiltersField

CrawlAdminForm

Custom form for Crawl admin to render urls field as textarea.

CrawlAdmin

CrawlScheduleAdmin

Functions

render_snapshots_list

Render a nice inline list view of snapshots with status, title, URL, and progress.

register_admin

API

archivebox.crawls.admin.render_snapshots_list(snapshots_qs, limit=20, crawl=None)

Render a nice inline list view of snapshots with status, title, URL, and progress.
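
The following is a minimal, standalone sketch of what an inline list renderer of this kind could look like. It is not ArchiveBox's implementation: the function name `render_snapshot_rows`, the dict keys, and the HTML layout are all assumptions made for illustration. Only the `limit` parameter and the displayed fields (status, title, URL, progress) come from the signature and description above.

```python
from html import escape


def render_snapshot_rows(snapshots, limit=20):
    """Return an HTML <ul> with one <li> per snapshot, capped at `limit` rows.

    `snapshots` is a list of dicts with hypothetical keys
    "status", "title", "url", and "progress" (0-100).
    """
    rows = []
    for snap in snapshots[:limit]:
        rows.append(
            "<li>[{status}] {title} <code>{url}</code> ({progress}%)</li>".format(
                # Escape every user-controlled value before embedding it in HTML.
                status=escape(str(snap["status"])),
                title=escape(str(snap["title"])),
                url=escape(str(snap["url"])),
                progress=int(snap["progress"]),
            )
        )
    # Tell the reader how many rows were cut off by the limit.
    hidden = max(len(snapshots) - limit, 0)
    if hidden:
        rows.append(f"<li>... and {hidden} more</li>")
    return "<ul>" + "".join(rows) + "</ul>"
```

Capping the list keeps the admin page responsive for crawls with many snapshots, which is presumably why the real function takes a `limit` argument.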

class archivebox.crawls.admin.URLFiltersWidget

Bases: django.forms.Widget

render(name, value, attrs=None, renderer=None)

value_from_datadict(data, files, name)

class archivebox.crawls.admin.URLFiltersField

Bases: django.forms.Field

widget

None

to_python(value)

class archivebox.crawls.admin.CrawlAdminForm(*args, **kwargs)

Bases: django.forms.ModelForm

Custom form for Crawl admin to render urls field as textarea.

Initialization

tags_editor

'CharField(…)'

url_filters

'URLFiltersField(…)'

class Meta

model

None

fields

'__all__'

widgets

None

clean_tags_editor()

clean_url_filters()

save(commit=True)

class archivebox.crawls.admin.CrawlAdmin

Bases: archivebox.base_models.admin.ConfigEditorMixin, archivebox.base_models.admin.BaseModelAdmin

form

None

list_display

('id', 'created_at', 'created_by', 'max_depth', 'max_urls', 'max_size', 'label', 'notes', 'urls_prev…

sort_fields

('id', 'created_at', 'created_by', 'max_depth', 'max_urls', 'max_size', 'label', 'notes', 'schedule_…

search_fields

('id', 'created_by__username', 'max_depth', 'max_urls', 'max_size', 'label', 'notes', 'schedule_id',…

readonly_fields

('created_at', 'modified_at', 'snapshots')

fieldsets

(('URLs',), ('Info',), ('Settings',), ('Status',), ('Relations',), ('Timestamps',), ('Snapshots',))

add_fieldsets

(('URLs',), ('Info',), ('Settings',), ('Status',), ('Relations',))

list_filter

('max_depth', 'max_urls', 'schedule', 'created_by', 'status', 'retry_at')

ordering

['-created_at', '-retry_at']

list_per_page

100

actions

['delete_selected_batched']

change_actions

['recrawl']

get_queryset(request)

Optimize queries with select_related and annotations.

get_fieldsets(request, obj=None)

get_urls()

delete_selected_batched(request, queryset)

Delete crawls in a single transaction to avoid SQLite concurrency issues.
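
To illustrate the idea of deleting in a single transaction, here is a small sketch using Python's standard `sqlite3` module rather than the Django ORM. The table names (`crawl`, `snapshot`), the `crawl_id` column, and the helper `delete_crawls_batched` are hypothetical and chosen only for the example; how ArchiveBox implements the action is not shown on this page. In a Django admin action the analogous tool would typically be `django.db.transaction.atomic()`.

```python
import sqlite3


def delete_crawls_batched(conn, crawl_ids):
    """Delete the given crawls and their snapshots in one transaction.

    Table and column names are hypothetical, for illustration only.
    """
    ids = [(cid,) for cid in crawl_ids]
    # Using the connection as a context manager commits once on success
    # and rolls back everything if any statement raises, so the write
    # lock is taken a single time instead of once per deleted row.
    with conn:
        conn.executemany("DELETE FROM snapshot WHERE crawl_id = ?", ids)
        conn.executemany("DELETE FROM crawl WHERE id = ?", ids)
    return len(ids)
```

Because SQLite allows only one writer at a time, grouping all deletes into one commit shortens the window in which other writers can collide with the operation.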

recrawl(request, obj)

Duplicate this crawl as a new crawl with the same URLs and settings.
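
The duplication described here can be pictured with a plain dataclass, as in the sketch below. The `CrawlRecord` type, its fields, and the status values `"queued"` and `"sealed"` are assumptions for illustration only; the real action operates on the Crawl model, whose full field set is not listed on this page.

```python
from dataclasses import dataclass, field, replace
from uuid import UUID, uuid4


@dataclass(frozen=True)
class CrawlRecord:
    """Hypothetical stand-in for a crawl row."""
    urls: str
    max_depth: int = 0
    label: str = ""
    status: str = "queued"
    id: UUID = field(default_factory=uuid4)


def duplicate_crawl(crawl):
    """Return a copy with the same URLs and settings but a fresh identity.

    The copy gets a new id and starts in the (assumed) initial status,
    while the original record is left untouched.
    """
    return replace(crawl, id=uuid4(), status="queued")
```

Copying into a new record, rather than resetting the existing one, keeps the original crawl and its snapshots available for comparison.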

num_snapshots(obj)

snapshots(obj)

delete_snapshot_view(request: django.http.HttpRequest, object_id: str, snapshot_id: str)

exclude_domain_view(request: django.http.HttpRequest, object_id: str, snapshot_id: str)

schedule_str(obj)

urls_preview(obj)

health_display(obj)

urls_editor(obj)

Editor for crawl URLs.

class archivebox.crawls.admin.CrawlScheduleAdmin

Bases: archivebox.base_models.admin.BaseModelAdmin

list_display

('id', 'created_at', 'created_by', 'label', 'notes', 'template_str', 'crawls', 'num_crawls', 'num_sn…

sort_fields

('id', 'created_at', 'created_by', 'label', 'notes', 'template_str')

search_fields

('id', 'created_by__username', 'label', 'notes', 'schedule_id', 'template_id', 'template__urls')

readonly_fields

('created_at', 'modified_at', 'crawls', 'snapshots')

fieldsets

(('Schedule Info',), ('Configuration',), ('Metadata',), ('Crawls',), ('Snapshots',))

list_filter

('created_by',)

ordering

['-created_at']

list_per_page

100

actions

['delete_selected']

get_queryset(request)

get_fieldsets(request, obj=None)

save_model(request, obj, form, change)

template_str(obj)

num_crawls(obj)

num_snapshots(obj)

crawls(obj)

snapshots(obj)

archivebox.crawls.admin.register_admin(admin_site)