archivebox.api.v1_crawls

Module Contents

Classes

CrawlSchema

CrawlUpdateSchema

CrawlCreateSchema

CrawlDeleteResponseSchema

Functions

normalize_tag_list

get_crawls

create_crawl

get_crawl

Get a specific Crawl by id.

patch_crawl

Update a crawl (e.g., set status=sealed to cancel queued work).

delete_crawl

Data

router

API

archivebox.api.v1_crawls.router[source]

‘Router(…)’

class archivebox.api.v1_crawls.CrawlSchema[source]

Bases: ninja.Schema

TYPE: str[source]

‘crawls.models.Crawl’

id: uuid.UUID[source]

None

modified_at: datetime.datetime[source]

None

created_at: datetime.datetime[source]

None

created_by_id: str[source]

None

created_by_username: str[source]

None

status: str[source]

None

retry_at: datetime.datetime | None[source]

None

urls: str[source]

None

max_depth: int[source]

None

max_urls: int[source]

None

max_size: int[source]

None

tags_str: str[source]

None

config: dict[source]

None

static resolve_created_by_id(obj)[source]

static resolve_created_by_username(obj)[source]

static resolve_snapshots(obj, context)[source]

class archivebox.api.v1_crawls.CrawlUpdateSchema[source]

Bases: ninja.Schema

status: str | None[source]

None

retry_at: datetime.datetime | None[source]

None

tags: list[str] | None[source]

None

tags_str: str | None[source]

None

class archivebox.api.v1_crawls.CrawlCreateSchema[source]

Bases: ninja.Schema

urls: list[str][source]

None

max_depth: int[source]

0

max_urls: int[source]

0

max_size: int[source]

0

tags: list[str] | None[source]

None

tags_str: str = <Multiline-String>[source]

label: str = <Multiline-String>[source]

notes: str = <Multiline-String>[source]

config: dict[source]

None
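
As an illustration of how the CrawlCreateSchema fields fit together, the following is a minimal sketch of a client-side helper that assembles a request body using the field names listed above. The helper name `build_crawl_create_payload`, the empty-string defaults for `label` and `notes`, and the check for an empty `urls` list are assumptions made for this example; they are not part of the documented module.

```python
import json


def build_crawl_create_payload(urls, max_depth=0, max_urls=0, max_size=0,
                               tags=None, label="", notes="", config=None):
    """Assemble a dict keyed by the CrawlCreateSchema field names.

    Hypothetical helper: the defaults for label/notes and the empty-urls
    check are assumptions of this sketch, not documented behaviour.
    """
    if not urls:
        raise ValueError("urls must contain at least one URL")
    payload = {
        "urls": list(urls),      # schema type: list[str]
        "max_depth": max_depth,  # documented default: 0
        "max_urls": max_urls,    # documented default: 0
        "max_size": max_size,    # documented default: 0
        "label": label,
        "notes": notes,
    }
    # Optional fields are only sent when the caller supplies them.
    if tags is not None:
        payload["tags"] = list(tags)
    if config is not None:
        payload["config"] = dict(config)
    return payload


example_payload = build_crawl_create_payload(
    ["https://example.com"], max_depth=1, tags=["docs"]
)
example_body = json.dumps(example_payload)  # JSON text for a request body
```

The resulting JSON string could be sent as the body of a create request; how the request is routed and authenticated depends on the deployment and is not shown here.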

class archivebox.api.v1_crawls.CrawlDeleteResponseSchema[source]

Bases: ninja.Schema

success: bool[source]

None

crawl_id: str[source]

None

deleted_count: int[source]

None

deleted_snapshots: int[source]

None

archivebox.api.v1_crawls.normalize_tag_list(tags: list[str] | None = None, tags_str: str = '') → list[str][source]
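
The signature above takes both a `tags` list and a `tags_str` string and returns a single list. The page does not document how the two are combined, so the following is only a sketch of one plausible behaviour. It assumes `tags_str` is comma-separated, that surrounding whitespace is stripped, and that duplicates are dropped while preserving order. The function is named `normalize_tag_list_sketch` to make clear that it is not the module's implementation.

```python
from __future__ import annotations


def normalize_tag_list_sketch(tags: list[str] | None = None,
                              tags_str: str = "") -> list[str]:
    """Merge a tag list and a comma-separated tag string (assumed format)."""
    # Assumption: tags_str uses commas as separators.
    candidates = list(tags or []) + tags_str.split(",")
    seen: set[str] = set()
    result: list[str] = []
    for raw in candidates:
        name = raw.strip()
        # Skip empty entries and names already collected.
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


merged = normalize_tag_list_sketch(["news", " docs "], "docs, archive,,")
```
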
archivebox.api.v1_crawls.get_crawls(request: django.http.HttpRequest)[source]

archivebox.api.v1_crawls.create_crawl(request: django.http.HttpRequest, data: archivebox.api.v1_crawls.CrawlCreateSchema)[source]

archivebox.api.v1_crawls.get_crawl(request: django.http.HttpRequest, crawl_id: str, as_rss: bool = False, with_snapshots: bool = False, with_archiveresults: bool = False)[source]

Get a specific Crawl by id.

archivebox.api.v1_crawls.patch_crawl(request: django.http.HttpRequest, crawl_id: str, data: archivebox.api.v1_crawls.CrawlUpdateSchema)[source]

Update a crawl (e.g., set status=sealed to cancel queued work).
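
To make the `status=sealed` case concrete, here is a minimal sketch that builds (but does not send) a PATCH request whose JSON body sets the `status` field from CrawlUpdateSchema. The helper name `build_patch_crawl_request`, the placeholder endpoint URL, and the idea of passing authentication through `extra_headers` are assumptions; this page does not document the route or the auth scheme, so the caller must supply both.

```python
import json
import urllib.request


def build_patch_crawl_request(endpoint_url, status="sealed", extra_headers=None):
    """Build a PATCH request with a CrawlUpdateSchema-style JSON body.

    Hypothetical helper: endpoint_url must be the full URL of the crawl
    resource, which is not specified on this page.
    """
    body = json.dumps({"status": status}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    # Any authentication header is deployment-specific (assumption).
    headers.update(extra_headers or {})
    return urllib.request.Request(
        endpoint_url, data=body, headers=headers, method="PATCH"
    )


# Placeholder URL for illustration only; nothing is sent over the network.
request = build_patch_crawl_request("http://archivebox.example/CRAWL-ENDPOINT")
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch can run without a server.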

archivebox.api.v1_crawls.delete_crawl(request: django.http.HttpRequest, crawl_id: str)[source]