archivebox.parsers
Everything related to parsing links from input sources.
For a list of supported services, see README.md. For examples of supported import formats, see tests/.
Submodules
archivebox.parsers.pinboard_rss
archivebox.parsers.shaarli_rss
archivebox.parsers.generic_rss
archivebox.parsers.readwise_reader_api
archivebox.parsers.generic_html
archivebox.parsers.wallabag_atom
archivebox.parsers.generic_txt
archivebox.parsers.medium_rss
archivebox.parsers.pocket_api
archivebox.parsers.pocket_html
archivebox.parsers.url_list
archivebox.parsers.generic_jsonl
archivebox.parsers.netscape_html
archivebox.parsers.generic_json
Package Contents
Functions
- parse_links_memory: parse a list of URLs without touching the filesystem
- parse_links: parse a list of URLs with their metadata from an RSS feed, bookmarks export, or text file
- download a given URL’s content into output/sources/domain-
Data
API
- archivebox.parsers.parse_links_memory(urls: List[str], root_url: Optional[str] = None)
parse a list of URLs without touching the filesystem
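As a rough illustration of what parsing "without touching the filesystem" can look like, the sketch below wraps a list of strings in an in-memory text buffer and extracts URLs from it. It is not ArchiveBox's implementation: `URL_PATTERN`, `extract_urls`, and `extract_urls_in_memory` are hypothetical names, the regex is deliberately simplified, and plain strings stand in for `Link` objects.

```python
import io
import re
from typing import IO, List

# Hypothetical, simplified URL pattern used only for this illustration.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text_file: IO[str]) -> List[str]:
    """Return each http(s) URL found in a text stream, in order, without duplicates."""
    seen = set()
    urls: List[str] = []
    for line in text_file:
        for match in URL_PATTERN.findall(line):
            if match not in seen:
                seen.add(match)
                urls.append(match)
    return urls


def extract_urls_in_memory(urls: List[str]) -> List[str]:
    """Parse a list of strings via an in-memory buffer instead of a file on disk."""
    # io.StringIO behaves like an open text file but lives entirely in memory.
    buffer = io.StringIO("\n".join(urls))
    return extract_urls(buffer)
```

Because the buffer satisfies the same `IO[str]` interface as an open file, the same extraction function can serve both in-memory and file-based callers.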
- archivebox.parsers.parse_links(source_file: str, root_url: Optional[str] = None, parser: str = 'auto') -> Tuple[List[archivebox.index.schema.Link], str]
parse a list of URLs with their metadata from an RSS feed, bookmarks export, or text file
- archivebox.parsers.run_parser_functions(to_parse: IO[str], timer, root_url: Optional[str] = None, parser: str = 'auto') -> Tuple[List[archivebox.index.schema.Link], Optional[str]]
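The signature above suggests that a stream is handed to one or more parsers and that the result pairs the parsed links with an optional parser name. This page does not say how `parser='auto'` chooses a parser, so the following is only a sketch of one plausible strategy: try each registered parser in order and return the first that yields any links. All names here (`PARSERS`, `run_parsers`, `parse_json_urls`, `parse_plain_text_urls`) are hypothetical, the `timer` and `root_url` parameters are omitted, and plain URL strings stand in for `Link` objects.

```python
import io
import json
import re
from typing import IO, Callable, List, Optional, Tuple


def parse_json_urls(stream: IO[str]) -> List[str]:
    """Hypothetical parser: expects a JSON array of objects with a "url" key."""
    stream.seek(0)
    data = json.load(stream)
    return [item["url"] for item in data if isinstance(item, dict) and "url" in item]


def parse_plain_text_urls(stream: IO[str]) -> List[str]:
    """Hypothetical fallback parser: pulls http(s) URLs out of free text."""
    stream.seek(0)
    return re.findall(r"https?://[^\s\"'<>]+", stream.read())


# Assumed registry of (name, function) pairs, most specific format first.
PARSERS: List[Tuple[str, Callable[[IO[str]], List[str]]]] = [
    ("json", parse_json_urls),
    ("txt", parse_plain_text_urls),
]


def run_parsers(stream: IO[str], parser: str = "auto") -> Tuple[List[str], Optional[str]]:
    """Return (urls, parser_name), or ([], None) if no parser found anything."""
    if parser == "auto":
        candidates = PARSERS
    else:
        candidates = [(name, func) for name, func in PARSERS if name == parser]
    for name, func in candidates:
        try:
            urls = func(stream)
        except Exception:
            # This format did not match the input; move on to the next parser.
            continue
        if urls:
            return urls, name
    return [], None
```

For example, `run_parsers(io.StringIO("visit https://example.com/b today"))` falls through the JSON parser and is handled by the plain-text one. A real implementation might choose differently, for instance by comparing how many links each parser finds.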