archivebox.parsers package
Submodules
archivebox.parsers.generic_json module
archivebox.parsers.generic_rss module
archivebox.parsers.generic_txt module
archivebox.parsers.medium_rss module
archivebox.parsers.netscape_html module
archivebox.parsers.pinboard_rss module
archivebox.parsers.pocket_html module
archivebox.parsers.shaarli_rss module
Module contents
Everything related to parsing links from input sources.
For a list of supported services, see README.md. For examples of supported import formats, see tests/.
archivebox.parsers.parse_links_memory(urls: List[str])
Parse a list of URLs without touching the filesystem.
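To illustrate what parsing "without touching the filesystem" can look like, here is a minimal sketch that wraps a list of strings in an in-memory text buffer and extracts URLs from it as if it were a file. It is not ArchiveBox's implementation: `extract_urls`, `parse_urls_in_memory`, and `URL_PATTERN` are hypothetical names, and the sketch returns plain URL strings rather than `Link` objects.

```python
import io
import re
from typing import IO, List

# Hypothetical URL pattern for illustration only; real parsers are stricter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(stream: IO[str]) -> List[str]:
    """Return every http(s) URL found in a text stream, in order, deduplicated."""
    seen = set()
    found = []
    for line in stream:
        for url in URL_PATTERN.findall(line):
            if url not in seen:
                seen.add(url)
                found.append(url)
    return found


def parse_urls_in_memory(urls: List[str]) -> List[str]:
    """Hypothetical stand-in: treat the list as file contents held in memory."""
    buffer = io.StringIO("\n".join(urls))  # file-like object, nothing written to disk
    return extract_urls(buffer)
```

Because the buffer is file-like, the same extraction function could in principle be shared between an in-memory entry point and a file-based one.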
archivebox.parsers.parse_links(source_file: str) → Tuple[List[archivebox.index.schema.Link], str]
Parse a list of URLs with their metadata from an RSS feed, bookmarks export, or text file.
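One way to support several input formats behind a single entry point is to try format-specific parsers in order and report which one succeeded. The sketch below shows that pattern under stated assumptions: the parser functions, the `PARSERS` list, and the reading of the returned `str` as a parser name are all hypothetical and are not taken from ArchiveBox's source. It also operates on text rather than on a `source_file` path, to stay self-contained.

```python
import json
import re
from typing import Callable, List, Tuple


def parse_json_export(text: str) -> List[str]:
    """Hypothetical parser: expects a JSON array of objects with a "url" key."""
    return [entry["url"] for entry in json.loads(text)]


def parse_plain_text(text: str) -> List[str]:
    """Hypothetical fallback parser: grab anything that looks like a URL."""
    return re.findall(r"https?://[^\s<>\"']+", text)


# Parsers are tried in this order; the catch-all fallback goes last.
PARSERS: List[Tuple[str, Callable[[str], List[str]]]] = [
    ("JSON", parse_json_export),
    ("Plain Text", parse_plain_text),
]


def parse_text(text: str) -> Tuple[List[str], str]:
    """Return (urls, parser_name) from the first parser that finds any URLs."""
    for name, parser in PARSERS:
        try:
            urls = parser(text)
        except (ValueError, TypeError, KeyError):
            continue  # input is not in this parser's format; try the next one
        if urls:
            return urls, name
    return [], "Failed to parse"
```

Ordering the specific formats before the generic fallback keeps a structured export from being treated as loose text.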
archivebox.parsers.save_text_as_source(raw_text: str, filename: str = '{ts}-stdin.txt', out_dir: str = '/home/docs/checkouts/readthedocs.org/user_builds/archivebox/checkouts/v0.4.13/docs') → str
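The signature above suggests a filename template with a `{ts}` placeholder and a string return value. As a rough sketch of how such a helper might behave, the example below fills `{ts}` with a Unix timestamp, writes the text into `out_dir`, and returns the resulting path. These behaviours are assumptions, since the original entry carries no description, and `save_text` is a hypothetical name rather than ArchiveBox's function.

```python
import os
import tempfile
from datetime import datetime


def save_text(raw_text: str, filename: str = "{ts}-stdin.txt", out_dir: str = ".") -> str:
    """Hypothetical helper: write raw_text to out_dir and return the file path."""
    # Assumption: {ts} stands for a whole-second Unix timestamp.
    ts = str(int(datetime.now().timestamp()))
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, filename.format(ts=ts))
    with open(path, "w", encoding="utf-8") as f:
        f.write(raw_text)
    return path


if __name__ == "__main__":
    # Usage: save pasted URLs into a temporary directory and show the path.
    with tempfile.TemporaryDirectory() as tmp:
        print(save_text("https://example.com\n", out_dir=tmp))
```

Saving the raw input before parsing keeps a copy of exactly what was submitted, which can help when a parser result needs to be checked later.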