abx_plugin_htmltotext.htmltotext
Module Contents
Classes
Functions
save_htmltotext: extract search-indexing-friendly text from an HTML document
API
- abx_plugin_htmltotext.htmltotext.should_save_htmltotext(link: archivebox.index.schema.Link, out_dir: Optional[pathlib.Path] = None, overwrite: Optional[bool] = False) -> bool
- abx_plugin_htmltotext.htmltotext.save_htmltotext(link: archivebox.index.schema.Link, out_dir: Optional[pathlib.Path] = None, timeout: int = ARCHIVING_CONFIG.TIMEOUT) -> archivebox.index.schema.ArchiveResult
Extract search-indexing-friendly text from an HTML document.
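
To illustrate what extracting search-indexing-friendly text from HTML can look like, here is a minimal sketch built only on the standard library's `html.parser`. It is not the plugin's actual implementation: the names `TextExtractor`, `SKIP_TAGS`, and `html_to_text` are hypothetical, and the choice to drop the contents of `head`, `script`, `style`, and `noscript` is an assumption about what counts as indexable text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and skips non-content tags."""

    # Assumption: text inside these tags is not useful for a search index.
    SKIP_TAGS = {"head", "script", "style", "noscript"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before handle_data.
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside any skipped tag
        self._chunks = []     # non-empty text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # Join fragments with single spaces so words from adjacent tags stay separate.
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string as one space-separated line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<html><body><h1>Hello</h1><p>World &amp; more</p></body></html>"
    print(html_to_text(sample))
```

A function like `save_htmltotext` would presumably read a previously archived HTML file for the given `link`, run an extraction step of this kind, and record the outcome in an `ArchiveResult`; that wiring is not shown here because the source does not describe it.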