archivebox.cli package

Submodules

archivebox.cli.archivebox module

archivebox.cli.archivebox.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

archivebox.cli.archivebox_add module

archivebox.cli.archivebox_add.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Add a new URL or list of URLs to your archive
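Every subcommand module on this page exposes a main() with the same three optional parameters: args, stdin, and pwd. The following is a minimal sketch of an entry point written to that convention. It is not ArchiveBox's own implementation: collect_urls, the "example add" program name, and the printed message are hypothetical, chosen only to show how the three parameters might be consumed.

```python
import argparse
import io
import sys
from typing import IO, List, Optional


def collect_urls(args: List[str], stdin: Optional[IO]) -> List[str]:
    """Merge URLs given as arguments with URLs piped in on stdin (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="example add")
    parser.add_argument("urls", nargs="*", help="URLs to add")
    urls = list(parser.parse_args(args).urls)
    # Only read stdin when it is a pipe or file, never an interactive terminal.
    if stdin is not None and not stdin.isatty():
        urls.extend(line.strip() for line in stdin if line.strip())
    return urls


def main(args: Optional[List[str]] = None,
         stdin: Optional[IO] = None,
         pwd: Optional[str] = None) -> None:
    """Hypothetical entry point with the same parameters as the documented main()."""
    # Fall back to the process arguments when none are passed explicitly.
    args = sys.argv[1:] if args is None else args
    for url in collect_urls(args, stdin):
        print(f"would add {url} (working directory: {pwd or '.'})")


# Programmatic call: explicit args, an in-memory stdin, and a target directory.
main(args=["https://example.com"], stdin=io.StringIO("https://example.org\n"), pwd="/tmp/demo")
```

Keeping all three parameters optional lets the same function serve both as a console entry point and as a callable in tests, where stdin can be replaced by an in-memory stream.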

archivebox.cli.archivebox_config module

archivebox.cli.archivebox_config.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Get and set your ArchiveBox project configuration values

archivebox.cli.archivebox_help module

archivebox.cli.archivebox_help.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Print the ArchiveBox help message and usage

archivebox.cli.archivebox_info module

archivebox.cli.archivebox_info.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Print out some info and statistics about the archive collection

archivebox.cli.archivebox_init module

archivebox.cli.archivebox_init.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Initialize a new ArchiveBox collection in the current directory

archivebox.cli.archivebox_list module

archivebox.cli.archivebox_list.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

List, filter, and export information about archive entries

archivebox.cli.archivebox_manage module

archivebox.cli.archivebox_manage.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Run an ArchiveBox Django management command

archivebox.cli.archivebox_remove module

archivebox.cli.archivebox_remove.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Remove the specified URLs from the archive

archivebox.cli.archivebox_schedule module

archivebox.cli.archivebox_schedule.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Set ArchiveBox to regularly import URLs at specific times using cron

archivebox.cli.archivebox_server module

archivebox.cli.archivebox_server.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Run the ArchiveBox HTTP server

archivebox.cli.archivebox_shell module

archivebox.cli.archivebox_shell.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Enter an interactive ArchiveBox Django shell

archivebox.cli.archivebox_update module

archivebox.cli.archivebox_update.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Import any new links from subscriptions and retry any previously failed/skipped links

archivebox.cli.archivebox_version module

archivebox.cli.archivebox_version.main(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Print the ArchiveBox version and dependency information

archivebox.cli.logging module

class archivebox.cli.logging.RuntimeStats(skipped: int = 0, succeeded: int = 0, failed: int = 0, parse_start_ts: Optional[datetime.datetime] = None, parse_end_ts: Optional[datetime.datetime] = None, index_start_ts: Optional[datetime.datetime] = None, index_end_ts: Optional[datetime.datetime] = None, archiving_start_ts: Optional[datetime.datetime] = None, archiving_end_ts: Optional[datetime.datetime] = None)

Bases: object

Mutable stats counter for logging archiving timing info to CLI output

skipped = 0
succeeded = 0
failed = 0
parse_start_ts = None
parse_end_ts = None
index_start_ts = None
index_end_ts = None
archiving_start_ts = None
archiving_end_ts = None
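
The constructor signature and class-level defaults above suggest a dataclass-like container whose counters and timestamps are updated as a run progresses. The sketch below assumes that shape. StatsSketch is a hypothetical stand-in that mirrors only a subset of the documented fields, and the fixed timestamps are illustrative values.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class StatsSketch:
    """Hypothetical stand-in mirroring a subset of the documented RuntimeStats fields."""
    skipped: int = 0
    succeeded: int = 0
    failed: int = 0
    archiving_start_ts: Optional[datetime] = None
    archiving_end_ts: Optional[datetime] = None


stats = StatsSketch()

# Mark the start of a run, bump counters as results come in, then mark the end.
stats.archiving_start_ts = datetime(2020, 1, 1, 12, 0, 0)
stats.succeeded += 2
stats.failed += 1
stats.archiving_end_ts = datetime(2020, 1, 1, 12, 0, 30)

elapsed = (stats.archiving_end_ts - stats.archiving_start_ts).total_seconds()
print(f"{stats.succeeded} ok, {stats.failed} failed, {stats.skipped} skipped in {elapsed:.0f}s")
```

Storing start and end timestamps separately, rather than a single duration, lets a logger report elapsed time for each phase (parsing, indexing, archiving) independently.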

class archivebox.cli.logging.SmartFormatter(prog, indent_increment=2, max_help_position=24, width=None)

Bases: argparse.HelpFormatter

Patched formatter that prints newlines in argparse help strings
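
By default argparse collapses whitespace in help strings before wrapping them, so explicit line breaks are lost. One common way to keep them is to override the formatter's line-splitting hook, as sketched below. NewlineFormatter is a hypothetical class and may differ from how SmartFormatter is actually written; _split_lines is an underscore-prefixed method of argparse.HelpFormatter, so overriding it relies on an implementation detail of the standard library.

```python
import argparse


class NewlineFormatter(argparse.HelpFormatter):
    """Hypothetical formatter that keeps explicit newlines in help strings."""

    def _split_lines(self, text, width):
        # The stock formatter collapses all whitespace before wrapping;
        # honor explicit line breaks when the help text contains any.
        if "\n" in text:
            return text.splitlines()
        return super()._split_lines(text, width)


parser = argparse.ArgumentParser(prog="demo", formatter_class=NewlineFormatter)
parser.add_argument("--filter-type", help="How to match patterns:\n  exact\n  regex")

help_text = parser.format_help()
print(help_text)
```

Help strings without a newline still go through the default wrapping, so only options that opt in with explicit line breaks are affected.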

archivebox.cli.logging.reject_stdin(caller: str, stdin: Optional[IO] = sys.stdin) → None

Tell the user they passed stdin to a command that doesn’t accept it

archivebox.cli.logging.accept_stdin(stdin: Optional[IO] = sys.stdin) → Optional[str]

Accept any standard input and return it as a string or None
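
A minimal sketch of the behavior described above, assuming that "no input" means either an interactive terminal or an empty stream. read_piped_input is a hypothetical name, not the ArchiveBox function.

```python
import io
import sys
from typing import IO, Optional


def read_piped_input(stdin: Optional[IO] = None) -> Optional[str]:
    """Hypothetical helper: return piped input as a string, or None if there is none."""
    # Resolve sys.stdin at call time rather than freezing it as a default value.
    stream = sys.stdin if stdin is None else stdin
    # An interactive terminal means nothing was piped in; reading would block.
    if stream is None or stream.isatty():
        return None
    data = stream.read()
    return data if data else None


print(read_piped_input(io.StringIO("https://example.com\n")))
```

Checking isatty() before reading is what keeps a command from hanging while it waits for keyboard input when the user did not pipe anything in.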

class archivebox.cli.logging.TimedProgress(seconds, prefix='')

Bases: object

Show a progress bar and measure elapsed time until .end() is called

end()

Immediately end progress, clear the progress bar line, and save end_ts

archivebox.cli.logging.progress_bar(seconds: int, prefix: str = '') → None

Show a timer in the form of a progress bar, with percentage and seconds remaining
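
The rendering step of such a timer can be sketched as a pure function that turns elapsed and total seconds into one line of text. render_progress, its width parameter, and the bar characters are assumptions made for illustration; the real progress_bar writes to the terminal and returns None.

```python
def render_progress(elapsed: int, total: int, width: int = 20, prefix: str = "") -> str:
    """Hypothetical renderer: one line with a bar, percentage, and seconds remaining."""
    # Clamp so the bar never underflows or overflows.
    elapsed = max(0, min(elapsed, total))
    fraction = elapsed / total if total else 1.0
    filled = int(width * fraction)
    bar = "#" * filled + "-" * (width - filled)
    return f"{prefix}[{bar}] {fraction * 100:.0f}% ({total - elapsed}s left)"


for second in (0, 15, 60):
    print(render_progress(second, 60))
```

Separating rendering from the loop that sleeps and redraws makes the formatting easy to test without waiting on a real timer.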

archivebox.cli.logging.log_parsing_started(source_file: str)
archivebox.cli.logging.log_parsing_finished(num_parsed: int, num_new_links: int, parser_name: str)
archivebox.cli.logging.log_indexing_process_started(num_links: int)
archivebox.cli.logging.log_indexing_process_finished()
archivebox.cli.logging.log_indexing_started(out_path: str)
archivebox.cli.logging.log_indexing_finished(out_path: str)
archivebox.cli.logging.log_archiving_started(num_links: int, resume: Optional[float] = None)
archivebox.cli.logging.log_archiving_paused(num_links: int, idx: int, timestamp: str)
archivebox.cli.logging.log_archiving_finished(num_links: int)
archivebox.cli.logging.log_archive_method_started(method: str)
archivebox.cli.logging.log_archive_method_finished(result: archivebox.index.schema.ArchiveResult)

Quote any argument that contains whitespace so the user can copy-paste the printed command and run it directly

archivebox.cli.logging.log_list_started(filter_patterns: Optional[List[str]], filter_type: str)
archivebox.cli.logging.log_list_finished(links)
archivebox.cli.logging.log_removal_started(links: List[archivebox.index.schema.Link], yes: bool, delete: bool)
archivebox.cli.logging.log_removal_finished(all_links: int, to_keep: int)
archivebox.cli.logging.log_shell_welcome_msg()
archivebox.cli.logging.pretty_path(path: str) → str

Convert paths like …/ArchiveBox/archivebox/../output/abc into output/abc
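
The transformation described above can be sketched with os.path alone: normalize away the ".." segment, then strip a known base directory. shorten_path and its base_dir parameter are hypothetical, and the /srv/ArchiveBox prefix in the example is an illustrative stand-in for the elided part of the path.

```python
import os


def shorten_path(path: str, base_dir: str) -> str:
    """Hypothetical helper: collapse '..' segments and strip a known base directory."""
    normalized = os.path.normpath(path)
    base = os.path.normpath(base_dir)
    # Only strip the prefix when the path really lives under base_dir.
    if normalized == base or normalized.startswith(base + os.sep):
        return os.path.relpath(normalized, base)
    return normalized


print(shorten_path("/srv/ArchiveBox/archivebox/../output/abc", "/srv/ArchiveBox"))
```

Paths outside the base directory are returned normalized but otherwise unchanged, so the output never contains a misleading chain of "../" segments.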

archivebox.cli.logging.printable_filesize(num_bytes: Union[int, float]) → str
archivebox.cli.logging.printable_folders(folders: Dict[str, Optional[archivebox.index.schema.Link]], json: bool = False, csv: Optional[str] = None) → str
archivebox.cli.logging.printable_config(config: importlib._bootstrap.ConfigDict, prefix: str = '') → str
archivebox.cli.logging.printable_folder_status(name: str, folder: Dict[KT, VT]) → str
archivebox.cli.logging.printable_dependency_version(name: str, dependency: Dict[KT, VT]) → str
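
The printable_* helpers above carry no docstrings. As an illustration of the kind of formatting a function with the signature of printable_filesize might perform, here is a minimal sketch that scales a byte count by powers of 1024. human_filesize, the unit labels, and the two-decimal format are assumptions, not the documented behavior.

```python
from typing import Union


def human_filesize(num_bytes: Union[int, float]) -> str:
    """Hypothetical formatter: scale a byte count to the largest unit below 1024."""
    size = float(num_bytes)
    for unit in ("Bytes", "KB", "MB", "GB"):
        if abs(size) < 1024.0:
            return f"{size:.2f} {unit}"
        size /= 1024.0
    # Anything that survives four divisions is reported in terabytes.
    return f"{size:.2f} TB"


for value in (512, 1536, 5 * 1024 ** 3):
    print(human_filesize(value))
```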

archivebox.cli.tests module

archivebox.cli.tests.output_hidden(show_failing=True)

class archivebox.cli.tests.TestInit(methodName='runTest')

Bases: unittest.case.TestCase

setUp()

Hook method for setting up the test fixture before exercising it.

tearDown()

Hook method for deconstructing the test fixture after testing it.

test_basic_init()
test_conflicting_init()
test_no_dirty_state()

class archivebox.cli.tests.TestAdd(methodName='runTest')

Bases: unittest.case.TestCase

setUp()

Hook method for setting up the test fixture before exercising it.

tearDown()

Hook method for deconstructing the test fixture after testing it.

test_add_arg_url()
test_add_arg_file()
test_add_stdin_url()

class archivebox.cli.tests.TestRemove(methodName='runTest')

Bases: unittest.case.TestCase

setUp()

Hook method for setting up the test fixture before exercising it.

test_remove_exact()
test_remove_regex()
test_remove_domain()
test_remove_none()

Module contents

archivebox.cli.list_subcommands() → Dict[str, str]

Find and import all valid archivebox_<subcommand>.py files in CLI_DIR
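
The discovery half of this can be sketched as a directory scan for files matching the archivebox_<subcommand>.py pattern. The sketch below does not import the modules it finds, and it stores each file's path as the dictionary value; the documented return type is Dict[str, str], but what the values hold is not stated here, so that choice is an assumption. find_subcommand_files and the temporary demo directory are hypothetical.

```python
import os
import tempfile
from typing import Dict


def find_subcommand_files(cli_dir: str) -> Dict[str, str]:
    """Hypothetical scan: map subcommand name -> path for archivebox_<name>.py files."""
    prefix, suffix = "archivebox_", ".py"
    found = {}
    for filename in sorted(os.listdir(cli_dir)):
        if filename.startswith(prefix) and filename.endswith(suffix):
            name = filename[len(prefix):-len(suffix)]
            # Skip files whose subcommand part is empty or not a valid identifier.
            if name.isidentifier():
                found[name] = os.path.join(cli_dir, filename)
    return found


# Demo against a throwaway directory containing matching and non-matching files.
with tempfile.TemporaryDirectory() as demo_dir:
    for filename in ("archivebox_add.py", "archivebox_list.py", "logging.py", "archivebox_.py"):
        open(os.path.join(demo_dir, filename), "w").close()
    demo_names = list(find_subcommand_files(demo_dir))
    print(demo_names)
```

Deriving the command list from filenames means a new subcommand becomes available by adding one file, with no central registry to update.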

archivebox.cli.run_subcommand(subcommand: str, subcommand_args: List[str] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Run a given ArchiveBox subcommand with the given list of args
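
Dispatching by name can be sketched as a lookup in a table of handlers that all share the (args, stdin, pwd) parameters. This is an illustration of the pattern, not ArchiveBox's code: REGISTRY, dispatch, fake_version, and the calls list are hypothetical, and the registry is filled by hand here.

```python
from typing import Callable, Dict, IO, List, Optional

# Every handler takes the same (args, stdin, pwd) triple as the documented functions.
Handler = Callable[[Optional[List[str]], Optional[IO], Optional[str]], None]

calls: List[str] = []  # records invocations so the dispatch is observable


def fake_version(args: Optional[List[str]] = None,
                 stdin: Optional[IO] = None,
                 pwd: Optional[str] = None) -> None:
    """Stand-in handler; a real one would do the subcommand's work."""
    calls.append(f"version {args or []}")


# Hypothetical registry; this sketch fills it by hand.
REGISTRY: Dict[str, Handler] = {"version": fake_version}


def dispatch(subcommand: str,
             subcommand_args: Optional[List[str]] = None,
             stdin: Optional[IO] = None,
             pwd: Optional[str] = None) -> None:
    """Look up the handler by name and forward the remaining arguments to it."""
    if subcommand not in REGISTRY:
        raise ValueError(f"unknown subcommand: {subcommand}")
    REGISTRY[subcommand](subcommand_args, stdin, pwd)


dispatch("version", ["example-arg"])
print(calls)
```

Because every handler has the same parameters, the dispatcher needs no per-command logic beyond the lookup.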

archivebox.cli.help(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Print the ArchiveBox help message and usage

archivebox.cli.version(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Print the ArchiveBox version and dependency information

archivebox.cli.init(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Initialize a new ArchiveBox collection in the current directory

archivebox.cli.info(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Print out some info and statistics about the archive collection

archivebox.cli.config(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Get and set your ArchiveBox project configuration values

archivebox.cli.add(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Add a new URL or list of URLs to your archive

archivebox.cli.remove(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Remove the specified URLs from the archive

archivebox.cli.update(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Import any new links from subscriptions and retry any previously failed/skipped links

archivebox.cli.list(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

List, filter, and export information about archive entries

archivebox.cli.shell(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Enter an interactive ArchiveBox Django shell

archivebox.cli.manage(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Run an ArchiveBox Django management command

archivebox.cli.server(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Run the ArchiveBox HTTP server

archivebox.cli.schedule(args: Optional[List[str]] = None, stdin: Optional[IO] = None, pwd: Optional[str] = None) → None

Set ArchiveBox to regularly import URLs at specific times using cron