archivebox.misc.hashing

Module Contents

Functions

_cached_file_hash

Internal function to calculate file hash with cache key based on path, size and mtime.

hash_file

Calculate SHA256 hash of a file with caching based on path, size and mtime.

get_dir_hashes

Calculate SHA256 hashes for all files and directories recursively.

get_dir_entries

Get filtered list of directory entries.

get_dir_sizes

Calculate sizes for all files and directories recursively.

get_dir_info

Get detailed information about directory contents including hashes and sizes.

API

archivebox.misc.hashing._cached_file_hash(filepath: str, size: int, mtime: float) -> str

Internal function to calculate file hash with cache key based on path, size and mtime.

archivebox.misc.hashing.hash_file(file_path: pathlib.Path, pwd: pathlib.Path | None = None) -> str

Calculate SHA256 hash of a file with caching based on path, size and mtime.
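
The caching idea described for `_cached_file_hash` and `hash_file` can be sketched as follows. This is a minimal illustration, not ArchiveBox's implementation: the names `cached_sha256` and `sha256_of_file`, the use of `functools.lru_cache`, the cache size, and the chunk size are all assumptions, and the `pwd` parameter is left out because its behavior is not described here.

```python
import hashlib
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=1024)  # hypothetical cache size
def cached_sha256(filepath: str, size: int, mtime: float) -> str:
    """Hash the file's bytes; size and mtime only serve as part of the cache key."""
    digest = hashlib.sha256()
    with open(filepath, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(file_path: Path) -> str:
    """Stat the file, then look up (or compute) its hash under that key."""
    stat = file_path.stat()
    return cached_sha256(str(file_path.resolve()), stat.st_size, stat.st_mtime)
```

Because size and mtime are part of the key, a file that is rewritten gets a new cache entry and is re-hashed, while an untouched file is served from the cache.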

archivebox.misc.hashing.get_dir_hashes(dir_path: pathlib.Path, pwd: pathlib.Path | None = None, filter_func: collections.abc.Callable | None = None, max_depth: int = -1) -> dict[str, str]

Calculate SHA256 hashes for all files and directories recursively.
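
One way to hash files and directories recursively is sketched below. It is an illustration under stated assumptions rather than the actual behavior of `get_dir_hashes`: the name `sketch_dir_hashes`, the choice to derive a directory's hash from its children's hashes, and the use of `"."` as the key for the root are all hypothetical, and `pwd`, `filter_func`, and `max_depth` are omitted.

```python
import hashlib
from pathlib import Path


def sketch_dir_hashes(dir_path: Path) -> dict[str, str]:
    """Map each relative path under dir_path to a SHA256 hex digest."""
    hashes: dict[str, str] = {}

    def visit(path: Path) -> str:
        if path.is_dir():
            # Assumption: a directory's hash is the SHA256 of its children's
            # hashes, taken in sorted order so the result is deterministic.
            child_hashes = [visit(child) for child in sorted(path.iterdir())]
            digest = hashlib.sha256("\n".join(child_hashes).encode()).hexdigest()
        else:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # Assumption: the root directory is stored under the key ".".
        key = "." if path == dir_path else path.relative_to(dir_path).as_posix()
        hashes[key] = digest
        return digest

    visit(dir_path)
    return hashes
```

With this scheme, changing any file changes the hash of every directory above it, so comparing the root hash is enough to detect a change anywhere in the tree.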

archivebox.misc.hashing.get_dir_entries(dir_path: pathlib.Path, pwd: pathlib.Path | None = None, recursive: bool = True, include_files: bool = True, include_dirs: bool = True, include_hidden: bool = False, filter_func: collections.abc.Callable | None = None, max_depth: int = -1) -> tuple[str, ...]

Get filtered list of directory entries.
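
The filtering flags in the signature above could work along these lines. This is a sketch only: `sketch_dir_entries` is a hypothetical name, it covers just a subset of the parameters, and the meanings given to "hidden" (any path component starting with a dot) and to `max_depth=-1` (no limit) are assumptions that the source does not confirm.

```python
from pathlib import Path


def sketch_dir_entries(
    dir_path: Path,
    include_files: bool = True,
    include_dirs: bool = True,
    include_hidden: bool = False,
    max_depth: int = -1,
) -> tuple[str, ...]:
    """Return sorted relative paths under dir_path that pass the filters."""
    entries = []
    for path in dir_path.rglob("*"):
        rel = path.relative_to(dir_path)
        # Assumption: "hidden" means any path component starts with a dot.
        if not include_hidden and any(part.startswith(".") for part in rel.parts):
            continue
        # Assumption: max_depth=-1 means no limit; depth 1 is a direct child.
        if max_depth >= 0 and len(rel.parts) > max_depth:
            continue
        if path.is_dir() and not include_dirs:
            continue
        if path.is_file() and not include_files:
            continue
        entries.append(rel.as_posix())
    return tuple(sorted(entries))
```

Returning a sorted tuple keeps the result immutable and its order stable between calls, which matters if the entries later feed into a hash.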

archivebox.misc.hashing.get_dir_sizes(dir_path: pathlib.Path, pwd: pathlib.Path | None = None, **kwargs) -> dict[str, int]

Calculate sizes for all files and directories recursively.
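
Recursive size calculation can be sketched in the same style. The name `sketch_dir_sizes`, the rule that a directory's size is the sum of the file sizes beneath it, and the `"."` key for the root are assumptions made for illustration; `pwd` and `**kwargs` are omitted.

```python
from pathlib import Path


def sketch_dir_sizes(dir_path: Path) -> dict[str, int]:
    """Map each relative path under dir_path to its size in bytes."""
    sizes: dict[str, int] = {}

    def visit(path: Path) -> int:
        if path.is_dir():
            # Assumption: a directory's size is the total of everything below it.
            total = sum(visit(child) for child in path.iterdir())
        else:
            total = path.stat().st_size
        # Assumption: the root directory is stored under the key ".".
        key = "." if path == dir_path else path.relative_to(dir_path).as_posix()
        sizes[key] = total
        return total

    visit(dir_path)
    return sizes
```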

archivebox.misc.hashing.get_dir_info(dir_path: pathlib.Path, pwd: pathlib.Path | None = None, filter_func: collections.abc.Callable | None = None, max_depth: int = -1) -> dict

Get detailed information about directory contents including hashes and sizes.