Changelog

▶️ If you’re having an issue with a breaking change, or migrating your data between versions, open an issue to get help.

ArchiveBox was previously named Pocket Archive Stream and then Bookmark Archiver.


THIS PAGE HAS BEEN MOVED: See the releases page for versioned source downloads and full changelog.

🍰 Many thanks to our 100+ contributors and everyone in the web archiving community! 🏛

Expand old release notes...
  • v0.4.9 released

    • pip install archivebox https://pypi.org/project/archivebox/

    • docker run archivebox/archivebox https://hub.docker.com/r/archivebox/archivebox

    • https://archivebox.readthedocs.io/en/latest/

    • https://github.com/ArchiveBox/ArchiveBox/releases

  • easy migration from previous versions

    cd path/to/your/archive/folder
    archivebox init
    archviebox add 'https://example.com'
    archviebox add 'https://getpocket.com/users/USERNAME/feed/all' --depth=1
    
  • full transition to Django Sqlite DB with migrations (making upgrades between versions much safer now)

  • maintains an intuitive and helpful CLI that’s backwards-compatible with all previous archivebox data versions

  • uses argparse instead of hand-written CLI system: see archivebox/cli/archivebox.py

  • new subcommands-based CLI for archivebox (see below)

  • new Web UI with pagination, better search, filtering, permissions, and more

  • 30+ assorted bugfixes, new features, and tickets closed

  • for more info, see: https://github.com/ArchiveBox/ArchiveBox/releases/tag/v0.4.9


  • v0.2.4 released

  • better archive corruption guards (check structure invariants on every parse & save)

  • remove title prefetching in favor of new FETCH_TITLE archive method

  • slightly improved CLI output for parsing and remote url downloading

  • re-save index after archiving completes to update titles and urls

  • remove redundant derivable data from link json schema

  • markdown link parsing support

  • faster link parsing and better symbol handling using a new compiled URL_REGEX


  • v0.2.3 released

  • fixed issues with parsing titles including trailing tags

  • fixed issues with titles defaulting to URLs instead of attempting to fetch

  • fixed issue where bookmark timestamps from RSS would be ignored and current ts used instead

  • fixed issue where ONLY_NEW would overwrite existing links in archive with only new ones

  • fixed lots of issues with URL parsing by using urllib.parse instead of hand-written lambdas

  • ignore robots.txt when using wget (ssshhh don’t tell anyone 😁)

  • fix RSS parser bailing out when there’s whitespace around XML tags

  • fix issue with browser history export trying to run ls on wrong directory


  • v0.2.2 released

  • Shaarli RSS export support

  • Fix issues with plain text link parsing including quotes, whitespace, and closing tags in URLs

  • add USER_AGENT to archive.org submissions so they can track archivebox usage

  • remove all icons similar to archive.org branding from archive UI

  • hide some of the noisier youtubedl and wget errors

  • set permissions on youtubedl media folder

  • fix chrome data dir incorrect path and quoting

  • better chrome binary finding

  • show which parser is used when importing links, show progress when fetching titles


  • v0.2.1 released with new logo

  • ability to import plain lists of links and almost all other raw filetypes

  • WARC saving support via wget

  • Git repository downloading with git clone

  • Media downloading with youtube-dl (video, audio, subtitles, description, playlist, etc)


  • v0.2.0 released with new name

  • renamed from Bookmark Archiver -> ArchiveBox


  • v0.1.0 released

  • support for browser history exporting added with ./bin/archivebox-export-browser-history

  • support for chrome --dump-dom to output full page HTML after JS executes


  • v0.0.3 released

  • support for chrome --user-data-dir to archive sites that need logins

  • fancy individual html & json indexes for each link

  • smartly append new links to existing index instead of overwriting


  • v0.0.2 released

  • proper HTML templating instead of format strings (thanks to https://github.com/bardisty!)

  • refactored into separate files, wip audio & video archiving


  • v0.0.1 released

  • Index links now work without nginx url rewrites, archive can now be hosted on github pages

  • added setup.sh script & docstrings & help commands

  • made Chromium the default instead of Google Chrome (yay free software)

  • added env-variable configuration (thanks to https://github.com/hannah98!)

  • renamed from Pocket Archive Stream -> Bookmark Archiver

  • added Netscape-format export support (thanks to https://github.com/ilvar!)

  • added Pinboard-format export support (thanks to https://github.com/sconeyard!)

  • front-page of HN, oops! apparently I have users to support now :grin:?

  • added Pocket-format export support


  • v0.0.0 released: created Pocket Archive Stream 2017/05/05