Changelog

▶️ If you’re having an issue with a breaking change, or migrating your data between versions, open an issue to get help.

ArchiveBox was previously named Pocket Archive Stream and then Bookmark Archiver.


THIS PAGE HAS BEEN MOVED: See the releases page for versioned source downloads and the full changelog.

🍰 Many thanks to our 100+ contributors and everyone in the web archiving community! 🏛

Old release notes:

  • v0.2.4 released

  • better archive corruption guards (check structure invariants on every parse & save)

  • remove title prefetching in favor of new FETCH_TITLE archive method

  • slightly improved CLI output for parsing and remote url downloading

  • re-save index after archiving completes to update titles and urls

  • remove redundant derivable data from link json schema

  • markdown link parsing support

  • faster link parsing and better symbol handling using a new compiled URL_REGEX
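The compiled-regex approach named in the last item above can be sketched roughly as follows. This is a minimal illustration only: the pattern shown is an assumption and not ArchiveBox's actual URL_REGEX, and `find_urls` is a hypothetical helper name.

```python
import re

# Hypothetical pattern for illustration only; NOT the project's actual URL_REGEX.
# Compiling once at import time lets every later parse reuse the same pattern object.
URL_REGEX = re.compile(
    r"https?://"          # scheme
    r"[^\s<>\"')\]]+",    # everything up to whitespace, quotes, or a closing bracket
    re.IGNORECASE,
)

def find_urls(text):
    """Return every URL-looking substring in text, in order of appearance."""
    return URL_REGEX.findall(text)

sample = 'See [docs](https://example.com/a?b=1) and <a href="http://example.org/x">x</a>.'
print(find_urls(sample))
```

Stopping the match at quotes and closing brackets is one simple way to keep markdown and HTML delimiters out of the extracted URLs; a real pattern would likely need to handle more edge cases.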


  • v0.2.3 released

  • fixed issues with parsing titles including trailing tags

  • fixed issues with titles defaulting to URLs instead of attempting to fetch

  • fixed issue where bookmark timestamps from RSS would be ignored and the current timestamp used instead

  • fixed issue where ONLY_NEW would overwrite existing links in archive with only new ones

  • fixed lots of issues with URL parsing by using urllib.parse instead of hand-written lambdas

  • ignore robots.txt when using wget (ssshhh don’t tell anyone 😁)

  • fix RSS parser bailing out when there’s whitespace around XML tags

  • fix issue with browser history export trying to run ls on wrong directory
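For the urllib.parse item in the v0.2.3 notes above, the following sketch shows the general idea of relying on the standard library instead of hand-written string slicing. The helper names `domain` and `without_query` are assumptions made for illustration and are not taken from the ArchiveBox codebase.

```python
from urllib.parse import urlparse

# Hypothetical helpers for illustration; the names are assumptions, not the project's API.
def domain(url):
    """Return the network location (host and optional port) of url."""
    return urlparse(url).netloc

def without_query(url):
    """Return url with its query string and fragment removed."""
    return urlparse(url)._replace(query="", fragment="").geturl()

url = "https://example.com:8080/path/page.html?utm=1#top"
print(domain(url))
print(without_query(url))
```

Delegating to `urlparse` means ports, query strings, and fragments are split consistently, which is typically where ad hoc string handling goes wrong.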


  • v0.2.2 released

  • Shaarli RSS export support

  • fix issues with plain text link parsing including quotes, whitespace, and closing tags in URLs

  • add USER_AGENT to archive.org submissions so they can track ArchiveBox usage

  • remove all icons similar to archive.org branding from archive UI

  • hide some of the noisier youtube-dl and wget errors

  • set permissions on youtube-dl media folder

  • fix incorrect path and quoting of the chrome data dir

  • better chrome binary finding

  • show which parser is used when importing links, show progress when fetching titles


  • v0.2.1 released with new logo

  • ability to import plain lists of links and almost all other raw filetypes

  • WARC saving support via wget

  • Git repository downloading with git clone

  • Media downloading with youtube-dl (video, audio, subtitles, description, playlist, etc)


  • v0.2.0 released with new name

  • renamed from Bookmark Archiver -> ArchiveBox


  • v0.1.0 released

  • support for browser history exporting added with ./bin/archivebox-export-browser-history

  • support for chrome --dump-dom to output full page HTML after JS executes


  • v0.0.3 released

  • support for chrome --user-data-dir to archive sites that need logins

  • fancy individual html & json indexes for each link

  • smartly append new links to existing index instead of overwriting


  • v0.0.2 released

  • proper HTML templating instead of format strings (thanks to https://github.com/bardisty!)

  • refactored into separate files, work-in-progress audio & video archiving



  • v0.0.0 released: created Pocket Archive Stream 2017/05/05