Changelog¶
▶️ If you’re having an issue with a breaking change, or migrating your data between versions, open an issue to get help.
ArchiveBox
was previously named Pocket Archive Stream
and then Bookmark Archiver
.
See the releases page for versioned source downloads and full changelog.🍰 Many thanks to our 30+ contributors and everyone in the web archiving community! 🏛
- v0.2.4 released
- better archive corruption guards (check structure invariants on every parse & save)
- remove title prefetching in favor of new FETCH_TITLE archive method
- slightly improved CLI output for parsing and remote url downloading
- re-save index after archiving completes to update titles and urls
- remove redundant derivable data from link json schema
- markdown link parsing support
- faster link parsing and better symbol handling using a new compiled URL_REGEX
- v0.2.3 released
- fixed issues with parsing titles including trailing tags
- fixed issues with titles defaulting to URLs instead of attempting to fetch
- fixed issue where bookmark timestamps from RSS would be ignored and current ts used instead
- fixed issue where ONLY_NEW would overwrite existing links in archive with only new ones
- fixed lots of issues with URL parsing by using
urllib.parse
instead of hand-written lambdas - ignore robots.txt when using wget (ssshhh don’t tell anyone 😁)
- fix RSS parser bailing out when there’s whitespace around XML tags
- fix issue with browser history export trying to run ls on wrong directory
- v0.2.2 released
- Shaarli RSS export support
- Fix issues with plain text link parsing including quotes, whitespace, and closing tags in URLs
- add USER_AGENT to archive.org submissions so they can track archivebox usage
- remove all icons similar to archive.org branding from archive UI
- hide some of the noisier youtubedl and wget errors
- set permissions on youtubedl media folder
- fix chrome data dir incorrect path and quoting
- better chrome binary finding
- show which parser is used when importing links, show progress when fetching titles
- v0.2.1 released with new logo
- ability to import plain lists of links and almost all other raw filetypes
- WARC saving support via wget
- Git repository downloading with git clone
- Media downloading with youtube-dl (video, audio, subtitles, description, playlist, etc)
- v0.2.0 released with new name
- renamed from Bookmark Archiver -> ArchiveBox
- v0.1.0 released
- support for browser history exporting added with
./bin/archivebox-export-browser-history
- support for chrome
--dump-dom
to output full page HTML after JS executes
- v0.0.3 released
- support for chrome
--user-data-dir
to archive sites that need logins - fancy individual html & json indexes for each link
- smartly append new links to existing index instead of overwriting
- v0.0.2 released
- proper HTML templating instead of format strings (thanks to https://github.com/bardisty!)
- refactored into separate files, wip audio & video archiving
- v0.0.1 released
- Index links now work without nginx url rewrites, archive can now be hosted on github pages
- added setup.sh script & docstrings & help commands
- made Chromium the default instead of Google Chrome (yay free software)
- added env-variable configuration (thanks to https://github.com/hannah98!)
- renamed from Pocket Archive Stream -> Bookmark Archiver
- added Netscape-format export support (thanks to https://github.com/ilvar!)
- added Pinboard-format export support (thanks to https://github.com/sconeyard!)
- front-page of HN, oops! apparently I have users to support now :grin:?
- added Pocket-format export support
- v0.0.0 released: created Pocket Archive Stream 2017/05/05