Documentaries
Guides and documentation
- Manual de defesa contra a censura nas escolas (about)
- Personal Digital Archiving | Digital Preservation - Library of Congress
- Websites, Blogs, Social Media - Personal Archiving | Digital Preservation - Library of Congress
- Web ARChive - Wikipedia
- Archiving web sites - LWN
- Archiving web sites - anarcat blog
- Restoring - Archive Team Wiki
Projects, Communities, and Organizations
- The Eye - non-profit website dedicated towards content archival and long-term preservation (site)
- Web Archiving Community
- Queima de arquivo não! (PL 7920/2017 - Document Digitization)
- Climate change mirror (saved data / how to help)
- r/DataHoarder
- Archive Team / ArchiveBot
- IIPC - netpreserve.org
- Community Owned digital Preservation Tool Registry (COPTR)
- DigiPres Commons Community-owned digital preservation resources
- PrestoSpace
- Emulation-As-A-Service
- Brasil.IO
- AUT Project
- Webarchive UNESCO
- Projeto Brasil Nunca Mais - hosted by the Ministério Público Federal
- The End of Term Web Archive
- NetarchiveSuite (code)
- Common Crawl
- International Internet Preservation Consortium
- List of Web archiving initiatives
Software
- Wget, with a tip for faster downloads
- Wget2
- httrack, httraqt (httrack GUI), webhttrack, and httrack-android
- crawl
- aria2 package
- wpull
- grab-site
- ckanext-harvest
- Web archiving using Google Chrome
- IIPC - General resource on web archiving
- Public web archives
- Datatogether Research
- Comparison of the software tools
- Internet Archive crawler
- WebArchiver
- Warrick: free utility for reconstructing (or recovering) a website when a back-up is not available. Warrick utilizes the Memento Framework (http://www.mementoweb.org) to discover archived versions of resources from web archives. The resources are gathered to provide a single collection of files.
- Perma.cc - a service that helps prevent link rot: a paid service, but its code is open source
- Wayback Machine Downloader: download an entire website from the Wayback Machine.
- OpenArchive:
- WARC:
- Indexing / full text search:
  - Webarchive Discovery: WARC and ARC indexing and discovery tools: provides full-text search for our web archives.
  - SolrWayBack: A search interface and wayback machine for the UKWA Solr based warc-indexer framework.
  - shine
  - warclight
  - lockss-solr
- Collection signing:
Specifications
- RFC7089 - HTTP Framework for Time-Based Access to Resource States – Memento
- RFC5854 - The Metalink Download Description Format
- Format Description Categories - Sustainability of Digital Formats | Library of Congress
- The WARC Format 1.1
- ARC_IA, Internet Archive ARC file format
- WARC, Web ARChive file format
- IIPC Framework Working Group - The WARC File Format (Version 0.9)
- Draft: Information and Documentation - The WARC file format - v1
- Format Descriptions for Archived Web Sites and Pages
- Wayback Machine API
- FAIR data (principles, Wikipedia article)
- The Open Definition
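
As a rough illustration of the Wayback Machine API entry listed above, the sketch below builds a query URL for the availability endpoint (`https://archive.org/wayback/available`) and extracts the closest snapshot from a response. It makes no network request. The function names and the sample response are hypothetical, written by hand for illustration, and the response layout is an assumption based on the commonly documented shape of that endpoint.

```python
import json
from urllib.parse import urlencode

# Availability endpoint of the Wayback Machine API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build a query URL asking whether page_url has an archived snapshot.

    timestamp is optional and uses the YYYYMMDD[hhmmss] form.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the URL of the closest available snapshot, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hand-written sample response (an assumption, not captured output).
SAMPLE_RESPONSE = """
{
  "archived_snapshots": {
    "closest": {
      "available": true,
      "url": "http://web.archive.org/web/20060101064348/http://www.example.com:80/",
      "timestamp": "20060101064348",
      "status": "200"
    }
  }
}
"""

if __name__ == "__main__":
    print(availability_url("example.com", "20060101"))
    print(closest_snapshot(SAMPLE_RESPONSE))
```

To use it against the live service, fetch the URL returned by `availability_url` with any HTTP client and pass the response body to `closest_snapshot`.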