Keeping the data around
As you retrieve data from the government (or other sources), it’s easy to just consider the websites it has been released on as a permanent resource. Still, experience has shown that data does go away: whether it is through government re-designing its web sites, new policies that retract transparency rules or simple system failures.
At the same time, downloading complete copies of web sites - a process called mirroring - is a fairly well-established technique that can easily be deployed by civil society organisations. Mirroring involves an automated computer program (for a list see: http://en.wikipedia.org/wiki/Web_crawler) harvesting all the web pages from a specified web page, e.g. a ministry home page. In most cases, it is also possible to find old versions of web sites via the Internet Archive’s Wayback machine (http://archive.org/web/web.php), a project that aims to create up-to-date copies of all public web sites and archive them forever
- Improve this page Edit on Github Help and instructions
-
Donate
If you have found this useful and would like to support our work please consider making a small donation.