Archiving of .gov and .mil websites is under way now, with lots of help. But what if we could archive full government web services? This would mean keeping interactive sites, including their databases and forms, available for future use even if the original website changes or is removed.
We like this idea because we would preserve how websites worked, not just what they looked like. As websites become more database-driven and interactive, this would be an even bigger help than the already helpful Wayback Machine.
We believe this is possible now given the increased use of virtual machines and cloud services. Webmasters are adjusting to having their systems work in an isolated environment that can be snapshotted.
What we need are some webmasters who would like to try this. We think that government websites would be perfect because they tend to change as administrations change and the datasets are often public data.
If you run a website and would like to participate in this experiment or would like to help on the receiving end, please send a note to info@archive.org or reply to this post.
Archiving web services could usher in a completely new age in archiving of Internet resources.
I would enjoy helping out with this project. I might be a little rusty simply because I have not posted on my website in years. Nonetheless, if I can assist with this project, kindly let me know.
Best.
Yes, we’ve begun doing some of this already at ibiblio.org as I described it here in 2015 (and earlier in a funded IBM Faculty Research Award).
http://www.ibiblio.org/archive/2015/10/the-web-going-dark-preserving-and-serving-aging-websites/
Obviously, some sites with many outside links will be imperfectly preserved, but working from the server side with restricted VMs has many advantages.
Restricted VMs done right can also keep abandoned or frozen sites available for a long time.
Let’s talk and let’s work on this.
I agree with you. #Archive Government Web Services
This is a really interesting and vital area to explore, but I do wonder how it can be done without the ongoing collaboration of the people who create and manage the websites. In the case of government websites, there is likely more than one group involved in delivering and managing the web service as well as the data. Plus, there are many other things to consider about the capture and management of data from various perspectives. There might be some lessons here from existing research into data archiving and records management, for example on how to understand what a database record is. I also think that some of these web services and data might already be managed as records and evidence, so it would be very useful to know how the IA could connect to how and why this is being done in order to address gaps in the cultural record. This could be linked to research into what is meaningful about web ‘content’ for future use, although this is still an evolving area of inquiry. I would be interested in hearing more about the approach and how this work is being conceptualized from technical and conceptual points of view.