
Internet Archive, Code for Science and Society, and California Digital Library to Partner on a Data Sharing and Preservation Pilot Project

Research and cultural heritage institutions are facing increasing costs to provide long-term public access to historically valuable collections of scientific data, born-digital records, and other digital artifacts. With many institutions moving data to cloud services, data sharing and access costs have become more complex. As leading institutions in decentralization and data preservation, the Internet Archive (IA), Code for Science & Society (CSS) and California Digital Library (CDL) will work together on a proof-of-concept pilot project to demonstrate how decentralized technology could bolster existing institutional infrastructure and provide new tools for efficient data management and preservation. Using the Dat Protocol (developed by CSS), this project aims to test the feasibility of a decentralized network as a new option for organizations to archive and monitor their digital assets.

Dat is already being used by diverse communities, including researchers, developers, and data managers. California Digital Library is building innovative tools for data publication and digital preservation. The Internet Archive is leading efforts to advance the decentralized web community. This joint project will explore the issues that emerge from collecting institutions adopting decentralized technology for storage and preservation activities. The pilot will feature a defined corpus of open data from CDL’s data sharing service. The project aims to demonstrate how members of a cooperative, decentralized network can leverage shared services to ensure data preservation while reducing storage costs and increasing replication counts. By working with the Dat Protocol, the pilot will maximize openness, interoperability, and community input. Linking institutions via cooperative, distributed data sharing networks has the potential to achieve efficiencies of scale not possible through centralized or commercial services. The partners intend to openly share the outcomes of this proof-of-concept work to inform further community efforts to build on this potential.

Want to learn more? Representatives of this project will be at FORCE 2018, Joint Conference on Digital Libraries, Open Repositories, DLF Forum, and the Decentralized Web Summit.

More about CSS: Code for Science & Society is a nonprofit organization committed to building public interest technology and low-cost decentralized tools with the Dat Project to help people share and preserve versioned digital information. Read more about CSS’ Dat in the Lab project, our recent Community Call, and other activities. (Contact: Danielle Robinson)

More about CDL UC3: The University of California Curation Center (UC3) at the California Digital Library (CDL) provides innovative data curation and digital preservation services to the 10-campus University of California system and the wider scholarly and cultural heritage communities. https://uc3.cdlib.org/. (Contact: John Chodacki)

More about IA: The Internet Archive is a non-profit digital library with the mission to provide “universal access to all knowledge.” It works with hundreds of national and international partners providing web, data, and preservation services and maintains an online library comprising millions of freely-accessible books, films, audio, television broadcasts, software, and hundreds of billions of archived websites. https://archive.org/. (Contact: Jefferson Bailey)

Locking the Web Open, a Call for a Distributed Web

Presentation by Brewster Kahle, Internet Archive Digital Librarian, at the Ford Foundation NetGain gathering, a call from five top foundations to think big about prospects for our digital future. (More detailed version)


Hi, I’m Brewster Kahle, Founder of the Internet Archive. For 25 years we’ve been building this fabulous thing: the Web. I want to talk to you today about how we can Lock the Web Open.


One of my heroes, Larry Lessig, famously said that “Code is Law.” The way we code the Web will determine the way we live online. So we need to bake our values into our code.

Freedom of expression needs to be baked into our code. Privacy should be baked into our code. Universal access to all knowledge. But right now, those values are not embedded in the Web.


It turns out that the World Wide Web is very fragile. But it is huge. At the Internet Archive we collect 1 billion pages a week. We now know that Web pages only last about 100 days on average before they change or disappear. They blink on and off in their servers.


And the Web is massively accessible, unless you live in China. The Chinese government has blocked the Internet Archive, the New York Times, and other sites from its citizens. And other countries do the same every once in a while.


So the Web is not reliable. And the Web isn’t private. People, corporations, countries can spy on what you are reading. And they do. We now know that Wikileaks readers were targeted by the NSA and the UK’s equivalent. We, in the library world, know the value of reader privacy.


But the Web is fun. We got one of the three things right. So we need a Web that is Reliable and Private, but still Fun. I believe it is time to take that next step. And it’s within our reach.

Imagine “Distributed Web” sites that are as functional as WordPress blogs, Wikimedia sites, or even Facebook. But how?


Contrast the current Web to the internet, the network of pipes that the World Wide Web sits on top of. The internet was designed so that if any one piece goes out, it will still function. The internet is a truly distributed system. What we need is a Next Generation Web: a truly distributed Web.


Here’s a way of thinking about it: Take the Amazon Cloud. The Amazon Cloud works by distributing your data: moving it from computer to computer, shifting machines in case things go down, getting it closer to users, and replicating it as it is used more. That’s a great idea. What if we could make the Next Generation Web work that way, but across the entire internet, like an enormous Amazon Cloud?

In part, it would be based on Peer-to-peer technology—systems that aren’t dependent on a central host or the policies of one particular country. In peer-to-peer models, those who are using the distributed Web are also providing some of the bandwidth and storage to run it.

Instead of one web server per website we would have many. The more people or organizations that are involved in the distributed Web, the safer and faster it will become. The next generation Web also needs a distributed authentication system without centralized log-in and passwords. That’s where encryption comes in.
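To make the "many servers per website" idea concrete, here is a minimal sketch in Python. It assumes content addressing, where a page is requested by the SHA-256 hash of its bytes; that is one common way distributed systems let a reader accept data from any peer and still verify it, but it is an illustrative assumption, not something specified in this talk. The names `Peer`, `content_address`, and `fetch` are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


def content_address(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as the page's address (an assumption)."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """A hypothetical participant that stores and serves copies of pages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store a copy of the data and return its address."""
        addr = content_address(data)
        self.store[addr] = data
        return addr

    def get(self, addr: str) -> Optional[bytes]:
        """Return whatever this peer holds under the address, if anything."""
        return self.store.get(addr)


def fetch(addr: str, peers: List[Peer]) -> Optional[bytes]:
    """Ask each peer in turn; accept the first copy whose hash matches the address."""
    for peer in peers:
        data = peer.get(addr)
        if data is not None and content_address(data) == addr:
            return data
    return None  # no peer had an intact copy
```

In this sketch, adding more peers makes a page more likely to survive, and a reader does not have to trust any single peer, because a corrupted or altered copy fails the hash check and the next peer is tried.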


And it also needs to be Private, so no one knows what you are reading. The bits will be distributed across the Net, so no one can track you from a central portal.


And this time the Web should have a memory. We’d build in a form of versioning, so the Web is archived through time. The Web would no longer exist in a land of the perpetual present.
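One way to picture built-in versioning is an append-only history in which each version records the digest of the one before it. The Python sketch below is only an illustration under that assumption; the talk does not specify a mechanism, and the names `Version` and `VersionedPage` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    content: str
    parent: Optional[str]  # digest of the previous version, None for the first
    digest: str


def _digest(parent: Optional[str], content: str) -> str:
    """Hash the parent digest together with the content to chain versions."""
    return hashlib.sha256(((parent or "") + "\n" + content).encode("utf-8")).hexdigest()


class VersionedPage:
    """A page that keeps every published version instead of overwriting."""

    def __init__(self) -> None:
        self._versions: List[Version] = []

    def publish(self, content: str) -> str:
        """Append a new version and return its digest."""
        parent = self._versions[-1].digest if self._versions else None
        digest = _digest(parent, content)
        self._versions.append(Version(content, parent, digest))
        return digest

    def latest(self) -> str:
        """Return the most recent content (raises IndexError if nothing was published)."""
        return self._versions[-1].content

    def at(self, index: int) -> str:
        """Return the content as it was at a given point in the history."""
        return self._versions[index].content

    def history(self) -> List[str]:
        """Return the digests of all versions, oldest first."""
        return [v.digest for v in self._versions]

    def verify(self) -> bool:
        """Recompute the chain and confirm no stored version was altered."""
        parent: Optional[str] = None
        for v in self._versions:
            if v.parent != parent or v.digest != _digest(parent, v.content):
                return False
            parent = v.digest
        return True
```

Because each digest depends on everything before it, a reader holding only the latest digest can check that the whole history is intact, which is the property an archived Web would want.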

Plus it still needs to be Fun: malleable enough to spur the imaginations of millions of inventors. How do we know that it can work? There have been many advances since the birth of the Web in 1992.


We have computers that are 1000 times faster. We have JavaScript that allows us to run sophisticated code in the browser. So now readers of the distributed web could help build it. Public key encryption is now legal, so we can use it for authentication and privacy. And we have blockchain technology that enables the Bitcoin community to have a global database with no central point of control.


I’ve seen each of these pieces work independently, but never pulled together into a new Web. That is what I am challenging us to do.

Funders, leaders, and visionaries: this can be a Big Deal. And it’s not being done yet! By understanding where we are headed, we can pave the path.


Larry Lessig’s equation was Code = Law. We could bake the First Amendment into the code of a next generation Web.

We can lock the web open.
Making openness irrevocable.
We can build this.
We can do it together.


Delivered February 11, 2015 at the Ford Foundation-hosted gathering: NetGain, Working Together for a Stronger Digital Society