Weaving Books into the Web—Starting with Wikipedia

[announcement video, Wired]

The Internet Archive has transformed 130,000 references to books in Wikipedia into live links to 50,000 digitized Internet Archive books in several Wikipedia language editions including English, Greek, and Arabic. And we are just getting started. By working with Wikipedia communities and scanning more books, both users and robots will link many more book references directly to Internet Archive books. In these cases, diving deeper into a subject will be a single click away.

Moriel Schottlender, Senior Software Engineer, Wikimedia Foundation, speech announcing this program

“I want this,” said Brewster Kahle’s neighbor Carmen Steele, age 15. “At school I am allowed to start with Wikipedia, but I need to quote the original books. This allows me to do this even in the middle of the night.”

For example, the Wikipedia article on Martin Luther King, Jr. cites the book To Redeem the Soul of America, by Adam Fairclough. That citation now links directly to page 299 inside the digital version of the book provided by the Internet Archive. There are 66 cited and linked books on that article alone.

In the Martin Luther King, Jr. article of Wikipedia, page references can now take you directly to the book.

Readers can see a couple of pages to preview the book and, if they want to read further, they can borrow the digital copy using Controlled Digital Lending in a way that’s analogous to how they borrow physical books from their local library.

“What has been written in books over many centuries is critical to informing a generation of digital learners,” said Brewster Kahle, Digital Librarian of the Internet Archive. “We hope to connect readers with books by weaving books into the fabric of the web itself, starting with Wikipedia.”

You can help accelerate these efforts by sponsoring books or funding the effort. It costs the Internet Archive about $20 to digitize and preserve a physical book in order to bring it to Internet readers. The goal is to bring another 4 million important books online over the next several years. Please donate or contact us to help with this project.

From a presentation on October 23, 2019 by Moriel Schottlender, Tech lead at the Wikimedia Foundation.

“Together we can achieve Universal Access to All Knowledge,” said Mark Graham, Director of the Internet Archive’s Wayback Machine. “One linked book, paper, web page, news article, music file, video and image at a time.”


25 thoughts on “Weaving Books into the Web—Starting with Wikipedia”

  1. Pingback: 130,000 References to Books Cited in Wikipedia Articles Now Include Live Links to 50,000 Books Digitized by The Internet Archive (with More to Come) | LJ infoDOCKET

  2. Pingback: Fighting Misinformation Online | Internet Archive Blogs

    1. Brewster Kahle Post author

      I hope so. We are trying to build towards the Memex, Xanadu, the World Wide Web, the Global Brain.

      They have great ideas… let’s build them!

      -brewster

  3. Pingback: Fighting Misinformation Online - RSSFeeds

  4. Richard Mahony

    Well done Brewster Kahle, Mark Graham, Moriel Schottlender and friends! This is a great step forward.

    Wikipedia relies on secondary and tertiary sources, which all too often are misquoted or misattributed by Wikipedia editors. When I check a cited Wikipedia source, often the cited source fails to support the claim made in the Wikipedia article, or even contradicts it.

    Linking directly to the page in the cited book, therefore, is a great step forward. The next step is to provide the same option with direct links to the page and paragraph in cited papers in the pay-per-view peer-reviewed press and to the page in a newspaper that is behind a paywall.

    Sci-Hub at least gives access to many if not most peer reviewed articles in the academic press, but Sci-Hub doesn’t give access to an article in the Times of London, say, or in another Murdoch rag, secured behind a paywall.

    The hardest of all is accessing old documentaries, many of which have been destroyed in fire or flood.

    Sometimes, it seems that getting access to the simple facts, perhaps the only extant preserved copy being hidden away somewhere in a dusty box in somebody’s attic, is harder than accessing Donald Trump’s tax returns, or the full transcript of Trump’s telephone conversation with the President of Ukraine.

    Furthermore, I’ve discovered over the past fifty years that all too often even the most supposedly reliable books themselves misrepresent, misunderstand or misquote the primary sources, and even the secondary and tertiary sources.

    Wikipedia exhorts editors not to use primary sources, because of the undoubted issues that arise from citing primary sources. But nor should anybody ever rely on a secondary or a tertiary source, whether it’s a textbook compiled by the most highly regarded scholar, a politician’s memoir, or a reporter’s investigation.

    One must always try to check the primary record. Easier said than done, I know, especially if in a language one doesn’t understand.

    But if the path to knowledge is hard, then the path to the truth is harder still.

  5. Michael McCulley

    Well, books are great sources of facts and information, but take a long time to reach publication. Articles and research publications are faster to bring facts to light, and I’d think you and Wikipedia could do more to provide article and research citations, not just books. Just a thought…
    Best,
    DrWeb

  6. Erika Herzog

    Embedding a URL link is not efficient. There needs to be a link created like with JSTOR or OCLC, one that could reference a page number or not.

  7. Gwendolyn Oliver

    Well, I’m an English graduate so you clearly know what reading and literature and books mean to me. I stress it almost every day.

  8. Ed Summers

    This is an amazing initiative. I was wondering if there are any more details about how this linking is working. I noticed that the link to To Redeem the Soul of America in the MLK article was created by a person who works at the Internet Archive?

    https://en.wikipedia.org/w/index.php?title=Martin_Luther_King_Jr.&diff=prev&oldid=898271785

    Is there a collaboration happening with Wikipedia editors, perhaps through WikiCite? Are you trying to do any automated linking of citations, similar to what you did with broken links on Wikipedia?

  9. Charlotte

    Wikipedia is okay on some subjects, but if you look up alternative health articles, for instance, Big Pharma-paid Wikipedia ‘editors’ (censors) change the article! The same thing happens on some other subjects.

    This Wikipedia censorship problem is just one of many on the Internet today.

    1. Michael J. Lowrey

      Wikipedia articles by definition lean on actual legitimate peer-reviewed science. That does not equate to “the Big Pharma demons rule the Wikipedia”. Find some solid science behind your concepts, and put that into the articles.

  10. John G

    I applaud such initiatives, but all of these ideas to make knowledge easier to access shouldn’t blind one to the fact that the main things standing between most people and knowledge are laziness and short attention spans. (Of course, conspiracy mongering absorbs a lot of mental attention, as witnessed in this thread, but that is a separate issue.)

    To see the truth of this, go to an assortment of, say, five different comment sections on various blogs. Find 10 statements you know to be false. Copy them, and see how long it takes you to find reliable sources of information that disprove these false statements.

    I hate to be a buzz kill, but if there is one thing that I have realized since 1995, it is that no one is starved for stimulating and educational reading material unless they’re on a diet of their own choosing.

    And, no, these dieters are not all of one ideology. You can find them no matter what corner of the Overton window you’re pointing at.

  11. Daniel Munyola

    This is a good idea for upgrading libraries and making choices easier. It is very important to tackle issues like this that boost education, especially in universities and colleges, for more practical experience. This will engage many who like to study. The advantage of this collaboration is clear to me, since I myself have searched across different libraries to find more articles that helped with further research. This is an improved way of learning from any book.

  12. Nemo

    I’ve started adding a few links from the Italian Wikipedia. This is one of the most famous Italian singers of the 1960s…
    https://it.wikipedia.org/w/index.php?title=Luigi_Tenco&diff=prev&oldid=108818328

    (The quotation of his almost-last words before committing suicide, attributed to the book that the Internet Archive scanned and made available, was apparently wrong. I have now fixed at least the citation; someone with access to the second source will be needed to see where the quotation came from.)
