Experimenting with One Million Album Covers

Rising to the challenge to create an image search engine using a corpus of one million album covers, Professor Trenary of Western Michigan University led a class project that found many exact matches (same file) and many near matches.

Their algorithm also matched some covers that were not the same, because it used rough shape matching and many images showed only the CD or LP label, which matched one another.

While the work is not yet ready for production use at the Archive, they wrote a nice report on their findings that might be useful to others. The Internet Archive hopes to enable many more studies using the data in the collection.

Thank you to Brandon Arrendondo,  James Jenkins, Austin Jones, and Professor Trenary.

Posted in Audio Archive | Leave a comment

NEW at the Archive Store! MS-DOS “Game Not Over!” T-Shirt

Designed by Jason Scott, the new “Game Not Over!” T-shirt is a celebration of the over 2,000 MS-DOS games that are once again available to play on archive.org. The shirt is currently available in a number of sizes at the Internet Archive store.

All proceeds go to the Internet Archive. Go to store.archive.org to get yours.

Posted in Cool items, Emulation, Games, News, Software Archive | 2 Comments

Experiment with One Million Album Covers

As might be expected, the Internet Archive has lots of data in its virtual stacks. Besides the books, movies and stored webpages, there are datasets provided from the Internet at large or from individual contributors.

But datasets are just big clumps of data unless someone does something with them. Obviously we’re keeping these around no matter what (our current goal is “forever”), but without folks tinkering, experimenting and using the data sets, they’re just piles clogging up hard drives.

So, in the name of experimentation, we’ve put together one million album cover images from a variety of sources, and put them into this item. The total size is 148 gigabytes (!) of .JPG, .GIF and .PNG images. (There is a torrent on the item, allowing you a more flexible way to download that amount of imagery.)

The albums are somewhat arbitrarily split according to filename, with .TAR (tape archive) files for the letter a, b, c, etc. The goal here is experimentation – these have not been curated or heavily quality-checked, and differently sized duplicates have not been removed. If you’re writing programs or doing analysis, these are the sorts of oddness or strangeness you should be aware of.
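As one possible starting point, here is a minimal sketch in Python of the simplest analysis: finding exact duplicates (byte-for-byte identical files) inside one .TAR file by hashing each image. The function names and the tiny in-memory demo archive are our own illustrations, not part of the item, and the sketch assumes nothing about how the real files are named.

```python
import hashlib
import io
import tarfile
from collections import defaultdict


def find_exact_duplicates(tar_file):
    """Group member names in an open tar archive by the SHA-256 of their bytes."""
    by_digest = defaultdict(list)
    for member in tar_file:
        if not member.isfile():
            continue  # skip directories, links, and other non-file entries
        data = tar_file.extractfile(member).read()
        by_digest[hashlib.sha256(data).hexdigest()].append(member.name)
    # Keep only the digests shared by two or more files.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def build_demo_tar(files):
    """Build a small in-memory tar so the sketch runs without the real dataset."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Hypothetical file names; two of the three "covers" have identical bytes.
    demo = build_demo_tar({
        "a/abba.jpg": b"cover-one",
        "a/abba_copy.jpg": b"cover-one",
        "a/acdc.png": b"cover-two",
    })
    with tarfile.open(fileobj=demo, mode="r") as tar:
        print(find_exact_duplicates(tar))
```

Hashing only catches identical files. Differently sized copies of the same cover have different bytes, so finding those would need some form of image comparison instead.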

(If you just want to play around a bit, there’s a link to a set of a mere 1200 album covers, for a total of 200 megabytes.)

We’ve included some suggestions for using the data, and some projects that might be interesting to get into, either as a hacking project or just because you’re learning computer science.

Let us know how it works for you!

 

Posted in Announcements, News | 1 Comment

The first Netlabel Day – Join the event

The Internet Archive has a large (over 58,000 items) and growing collection of netlabels. Recently we received a message asking to help announce a new global event, Netlabel Day. Please support it if you are part of the netlabels world.

Record Store Day was created in 2007 to celebrate record stores in the USA and the rest of the world. In that celebration, independent bands and labels release music exclusively for that day on vinyl, seizing on the revival of that format. This was the basis of Netlabel Day, a sort of distant relative of RSD, which aims to establish a new tradition of releasing digital music every 14 July from now on.

This initiative was born in Chile thanks to Manuel Silva, from M.I.S.T. Records, and it brings together more than 50 labels from all over the world. All genres are present: Rock, pop, electronic, noise, ambient and many, many more, free and just for you.

We will upload every single release to Archive.org, because we love this platform. We always use it and we’ve never experienced any issues with it. Every album will be available for free in WAV and FLAC via direct download, or by torrent as well.

The most important thing is to include everyone in this idea. We will close the call on June 1, so if you have a netlabel and you want to be part of this, please email us at contact.netlabelday@gmail.com. If you are an independent artist without any label associated, you can release your music with us too and be heard by every participating netlabel, so just contact us from May 15 to June 1.

Everyone is invited. Be part of this madness!

Links:
http://netlabelday.blogspot.com
http://www.facebook.com/netlabelday
http://www.twitter.com/netlabel_day

Posted in Audio Archive, Event, News | 1 Comment

Thank you, Robert Miller, for 2.5 million Books for Free Public Access

I am both sad and happy that Robert Miller has accepted another position and will be leaving the Internet Archive after 10 years of fantastic achievements. He joined to help create a mass movement of libraries making themselves digital by scanning books, microfilm, and other media. He has succeeded in doing this by creating positive relationships and distributed teams, working in 30 libraries in 8 countries, to help libraries go digital.

And thank you to Robert, for building organizational and partnership structures that will continue to bring more collections online, long into the future. His endless energy and ability to forge long-term relationships to create processes that are both efficient and library-careful have been miraculous to behold. The future looks bright and brighter because of his work.

Working with 1000 contributing libraries, the Internet Archive has digitized and offered free public access to over 2.5 million literary works. We are now on our way to the goal of 10 million books, being served by our sites and the sites of thousands of libraries.

With thousands of libraries serving digital materials in new and different ways to their different communities, we can achieve the diverse but coordinated access and preservation opportunity of our digital age. We look forward to the next steps in the programs that have been started with gusto and relish.

Thank you, Robert. We expect more great things in coming years.

-brewster
Founder, Digital Librarian

Posted in Books Archive, News | 5 Comments

Making Your DOS Programs Live Again at the Internet Archive

Since the beginning of the year, the Internet Archive has been making a large number of DOS-based games and programs run in the browser, much like our Console Living Room and Internet Arcade collections. Many thousands of people have stopped by and tried out these programs, enjoying such classics as Llamatron 2112 or Dangerous Dave. With countless examples of DOS programs spanning 30 years, there’s lots of great software to try out and experiment with. Here’s a great place to start.

If you want to just try out the software, we’re done here. Go into our stacks and have a great time!

However, some people have asked about adding DOS software that they created or own and that isn’t part of our collections, and especially how to make these programs boot in a window like our currently available programs do.

This is a quick guide to getting your DOS programs up and emulating in the browser. If any of these instructions are unclear to you, please contact the Software Curator at jscott@archive.org.

Please note: these instructions are for DOS programs, not Windows programs.

First, you should register for your Internet Archive library card if you haven’t already.

Next, you should upload your DOS software as a .ZIP file. It is important that your program and any support files be inside a single .ZIP file and not uploaded separately.

When you upload, you’ll be asked to fill out all sorts of information about your program. Be sure to be as complete as possible, including the description, date of creation, who the author or authors were, and so on. You’re the curator of this software – help the world understand why they should look at it!

Set the “Collection” to Community Software.

Then, at the bottom of this upload screen, there is an “add additional metadata” option.

Add these two metadata pairs:

  • Set “emulator” to “dosbox”.
  • Set “emulator_ext” to “zip”.

Finally, and this is very important … inside the .ZIP file you uploaded is the program that starts the program running. It might be an .EXE, .BAT or .COM file.  For example, if your ZIP file has a single file in it, called LEMON.EXE, then that’s the program that “starts” your program.

  • Set “emulator_start” to this program.
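Before filling in the form, it can help to confirm that the start program really is inside the .ZIP file. The following is a minimal sketch in Python under our own assumptions: `build_dos_metadata` is a hypothetical helper name, not an Archive tool, and the only facts it takes from this guide are the three metadata pairs. It does not upload anything.

```python
import io
import zipfile


def build_dos_metadata(zip_source, start_program):
    """Return the three emulation metadata pairs after checking the ZIP."""
    with zipfile.ZipFile(zip_source) as archive:
        names = archive.namelist()
    # DOS file names are case-insensitive, so compare in upper case.
    if start_program.upper() not in {name.upper() for name in names}:
        raise ValueError(f"{start_program} is not inside the ZIP file")
    return {
        "emulator": "dosbox",
        "emulator_ext": "zip",
        "emulator_start": start_program,
    }


if __name__ == "__main__":
    # Build a stand-in ZIP in memory holding a single file, LEMON.EXE.
    demo_zip = io.BytesIO()
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("LEMON.EXE", b"placeholder bytes, not a real program")
    demo_zip.seek(0)
    for key, value in build_dos_metadata(demo_zip, "LEMON.EXE").items():
        print(f"{key} = {value}")
```

To check a real upload, pass the path of your .ZIP file in place of the in-memory stand-in.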

After double-checking your work, click on “Upload and Create your Item” and the system will upload your program to the Archive, and if all goes well, your program will be emulated in our pages after a few minutes.

Again, if you have any questions or experience any issues, contact Jason Scott, the software curator at the Archive, at jscott@archive.org.

Let’s bring the DOS prompt back! And let a thousand programs bloom!

 

Posted in Software Archive | Comments Off on Making Your DOS Programs Live Again at the Internet Archive

Help Free PACER–Cast your Vote for Free Court Records at the Internet Archive this Friday!

Carl Malamud, Internet activist and founder of Public.Resource.org, is launching a national campaign to free millions of court documents in PACER (Public Access to Court Electronic Records), the technologically backwards federal electronic system that charges Americans 10 cents per page to access court files in the public domain. This Friday, you can come by the Internet Archive “polling place” at 300 Funston Avenue, San Francisco from 8 a.m. to 5 p.m. to “cast your vote” for free court records. Carl will be on hand with inspiring postcards addressed to Chief Judge Thomas of the Ninth Circuit Court of Appeals. By sending His Honor hundreds of handwritten postcards asking him to grant a PACER fee exemption, we can save taxpayers millions of dollars while freeing court documents crucial to understanding and interpreting the law.

This is just one prong in a multi-faceted campaign to free PACER.  Carl outlines Friday’s strategy in a memorandum of law called, “Yo, Your Honor.”  His request of us:

May 1 is Law Day, and I’m asking people to come in and write a brief postcard about why you think that access to PACER is important. More specifically, you’ll be writing a postcard to Chief Judge Thomas of the Ninth Circuit of the U.S. Court of Appeals in support of my request that the Court grant us free access to PACER for several courts in the Ninth Circuit. It would be a really big deal if the Court said yes, we’re trying to show public support in a way the judges can relate to.

You can also send your postcard directly if you can’t make it to the Internet Archive on Friday:

Clerk of the Court
Attn: Docket 15-80056
United States Courts of Appeals
James Browning Courthouse
95 7th Street
San Francisco, CA 94103

 

In 2008, Aaron Swartz downloaded millions of PACER documents, and worked with Malamud to make them accessible for free on the Internet Archive through the RECAP Project.  This is just one more step toward providing everyone with free access to all knowledge–the great promise of the Internet and our mission at the Internet Archive.

 

 

Posted in Announcements, News | 7 Comments

The Evolving Internet Archive

The new archive.org site

The new version of the archive.org site has been evolving over the past 6 months in response to the feedback we’ve received from thousands of our awesome users.

If you haven’t been following along, you can review a little bit of the journey through these blog posts:

Why change the site at all?  The posts above help answer that, but in brief:

  • 35% of our ~3 million daily users are on mobile/tablet devices, and the classic site is not easy to use on small formats.
  • The new tools we want to offer our users would be difficult to implement in the old site architecture.
  • The classic site was built a long time ago, using methods that are outdated.  Finding programmers who have the skills to work in that environment is becoming increasingly difficult, and the ramp up time for new employees is painful.  The redesign has given us an opportunity to start pulling the front end (what you see) apart from the back end, so they can evolve separately.

Blue represents people in classic archive.org (v1), red represents people in the new version (v2)

Currently about 85% of archive.org users are in the new version. Over the next few weeks we will be asking the remaining 15% to try it out. For the time being, users will be able to exit the new archive.org and return to the “classic” version, but the classic will not always be available or supported, so please give the new version a try and give us feedback if there are things on the site that you don’t like, can’t find, or that seem like bugs. (When you click “exit” you will have an opportunity to give us feedback.)

We have made several video tours that introduce you to the new site. I recommend starting with the site tour, below.

The original download button

In the past few months we have received more than 16,000 feedback emails from people using the new version.  The redesign team reads every single one of them.  Some just say, “I love it!” and some immediately say, “I hate it!”  But a great many of you have also taken the time to share a little more – something you missed from the old site, a question about the new tools, concern about accessibility, suggestions for how to adjust things, etc.

Download menu open by default

We took that input — along with information from user tests, interviews with some of our power users, chats with partners — and tried to identify areas of the interface that seemed to be working well, and other areas that were not.

The evolution of downloading files from items is a great example of the process we’ve been following.  The original design for item pages de-emphasized download as a feature. Our conversations with users told us that most people wanted to hit a play button, not download a file.

You could still download in the original design, of course, but you had to click a button to get options and then click again if you wanted specific files.

But when we opened the new site up to more users, we got many comments from people who either disliked the extra clicking, didn’t like leaving the page to get individual files, didn’t understand what the options represented, or couldn’t find the download options at all.

The first thing we tried was just opening up the download menu by default.  Instead of just seeing the black download button on the page, you now also saw a menu of options.  More people saw the download, but feedback made it clear that users still had issues.

What if we make it blue? (Nope!)

We thought that people might see the download options faster if we increased their visibility by turning the Download header blue. We did an A/B test with 50% of users seeing each option, and neither option really won. And the feedback about this feature continued to be negative.

It became clear that we needed to rethink the design of the download options altogether, trying to keep it clean-looking and easy to use while also satisfying the concerns of our most advanced users.

We set some goals for the download changes based on the feedback we had received:

  • must be able to download an individual file without leaving the item page
  • if there is only one file in a particular format, you should only need one click to download it
  • improve the ability to download groups of files (e.g. “just give me all the FLAC files”)
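The second and third goals both come down to grouping an item’s files by format. Here is a rough sketch in Python of that idea. It is an illustration only, not the site’s actual code: the function names are hypothetical, and it assumes the format can be read from the file extension, which may not be how archive.org classifies files.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_format(file_names):
    """Map a lower-cased extension (without the dot) to the files that use it."""
    groups = defaultdict(list)
    for name in file_names:
        extension = PurePosixPath(name).suffix.lower().lstrip(".")
        groups[extension or "(none)"].append(name)
    return dict(groups)


def one_click_formats(groups):
    """Formats with exactly one file, which could be offered as a direct link."""
    return sorted(fmt for fmt, names in groups.items() if len(names) == 1)


if __name__ == "__main__":
    # Hypothetical file list for a single item.
    files = ["01-intro.flac", "02-song.flac", "01-intro.mp3", "cover.JPG"]
    groups = group_by_format(files)
    print("All FLAC files:", groups.get("flac", []))
    print("Single-file formats:", one_click_formats(groups))
```

Formats with several files would be offered as a group, while single-file formats can link straight to the file.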

The current version of downloads allows you to consume individual media files without leaving the page and gives you a lot more options for downloading groups of files from an item.  Since we released the new Download Options feature, the negative feedback about this feature has dropped off almost entirely.  So we think we’re on the right track!  We have created a short video tour for the downloads feature if you want to learn more.

New Download Options feature, illustrating how to display individual files

The download changes are just one example of how much your feedback has helped us identify areas of confusion on the site and understand how to improve things.  Here are a few more examples:

  • A-Z filters available when sorting by title or creator
  • better experience for people with JavaScript disabled
  • fixes to improve software emulation
  • default search results to List view (instead of image-based Thumbnail view)
  • pull user page images from Gravatar if available (if user has not uploaded one)

We have a lot more in store for the new site – better accessibility for people with visual impairments, tools for creating your own collections, improved playback for multimedia items, etc. As these features trickle into the site, we hope you will continue to share your questions and ideas with us – you are truly helping us to make the archive a better place for everyone.

This project receives support from the John S. and James L. Knight Foundation’s Knight News Challenge.

Posted in Announcements, Archive Version 2, News | 31 Comments

Two Grants Announced Supporting Web Archiving

We are excited to announce Internet Archive’s participation in two new grant-funded collaborative projects to advance the field of web archiving! Our Archive-It service, which works with libraries, archives, museums and others to provide the tools for institutions to create their own web archives, will partner with New York University and Old Dominion University on two separate areas of work. We thank both The Andrew W. Mellon Foundation and the Institute of Museum and Library Services (IMLS) for their recognition of the value of web archiving and their support for the continued development of tools and initiatives to expand the quality, accessibility, and extensibility of these collections. We also thank our awesome collaborative partners on these projects, New York University Libraries, NYU’s Moving Image Archiving and Preservation (MIAP) program, and Old Dominion University’s Web Science and Digital Libraries Research Group and look forward to working with them as part of our broader initiative for “Building Libraries Together.”

For the project “Archiving the Websites of Contemporary Composers,” led by NYU Libraries and funded with a grant of $480,000 from The Andrew W. Mellon Foundation, we will work with the Libraries and MIAP.  This project will archive web-based and born-digital audiovisual materials, and research and develop tools for their improved capture and discoverability. Contemporary musical works, as well as the rich secondary materials that accompany them, are increasingly migrating to the web. We outlined a number of current challenges to capturing and replaying online multimedia, such as dynamic and transient URL generation and adaptive bitrate streaming, as well as a need for continued research and development around the integration of web archives and non-web collections.

We have two specific pieces of work in the grant. First, we will build tools to improve the crawling and capture of web-based audiovisual materials, addressing the increasing complexity of streaming audiovisual materials, especially on third-party hosting and sharing platforms. This development work will build on our experience creating “Heritrix helper” tools like Umbra. Our second area of work will explore methods to integrate discovery of high-quality, non-web multimedia content held in external repositories into the Archive-It platform. Linking Archive-It collections with non-web institutional content has great potential to integrate web and non-web archives. This work will build on NYU’s creation of an API for their preservation repository, our increased use of API-based systems integration in Archive-It 5.0, and our continued work on improved content discovery for web collections. See NYU’s press release for more details.

The second recently announced grant project is being led by Old Dominion University’s Web Science and Digital Libraries Research Group, which received a $468,618 National Leadership Grant for Libraries from IMLS for the project, “Combining Social Media Storytelling With Web Archives” (grant number LG-71-15-0077). Readers not familiar with ODU’s great history of research and development around web archives are encouraged to check out projects such as WARCreate/WAIL, their work on visualizations and Archive-It, and our recent favorite, the #whatdiditlooklike tool. In this project ODU will be building tools and processes to assimilate user-focused, online storytelling methods, such as Storify, to 1) summarize existing collections and 2) bootstrap new or expand existing web archive collections. The project will provide new ways to create unique topical and thematic collections through URLs shared via social media and storytelling platforms.

We will be working with them to integrate these tools in Archive-It, conduct user testing and training, and explore other ways that storytelling and user-generated materials can help build narrative pathways into large, often diffuse, collections of web content. We are excited to work with ODU and continue our increased focus on new models of access for web archives, as many institutional web collections are now of a breadth, volume, and operational maturity to begin focusing on novel ways their web archives can be studied and better understood by users and researchers.

Thanks again to the Mellon Foundation and IMLS for supporting these cooperative efforts to advance web archiving. We are excited to work with our great partners and the broader community to keep preserving and expanding access to the rich historical and cultural record documented on the web.

Posted in Announcements, Archive-It | 3 Comments

Will We Let Congress Vote to Fast-Track Secret Trade Deals?

Yesterday, Republican Orrin Hatch and Democrat Ron Wyden introduced legislation in the US Senate that would enable Congress to fast-track approval of secret trade agreements. The timing is important because the President is currently pushing for the approval of the Trans-Pacific Partnership, an agreement negotiated in secret meetings with international lawmakers that has serious ramifications for a host of important issues, Internet privacy and intellectual property among them.

We are worried that Congress’ and the public’s ability to review, discuss, and debate proposed agreements would be significantly limited by this bill. It would also force Congress to have a strict yes/no vote on the presented agreement, with no ability to make amendments beforehand.

The impacts of these agreements and the international rules that they impose upon citizens and Internet users across the globe are too sweeping to be coordinated behind closed doors and then presented in a short window for a straight up and down vote.

There is still time for concerned individuals and organizations to resist this push, as we did with SOPA, PIPA, and the threat to net neutrality.

For more information and organized ways to take action, see the Electronic Frontier Foundation’s write-up and the Internet Vote campaign.

Posted in Announcements, News | 3 Comments

Internet Archive and CADAL Partner to Digitize 500,000 Academic Texts

The Internet Archive and the China Academic Digital Associative Library (CADAL) are pleased to announce that 500,000 English-language academic books will be digitized through a partnership that leverages strengths from both organizations. This furthers an initiative begun in 2009, The China-US Million Book Digital Library Project, seeking to bring one million texts into the public domain.

“We are working together with a valuable global partner, CADAL, to create a digital library of high quality, academic, eBooks for use in China, North America and the world at large; I couldn’t be happier!” Robert Miller, General Manager of Digital Libraries for the Internet Archive, remarked on the collaboration.

The China Academic Digital Associative Library (CADAL) is a consortium of over 70 Chinese university libraries. CADAL will provide access to a leading set of libraries, the technical resources to display and share the books inside China, and the staff needed for digitization. The Internet Archive will select the books and provide equipment and processing resources. Both organizations will offer access and discovery tools for both scholars and citizen-scholars. Together, CADAL and the Internet Archive are contributing to a growing, global digital library.

Chen Huang, Digital Librarian and Deputy Director of Administrator Center for CADAL, shared the vision for the project: “We are pleased to be working with the Internet Archive. Together, we have developed a program that will allow Chinese university students to have access to materials that will enhance both specific knowledge, and exposure to broad trends and ideas.”

This phase of the partnership will last about 3 years and involve teams in the US, Shenzhen, China and Zhejiang University in Hangzhou, China.

The Internet Archive is a non-profit library with over 6 million texts online and a popular global website, with 34 million downloads a month. Their mission is “Universal Access to All Knowledge.” https://archive.org/

Contact Robert@Archive.org for more information.

The China Academic Digital Associative Library (CADAL) is a long term project of the Ministry of Education of China. The consortium aims to construct an academic digital library with high-level technology and abundant digital resources that are multidisciplinary, multilingual, and categorically diverse. http://www.cadal.cn/

Contact service@cadal.cn for more information.

Posted in Books Archive, News | 5 Comments

Sharing Data for Better Discovery and Access

The Internet Archive and the Digital Public Library of America (DPLA) are pleased to announce a joint collaborative program to enhance sharing of collections from the Internet Archive in the DPLA.

The Internet Archive will work with interested libraries and content providers to help ensure their metadata meets DPLA’s standards and requirements. After their content is digitized, the metadata would then be ready for ingestion into the DPLA if the content provider has a current DPLA provider agreement.

The DPLA is excited to collaborate with the Internet Archive in this effort to improve metadata quality overall, by making it more consistent with DPLA requirements, including consistent rights statements. Better data means better access. In addition to providing DPLA-compliant metadata services, the Internet Archive also offers a spectrum of digital collection services, such as digitization, storage and preservation. Libraries, archives and museums that choose Internet Archive as their service provider have the added benefit of having their content made globally available through Internet Archive’s award-winning portals, OpenLibrary.org and Archive.org.

“We are thrilled to be working with the DPLA,” states Robert Miller, Internet Archive General Manager of Digital Libraries. “With their emphasis on providing not only a portal and a platform, but also their advocacy for public access of content, they are a perfect partner for us.”

Rachel Frick, DPLA Business Development Director, says, “The Internet Archive’s mission of ‘Universal Access to All Knowledge’, coupled with their end-to-end digital library solutions complements our core values.”

Program details are available upon request. Please contact:
Rachel Frick – DPLA Business Development Director, Rachel@dp.la
Robert Miller – General Manager of Digital Libraries, robert@archive.org

Posted in Announcements, Books Archive, News, Open Library | 1 Comment

You are invited to a Party for GETDecentralized–Wednesday April 1 at the Internet Archive

 

 

Help Us Lift the Fog on Decentralization!

The GETDecentralized community wants to do something fundamental: “To transform bureaucratic hierarchies into technology-driven networks” (Fred Wilson).

The Internet Archive and Jolocom invite you to GETDecentralized! An evening of conversation, celebration and community-building around new ideas in decentralization.

GETD Party will be Wednesday, April 1st at the Internet Archive in San Francisco!

Location: The Internet Archive, 300 Funston Avenue, San Francisco, CA 94118

Schedule:
6:00 — 7:00 pm, Reception
7:00 — 7:30 pm, Speakers (including Brewster Kahle and Markus Sabadello)
7:30 — 8:30 pm, Reception and Tours of the Internet Archive

Markus Sabadello, a long-time decentralization activist and hacker, will take us on a tour of the new technologies of decentralization. Learn what “decentralization” means and how we can all benefit from it. Markus runs his own open-source effort “Project Danube,” which is based on XRI/XDI technology and experiments with user-centric identity, personal data storage and Vendor Relationship Management.

Also, Brewster Kahle, Founder & Digital Librarian, Internet Archive, will share his ideas about “Locking the Web Open” through decentralized technologies. He’ll lead a tour of this digital universal library — 20 Petabytes of our culture’s books, films, music, software and Web pages. Hope to see you next Wednesday!

RSVP Today!

Posted in News | Comments Off on You are invited to a Party for GETDecentralized–Wednesday April 1 at the Internet Archive

Political Ads Win Over News 45 to 1 in Philly TV News 2014

[press: Columbia Journalism Review, USA Today, BloombergPolitics, Washington Post]

Study finds 842 minutes of political ads compared to 18.7 minutes of political news stories in a large sample of Philadelphia TV news programs archived by the Internet Archive in a joint project.

In the closing eight weeks of the 2014 campaign, political candidates and outside groups bombarded viewers of Philadelphia’s major TV stations with nearly 12,000 ads designed to sway voters in the Nov. 4 elections. But the stations that benefited from political advertisers’ $14 million spending spree also appear to have devoted little time to political journalism. A study of a representative sampling of newscasts on those stations put the ratio of time devoted to political advertising to time spent on substantive political news stories at 45:1.

Political Ads & Local TV News – Philly 2014, by Danilo Yanich

These are the findings of a University of Delaware team led by Associate Professor Danilo Yanich. The university’s Center for Community Research and Service researchers collaborated with the Internet Archive, The Sunlight Foundation, and the Committee of Seventy – the 100+ year-old Philadelphia-based political watchdog organization.

Our joint pilot project, Philly Political Media Watch, worked to open a library of all television news from stations based in and around Philadelphia and index the political ads presented in their newscasts. The ads were joined with information on who paid how much for them. The Sunlight Foundation was able to unearth those financial data, which were buried in PDF disclosures that every TV station is required to submit to the Federal Communications Commission. The experimental project was supported by individual contributors and grants from the Democracy Fund and the Rita Allen Foundation.

The Philadelphia television market was chosen as a 2014 laboratory to examine the interaction between news media and political money, and to learn lessons that could be taken to scale across the nation in 2016. The Philadelphia region is the nation’s 4th largest TV market, 19% African American, and includes parts of three states. In 2014, important contests in the region included races for: Pennsylvania governor, a Delaware U.S. Senate seat, two open congressional seats in New Jersey and an open state Senate seat in suburban Philadelphia.

The six major Philadelphia metro TV stations carried 8,003 political ads in their news broadcasts between September 8 and Election Day. As Yanich’s report notes, political strategists have long acknowledged that they try to place ads during or near news programming because it attracts the highest proportion of likely voters.

Here is a sample program from the Delaware study. This 60-minute program from WCAU, an NBC affiliate, aired at 5:00pm the day before the elections. It offered two substantive political stories: one about election day poll hours, and the other about the leading candidates for governor commenting on their attack ads. Good setup. Questions to the incumbent elicit an unequivocal assessment of the opponent’s assertions, followed by the other candidate being asked if his ads are negative. Seemingly timely and germane. Quiz: Can you find WCAU’s mistake followed sometime later by an unacknowledged correction?

Although WCAU clearly addressed important election issues, that same 60-minute program was also stuffed with 24 political ads. Here is one, below. Quiz: Can you spot the word “EBOLA”? And for extra credit: which is more toxic to our Republic, this kind of ad or the disease?

Although local TV station marketing directors are more than happy to accommodate the needs of political ad buyers, the local news directors appear to take a less supportive view of their audience’s interest in politics. Yanich and his research team looked at a representative sample of the news programs (390 of 1,256) and found politics taking a back seat to other types of stories in terms of both time and placement in the broadcast. The Delaware researchers found that many of the political stories aired were blandly informational, describing candidate schedules or appearances. Isolating political stories that focused on substantive political issues, Yanich’s team found that during the broadcasts they analyzed, there were 18.7 minutes of those stories, compared to 842 minutes of political ads, a ratio of 45:1.
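For readers who want to check the arithmetic, here is a minimal sketch that recomputes the ratio and the sample share from the figures quoted above. The variable and function names are ours, chosen for illustration; they do not come from the Delaware report.

```python
# Figures quoted in the paragraph above (minutes within the sampled broadcasts).
AD_MINUTES = 842.0
SUBSTANTIVE_NEWS_MINUTES = 18.7

# Sample size quoted above: 390 of 1,256 news programs.
SAMPLED_PROGRAMS = 390
TOTAL_PROGRAMS = 1256


def ratio_to_one(numerator: float, denominator: float) -> float:
    """Return how many units of `numerator` there are per one unit of `denominator`."""
    return numerator / denominator


ad_to_news = ratio_to_one(AD_MINUTES, SUBSTANTIVE_NEWS_MINUTES)
sample_share = SAMPLED_PROGRAMS / TOTAL_PROGRAMS

print(f"ads : substantive news = {ad_to_news:.1f} : 1")
print(f"share of programs sampled = {sample_share:.1%}")
```

Rounded to the nearest whole number, the first figure matches the 45:1 ratio reported by the study; the sample covers a little under a third of the programs.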

Next Steps

With so much heat, where will citizens find the light they need to navigate through this onslaught of political messaging?

The Internet Archive has begun to welcome new collaborators to join us in tackling the challenge of creating timely information resources for the 2016 U.S. election cycle. Data individuals and civic organizations can trust when considering how to participate in some of their community’s most important decision making. Reliable information they can use to hold television stations accountable for the choices they make in balancing obligations to serve the information needs of their communities and the allure of one of their biggest revenue sources: political advertising.

How might we better inform voters and increase civic participation before, during and after elections?

Posted in Announcements, News | 2 Comments

Open Source Housing for Good

This is from a talk given by Brewster Kahle, Founder and Digital Librarian of the Internet Archive, at a Commonwealth Club panel titled Open Source Housing for Good on March 9th. [covered by KQED public radio]

Foundation Housing

Our employees are being driven from their homes by rising rents; they are commuting great distances because of the lack of affordable housing; they are living in insecurity because of the fluctuation in rent and home prices.

Internet Archive – Non-Profit Library

I believe it is becoming harder to attract and keep good people working in nonprofits, including the Internet Archive, because of this problem.

Our employees spend an average of 30-60% of their income on housing. 30-60%.

That is a lot more than the “spend less than 25% on housing” that HUD recommends. Turns out that this is not just our employees, and not just the Bay Area. According to a Harvard study, the average American renter pays 30-60% of their income on housing. Similarly, homeowners pay about the same, except for those lucky few that own their houses outright.

The Bay Area is particularly problematic because rents and house prices have been rapidly rising, which is causing dislocations or people feeling locked into apartments and jobs. Nonprofits are particularly hit because their funding does not rise and fall as fast as the market fluctuations. Further, when the market is down, it is exactly the time you want non-profit services to be strong.

So the Internet Archive, and I would say other nonprofits as well, have an existential problem: affordable and stable employee housing.

The Internet Archive and the Kahle/Austin Foundation are trying a new model to help. Foundation Housing is our name for a new housing class: permanently affordable housing for non-profit workers.

In this model, a new nonprofit, the Kahle/Austin Foundation House, has been set up to purchase apartment buildings. These rental units are then made available to employees of select nonprofits at a “debt free” rate, basically equivalent to the condominium fee and taxes. Typically, the debt makes up about 2/3 of the cost of a building and the other costs (tax+maintenance+insurance) make up about 1/3. Since the employee does not pay the debt part, the monthly fee is now about $850-1000/month rather than the $2700-3000 current market rent. This way, the fee to those employees is about 1/3 of the cost of market rent, and we believe more stable than market-based rents.
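The arithmetic behind the “debt free” rate can be sketched in a few lines. This is only an illustration under the rough split described above (about 2/3 debt, about 1/3 other costs); the function name and the exact one-third default are our assumptions, not a formula used by the Foundation House.

```python
def debt_free_fee(market_rent: float, non_debt_share: float = 1 / 3) -> float:
    """Estimate the monthly fee when the tenant pays only the non-debt costs.

    `non_debt_share` is the assumed fraction of the rent that covers
    tax + maintenance + insurance (about 1/3 in the talk).
    """
    return market_rent * non_debt_share


# Market rents quoted in the talk.
for market_rent in (2700, 3000):
    fee = debt_free_fee(market_rent)
    print(f"market rent ${market_rent}/month -> debt-free fee about ${fee:.0f}/month")
```

An exact one-third share gives roughly $900-1000 a month, in line with the $850-1000 range quoted above; the slightly lower end of the quoted range suggests the non-debt share is a little under one third for some units.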

Walking Distance To Work

Foundation Housing Residents

Currently, this is being tried with an 11 unit apartment building in San Francisco 6 blocks from the Internet Archive. As apartments have become available through normal attrition (we do not force the existing tenants out), the Foundation House has made units available to 2 nonprofits, and there are now 3 employees living there. Having a walking commute, lower housing cost, and a nice neighborhood has been well received.

Roxanna used to commute over an hour each way from Bay View on 3 buses, and was raising her 8-year-old daughter in a building that had drug dealers actively dealing. Now she walks 6 blocks to work, pays less, and feels safer.

Michelle is a librarian who was being evicted from her apartment; she would have left San Francisco and would probably not now be working at the Internet Archive.

And Samantha, who worried that her rent was continuously on the rise and thought she might have to leave the city in a few years, likes that the building feels more like a community and less like being an anonymous number in an apartment building.

Having housing provided as part of an employee benefit is similar to faculty housing, military, monasteries, and some hospital housing. But having to leave your apartment upon leaving your job is a negative aspect of this model. We have not seen the effect of this because no one has left yet.

So we think we have a model… but how do we make it permanent, and how do we finance it? To help make it permanent, we are borrowing ideas from the free and open source world and Creative Commons licenses. “Some Rights Reserved” rather than “All Rights Reserved”. “Share and Share Alike” rather than “Get Off My Property”. With free-and-open-source software, the writer is giving up some of the profit potential in return for increased community participation. In the Foundation House, the supporters are giving up the ability to flip the building for a profit in return for making a permanent asset for the public good.

To finance the creation of these, we have thought of 4 ways, and are trying 3 of them already:

We built a credit union with this idea in mind, called the Internet Credit Union. It has plenty of deposits to start creating Foundation Housing, but alas, the credit union regulators (indirectly controlled by the banks) are not allowing us to make mortgages. This is a sad state of affairs for our nation’s new credit unions, but is not the subject of this talk.

We have tried the “endowment” approach with the current Foundation House, where we appealed to major donors for an endowment in the form of a building. The attraction is that it is much like an endowment, but instead of having money in a Goldman Sachs account, where they do their magic to make some return, the building-as-endowment is both a good deal financially and a way to help the nonprofit support its employees.

Beyond this, we would like to look into raising money through a low-interest bond, say for $100 million, to government and local investors, to fund the purchase of these houses, then using market based renters to pay off the bond. This way the buildings would slowly transition into debt-free Foundation Housing.   We have not tried this yet.

Lastly, and maybe most promisingly, there are people that are looking for new answers and participating in conversations like this. A number of people in the Bay Area are starting co-working spaces and group houses. The moment when these are being started can be a good time to set up a structure for working off debt and keeping it off, and then to use the benefits to perpetuate a mission. While still in formation, there seems to be interest from people like Jessy Kate and others. This could be helped by creating a Foundation Housing License that others could adopt or remix.

With about 10% of all employees in the US working in the non-profit sector, maybe we could hope for 5% of US housing to become Foundation Housing to provide stable, affordable housing for those dedicating themselves to service.

Let’s create more debt-free Foundation Housing for non-profit workers!

[Other pieces on this]

Posted in News | 2 Comments

You are Invited to a Party: Victory for the Net

The event was a success, with resulting video and press.

Dear Friend of the Open Internet,

JOIN US!

FCC Chairman, Tom Wheeler, wants to do something monumental: reclassify broadband access providers under Title II of the Communications Act.

Translation: we’ve made huge progress in the fight to protect the Open Internet. And it’s time to celebrate!

The Internet Archive & Electronic Frontier Foundation invite you to VICTORY FOR THE NET! An evening of celebration, conversation, and sharing what’s next. The party will be Thursday, February 26 at the Internet Archive, 300 Funston Avenue, San Francisco, from 6-9 p.m.

The FCC still has to vote on Chairman Wheeler’s proposal and we don’t know the exact details yet. What we do know is that we’ve all worked hard to get the agency on the right track at last. We’re not done yet, but we have a lot to celebrate.

We are joining hands with our friends and co-hosts from:
Free Press, 18 Million Rising, Center for Media Justice–home of the Media Action Grassroots Network, Common Cause, Daily Kos, Demand Progress, Fight For the Future, Media Alliance, Progressive Change Campaign Committee, Public Knowledge, San Francisco Bitcoin, San Francisco Mayor’s Office of Civic Innovation, The Greenlining Institute, and The Utility Reform Network to take stock of how far we have come, and where we are headed in the movement to protect the Open Internet.

Hope to see you next Thursday! RSVP Today!

Brewster Kahle
Founder & Digital Librarian
Internet Archive

Posted in Announcements, News | Tagged , | 24 Comments

What’s new with v2

As many of you have already seen, we are working on the next generation of the archive.org web site, which we call Version 2.0 (v2). It’s in beta right now, so go check it out!

Version 1 (v1) showing the banner to try the BETA Version 2 (v2)

We get a lot of feedback from the people who have elected to try out v2, and we read ALL of it. As themes emerge about what people are having trouble with, we make changes to the design and then we pay attention to subsequent feedback to try to gauge whether we solved the problem (or not).

Volume prepended to title

The goal of this redesign is to make the site more inviting and easier to use. Right now our work is focused on how the site looks and how things are organized on the page. For the most part, everything that is available to you in Version 1 (v1) of the site is available to you in v2 – but those things may be in different places!

Rights information displayed in About tab

We have a lot of long-time users of the site, and we know that any major changes will cause them to have to relearn where things are and how to accomplish the things they already know how to do on v1. This kind of major change can be very annoying, so we’re working hard to make sure you only need to relearn things once. While we will be adding more features as time goes by, we expect those changes to be incremental and not to affect the basic layout of pages.

If you’ve been using v2, you’ve probably noticed some changes over the last few weeks. I’ll discuss some of those changes here, and some of them are highlighted in the included images.

The collection About tab contains a longer description, info about contributors, and stats for reviews, forums, views and items

Volume information.  We have a lot of journals and books with Volume information that was not showing in search, collection or account pages. The volume information is now prepended to the title for easier visual scanning within a collection.

Live Music. Rights information for a collection is now displayed on the About tab. We also changed the way shows are described in band collections to list the date and venue before the band name, making it easier to visually scan the items in a collection.

Mobile. On most mobile devices we decreased the initial number of search results from 50 to 25 in order to lighten the page load time.

Collections Page

Go to list view for a collection and click the “Show details” checkbox

Collection description. The description area for the collection at the top of the page has been shortened. We encourage collection builders to add useful descriptions, and you can see the additional information in the new About tab.

Click to see additional collections for an item

About tab. The About tab replaces the Contributors tab. We wanted to have a place for all of the information about a collection, and “Contributors” didn’t cover it. The new About tab contains the longer description for a collection, rights information (when it exists), data about how many reviews and forum posts are in that collection, and the content from the previous Contributors tab – the collection creator, people who have added to the collection, and charts for Views and Items over time.  You will also find related collections listed on the About tab below the graphs. Parent collections and subcollections still show up in the Collections tab, since they are part of a collection’s direct hierarchy.

The See All Files page

Collection tab. The Collection tab has a few changes as well. In list view, you can now “show details” for each item if you want to see more information.

Item Pages

Additional collections. If an item belongs to more than one collection, you can choose to view those additional collections.

Upload tile on user account page

Stream only. When an item is not available for download, you will see a “Stream Only” notification where the “Download” button normally appears. We made some visual changes to this notification to make it seem less button-like.

Favorites list sorted by Date Favorited

See All Files. In the “see all files” view, “playable” media files are pushed to the top, just under the “all files” options for torrent and zip. Files are grouped logically, with the original first and bolded and the derivative files listed below.

User Account Page

Uploads. Your Uploads tab has a new “Upload” tile in it, just to make uploading easier to find. You can still upload from anywhere on the site by clicking the upload icon at the top of the page, of course.

Favorites. Your Favorites list (called bookmarks in v1) will now display your favorites sorted by “date favorited” so that you can see your most recently favorited items first.

Tell Us!

As always, please use the Beta feedback link in the top right corner to let us know what you think.  Is everything awesome?  Are you confused about where to find something?  Tell us!

If you’re interested in a more detailed running log of changes from our lead developer, Tracey Jaquith, you can get the “nerd version” here: https://archive.org/CHANGELOG.txt

This project receives support from the John S. and James L. Knight Foundation’s Knight News Challenge.

Posted in Archive Version 2 | Comments Off on What’s new with v2

Locking the Web Open, a Call for a Distributed Web

Presentation by Brewster Kahle, Internet Archive Digital Librarian, at the Ford Foundation NetGain gathering, a call from 5 top foundations to think big about prospects for our digital future.


Hi, I’m Brewster Kahle, Founder of the Internet Archive. For 25 years we’ve been building this fabulous thing, the Web. I want to talk to you today about how we can Lock the Web Open.


One of my heroes, Larry Lessig, famously said that “Code is Law.” The way we code the Web will determine the way we live online. So we need to bake our values into our code.

Freedom of expression needs to be baked into our code. Privacy should be baked into our code. Universal access to all knowledge. But right now, those values are not embedded in the Web.


It turns out that the World Wide Web is very fragile. But it is huge. At the Internet Archive we collect 1 billion pages a week. We now know that Web pages only last about 100 days on average before they change or disappear. They blink on and off in their servers.


And the Web is massively accessible, unless you live in China. The Chinese government has blocked the Internet Archive, the New York Times, and other sites from its citizens. And so do other countries every once in a while.


So the Web is not reliable. And the Web isn’t private. People, corporations, countries can spy on what you are reading. And they do. We now know that Wikileaks readers were targeted by the NSA and the UK’s equivalent. We, in the library world, know the value of reader privacy.


But the Web is fun. We got one of the three things right. So we need a Web that is Reliable, Private but is still Fun. I believe it is time to take that next step. And it’s within our reach.

Imagine “Distributed Web” sites that are as functional as WordPress blogs, Wikimedia sites, or even Facebook. But how?


Contrast the current Web to the internet, the network of pipes that the World Wide Web sits on top of. The internet was designed so that if any one piece goes out, it will still function. The internet is a truly distributed system. What we need is a Next Generation Web; a truly distributed Web.


Here’s a way of thinking about it: Take the Amazon Cloud. The Amazon Cloud works by distributing your data: moving it from computer to computer, shifting machines in case things go down, getting it closer to users, and replicating it as it is used more. That’s a great idea. What if we could make the Next Generation Web work like that, but across the entire internet, like an enormous Amazon Cloud?

In part, it would be based on peer-to-peer technology: systems that aren’t dependent on a central host or the policies of one particular country. In peer-to-peer models, those who are using the distributed Web are also providing some of the bandwidth and storage to run it.

Instead of one web server per website we would have many. The more people or organizations that are involved in the distributed Web, the safer and faster it will become. The next generation Web also needs a distributed authentication system without centralized log-in and passwords. That’s where encryption comes in.
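One way to picture “many servers per website” is a toy model in which each page is stored on several peers under the hash of its contents, so a reader can fetch it from whichever peer is still up and check that it was not altered. This is a minimal sketch only: content addressing with SHA-256 is our assumption for illustration, the talk does not prescribe a mechanism, and the `Peer`, `publish`, and `fetch` names are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


class Peer:
    """A hypothetical participant that donates some storage to the distributed Web."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address of a page is the SHA-256 hash of its bytes.
        key = hashlib.sha256(data).hexdigest()
        self.store[key] = data
        return key

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)


def publish(peers: List[Peer], data: bytes) -> str:
    """Replicate the page on every peer; all peers compute the same address."""
    keys = {peer.put(data) for peer in peers}
    assert len(keys) == 1
    return keys.pop()


def fetch(peers: List[Peer], key: str) -> Optional[bytes]:
    """Return the first copy whose hash matches the requested address."""
    for peer in peers:
        data = peer.get(key)
        if data is not None and hashlib.sha256(data).hexdigest() == key:
            return data
    return None


peers = [Peer("a"), Peer("b"), Peer("c")]
original = b"<h1>Hello, distributed Web</h1>"
address = publish(peers, original)

peers[0].store.clear()                 # one host goes offline
peers[1].store[address] = b"tampered"  # another serves altered content
page = fetch(peers, address)           # the reader still gets the real page
```

In this toy model the page survives the loss of one peer and the corruption of another, which is the reliability property the talk is asking for. Authentication and reader privacy would need more than this (for example public key signatures), and are not shown.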


And it also needs to be Private, so no one knows what you are reading. The bits will be distributed across the Net, so no one can track you from a central portal.


And this time the Web should have a memory. We’d build in a form of versioning, so the Web is archived through time. The Web would no longer exist in a land of the perpetual present.
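To make “a form of versioning” concrete, here is a small sketch of a page that keeps every saved version and can answer “what did this look like at time T?”. It is an illustration under our own assumptions (integer timestamps, versions saved in increasing time order); the `VersionedPage` class is hypothetical and is not how any existing archive is implemented.

```python
import bisect
from typing import List, Optional


class VersionedPage:
    """Keeps every version of a page instead of overwriting the previous one."""

    def __init__(self) -> None:
        self._times: List[int] = []
        self._bodies: List[str] = []

    def save(self, timestamp: int, body: str) -> None:
        # Assumption: versions arrive in strictly increasing time order.
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("versions must be saved in increasing time order")
        self._times.append(timestamp)
        self._bodies.append(body)

    def as_of(self, timestamp: int) -> Optional[str]:
        """Return the version that was current at `timestamp`, if any."""
        index = bisect.bisect_right(self._times, timestamp)
        return self._bodies[index - 1] if index else None


history = VersionedPage()
history.save(100, "first draft")
history.save(200, "revised")

before_publication = history.as_of(50)   # nothing existed yet
between_versions = history.as_of(150)    # the first draft was current
latest = history.as_of(10_000)           # the most recent version
```

Because old versions are never discarded, a reader can always step back in time rather than seeing only the “perpetual present”.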

Plus it still needs to be Fun, malleable enough to spur the imaginations of millions of inventors. How do we know that it can work? There have been many advances since the birth of the Web in 1992.


We have computers that are 1000 times faster. We have JavaScript that allows us to run sophisticated code in the browser. So now readers of the distributed web could help build it. Public key encryption is now legal, so we can use it for authentication and privacy. And we have blockchain technology that enables the Bitcoin community to have a global database with no central point of control.


I’ve seen each of these pieces work independently, but never pulled together into a new Web. That is what I am challenging us to do.

Funders, leaders, and visionaries: this can be a Big Deal. And it’s not being done yet! By understanding where we are headed, we can pave the path.


Larry Lessig’s equation was Code = Law. We could bake the First Amendment into the code of a next generation Web.

We can lock the web open.
Making openness irrevocable.
We can build this.
We can do it together.


Delivered February 11, 2015 at the Ford Foundation-hosted gathering: NetGain, Working Together for a Stronger Digital Society

Posted in Announcements, News | Tagged , , , , , | 14 Comments

Internet Archive Supports Critical Updates to Electronic Privacy Law in California

The California Electronic Communications Privacy Act (CalECPA), a newly introduced bill in California, would help bring state law up to date and require law enforcement to get a warrant before searching private online accounts or personal electronic devices. The Internet Archive is pleased to join a long and diverse list of organizations and companies supporting CalECPA. To learn more, see write-ups by State Senator Mark Leno’s office, the ACLU of California, and the Electronic Frontier Foundation.

Posted in News | Comments Off on Internet Archive Supports Critical Updates to Electronic Privacy Law in California

$4 Million Available for Digitization in 2015: Application Deadline is April 30th. Let’s Apply Together!

Internet Archive wants to partner with you to bring your ‘Hidden Collections’ into the public domain and become part of a global digital library!

The Council on Library and Information Resources (CLIR) with generous support from the Andrew W. Mellon Foundation has launched Digitizing Hidden Special Collections and Archives: Enabling New Scholarship through Increasing Access to Unique Materials.

This competition will award up to $4 Million to institutions, consortia and collaborative groups to digitize and provide access to collections of rare and ephemeral material with high scholarly value.

CLIR intends that “Digitizing Hidden Collections will enhance the emerging global digital research environment in ways that support new kinds of scholarship for the long term, ensuring that the full wealth of resources held by institutions of cultural memory becomes integrated with the open Web” (http://www.clir.org/hiddencollections/about-the-program). The focus of these grants is to bring entire collections into the public domain, while promoting strategic partnerships and best practices for ensuring preservation and accessibility that is both stable and enduring.

Grants of between $50,000 and $250,000 for a single-institution project, or between $50,000 and $500,000 for a collaborative project, may be sought for work that begins between January 1st and June 1st, 2016 and is completed by May 31st, 2019. (http://www.clir.org/hiddencollections/applicants)

How Can the Internet Archive Digitization Team Help?

Let’s Cooperate on Your Grant Together – marry your great content with our end-to-end digitization skills to get your content up online safely and inexpensively.

We offer a Total Digitization Solution. Starting with non-destructive image capture, to storage and preservation, and ending with online discovery and access, our digitization solution saves you from having to worry about these details.

Translatable Metadata. Our existing relationship with the Digital Public Library of America provides a possible route for your materials to join DPLA’s growing national collection.

Our Global Team Digitizes over 1000 eBooks and items every day. No need to reinvent the wheel. With our experience, training and engineering skills, we supply an end-to-end solution that allows our library partners and content contributors to focus on developing their collections, not on the back end details. For those new to digitization, we have the skills to help you avoid the common and costly mistakes of starting up a project.

We Don’t Just Digitize Books! Over the last decade, our format capabilities have expanded to: archival finds/ ephemera; microfilm and microfiche; audio; film and video; TV News; software and web. Let’s also apply together for grants to digitize other formats!

Many of Our Partnerships Have Been Consortial. We are proud to have driven projects for the Boston Library Consortium (BLC), LYRASIS, Consortium of Academic Libraries in Illinois (CARLI), Biodiversity Heritage Library and Ontario Council of University Libraries (OCUL), among others. This means collections can be contributed by more than one institution, with funding issued centrally and distributed locally.

Far-flung Collections Come Together With Internet Archive. Our collections gather material from international contributors in one place; in the public domain. In some cases this has meant repatriating material digitally across great distances. Highlights include collections from the Medical Heritage Library, Biodiversity Library and Genealogy (in collaboration with FamilySearch).

Preparing Your Grant—What can Internet Archive Do?

Large and Small-Scale Digitization Capabilities. Take advantage of our experience working with collection sizes – ranging from hundreds of thousands of items to unique collections with only dozens of one-of-a-kind monographs.

We Can Tailor The Project to Your Needs. Having worked with over 1275 content providers during the last decade, our processes can be adjusted to meet your requirements.

Our Equipment and Software Have Been Tested and Proven. Our non-destructive digitization process can be done inside your library by IA staff, or in one of our regional centers. The images can even be captured by you! We have a new Table Top Scribe system that can be purchased if your institution wishes to do the image capture in-house. It is portable, easy to use, and uploads material directly to archive.org. Our service package provides the technical back-end processes, including preserving and ‘future-proofing’ your digital data for 25 years, and organizing your collections online so they can be discovered and used for scholarly research.

Our Digitization Specifications Have Become the De Facto Library Standard. Over 1,500 global libraries have used our services to digitally preserve, and importantly, make their material accessible. Our partners include 25 of the top 30 largest research and national libraries in North America.

Our Staff is located in 33 Locations, Including 26 Sites in North America. With this geographic footprint, your materials don’t have to travel far if you choose to have them digitized in one of our specialized digitization centers. This also provides opportunities to submit a grant proposal where the content might be located in 2 or 3 different libraries.

Let’s think big and make collections vital for scholarship and cultural heritage available to the world!

Want to know more? Attend the upcoming webinars for applicants on February 4th and March 4th, 2015 from 2-3pm Eastern Time (https://clir.adobeconnect.com/_a960001693/hiddencollections/). We look forward to the resulting conversations, and we hope to see you there!

For more information about working with Internet Archive, contact Robert Miller.

Posted in Books Archive, Hardware, News | 3 Comments