The Center for Intelligent Information Retrieval at UMass Amherst, the Perseus Digital Library Project at Tufts, and the Internet Archive are investigating large-scale information extraction and retrieval technologies for digitized book collections. The NSF has awarded a $2.7 million grant for a project that will apply advanced OCR, topic modeling, and metadata extraction techniques to more than one million books at the Internet Archive.