Description

The Book Commons, overview

The following excerpt is from an article putting the Google project in context. Google aims to digitize the massive collections of the main American academic and public libraries.

"Placing full text book material is not a new idea on the web. Many services, both free and fee-based, allow you to access books online. The longest running such service is Project Gutenberg, founded by Michael Hart in 1971, with over 13,000 books available. I wrote about The Online Books Page for SearchDay last year. This wonderful collection has been online for more than 10 years, and currently provides searchable access to over 20,000 free full text books. The OBP is edited by John Mark Ockerbloom, a digital library planner at the University of Pennsylvania. The Internet Archive is also digitizing books. The goal of the Million Book Project is to "create a free-to-read, searchable digital library the approximate size of the combined libraries at Carnegie Mellon University, and one much bigger than the holdings of any high school library." One publisher that offers a large portion of their new and old material available online, free, searchable, and full image is The National Academy Press. They currently offer access to more than 3000 publications. There are also two fee-based services. NetLibrary offers access to about 76,000 books, with about 1300 new titles added each month. You can access NetLibrary books through your local public or university library, often at no charge. ebrary provides access to more than 50,000 titles (books, maps, sheet music, etc). Like NetLibrary, ebrary licenses their service to libraries and educational organizations and users can login and access via any computer with web access, in most cases for free."

Discussion

Tim O'Reilly:

"Having various book search engines competing to build a proprietary online book repository seems silly to me. It also doesn't seem to be working. (For example, a quick scan of Amazon's bestseller list shows only 5 out of the top 25 books are "search inside" enabled.)

Book search is a big problem, and it could be solved much faster if the various vendors involved would cooperate rather than compete. Web search demonstrates that there are other grounds for competition than getting a lock on some exclusive body of content. (One might suggest that the race ought to be to be the first company to figure out how to do effective relevance matching for advertising on book search.)

A related issue was also brought out in the Booksquare blog: "...scanning is indeed how Microsoft is getting published works into its database. Even if your work is already in electronic format."

As every reader of this blog ought to know, I'm a big fan of the Google library project, which is cutting the Gordian knot of orphaned works for which publishers no longer know the ownership. Scanning makes sense for these books. But it doesn't make sense for books that are already available in some kind of electronic format. The most advanced publishers already have their books in an XML repository, but even the most backwards have at least PDFs that could be searched.

Three things ought to happen to speed up the development of the book search ecosystem:

1. Book search engines ought to search publishers' content repositories, rather than trying to create their own repository for works that are already in electronic format. Search engines should be switchboards, not repositories.

2. Publishers need to stop pretending that "opt in" will capture more than a tiny fraction of the available works. (I estimated that only 4% of books ever published are being commercially exploited.)

3. Book search engines that are scanning out of print works in order to create a search index ought to open their archives to their competitors' crawlers, so readers can enjoy a single integrated book search experience. (Don't fight the internet!)" (http://radar.oreilly.com/archives/2006/12/book_search_sho.html)

Status Report 2007

Peter Suber:

"Book-scanning projects grew significantly in 2007. The Google Library Project added the University of Texas at Austin, Princeton University, the Bavarian State Library, the University of Lausanne, Ghent University, Keio University, Cornell University, Columbia University, five Catalonian libraries including the National Library of Catalonia (Biblioteca de Catalunya), and the 12 research institutions in the US Committee on Institutional Cooperation. As in 2006, the legal suspense created by unresolved lawsuits against Google by author and publisher organizations did not stop new institutions from joining the program, although it did cause most of them to limit their participation to public-domain books. Nor did it stop McGraw-Hill, one of the plaintiffs, from adding a Google Book Search box to its Web site.

The Open Content Alliance not only expanded to 80 contributing libraries, but attracted libraries that made a point of saying publicly that they would rather pay their own digitization costs and have the OCA's openness than to have Google pay the costs and restrict the use of the resulting ebooks. Among those were the 19 institutional members of the Boston Library Consortium, including the MIT Libraries. Students at New York University asked their institution to join the OCA instead of Google for the same reason. Objecting that Google restricted use of its scanned public-domain books, Philipp Lenssen, a German programmer, liberated 100 of them by posting them to his free book site, Authorama.

Google and the OCA both enhanced their offerings as well. Google (finally) added plain-text editions to the scanned images of some of its digitized public-domain books and officially revealed its journal backfile digitization project, which quietly launched in 2006. The University of Michigan made its Google-scanned books OAI-compliant. The OCA launched a working demo of its Open Library, and described plans for a wiki-like universal catalog, online annotated bibliographies of its scanned OA books, and a program to digitize and lend orphan works, its first foray beyond public-domain books.

The European Parliament blessed the European Digital Library and urged it to speed up. Project Gutenberg launched PG Canada. The Million Book Project reached the milestone of 1.5 million digitized books, and LibriVox reached the lesser milestone of its 1,000th free online audiobook. Microsoft began digitizing more than 100,000 books from the British Library and another 100,000 from the Yale University Library. The Sloan Foundation gave the Library of Congress $2 million to digitize thousands of rare and brittle public-domain books for open access. A new major book-scanning project joined the existing players when Kirtas Technologies, maker of a book-scanning machine, teamed up with Amazon's print-on-demand (POD) subsidiary, BookSurge, to digitize rare public-domain books and sell POD editions through Amazon.

The beautiful synergy of open access and print-on-demand wasn't born in 2007 but began to spread quickly in 2007. The Kirtas-Amazon book-scanning project is based on the sales of POD editions. The first three universities to take part, Emory, Maine, and Cornell, promised to provide open access to their copies of the digital editions. Rice University Press will not only publish its own monographs in OA/POD editions, but also monographs vetted and approved by Stanford University Press. Hamburg University Press decided that all its scientific publications will be OA/POD, and later agreed to produce OA and POD editions of the works published by Schleswig-Holstein state archive. Six European university and museum presses launched a consortium, OAPEN (Open Access Publishing in European Networks), for publishing OA monographs with POD editions. The Public Domain Books Reprints Service began to sell POD editions of public-domain ebooks from the Internet Archive and Google.

We saw new series of OA books, some with and some without priced, print editions, from the University of Michigan Press (Digital Cultural Books), the University of California Press (FlashPoints), MIT Press (Digital Media and Learning), the Canadian Federation for the Humanities and Social Sciences, the American Museum of Natural History, the Open Knowledge Foundation, and a partnership between Harvard's Berkman Center for Internet and Society and the international Center for Computer Assisted Legal Instruction. The Canadian Library Association will consider open access for all its new monographs, case by case. The Université Libre de Bruxelles decided to provide open access to all its out-of-print books, while Ilmenau Technical University decided, conversely, to provide open access to all its new books. The University of Pittsburgh Press decided to provide open access to all its back titles and open access to all new titles after a maximum two-year moving wall.

Polimetrica released an Open Access Manifesto, the first ever from a book publisher. The American Association of University Presses released a Statement on Open Access, calling on presses to experiment with OA monographs. The UK National E-Books Project started providing free ebooks to UK universities; Pakistan's Higher Education Commission, blending subsidized priced access and open access, started providing free ebooks and ejournals to Pakistani universities.

When Springer announced that more than 29,000 of its books had been indexed by Google Book Search, it made a point of saying that the enhanced visibility boosted sales of its older titles. The first Chinese book publisher to join Google's Publisher program, Cite Publishing Holding Group, explained that it expected Google indexing to increase its sales. Tim O'Reilly published a detailed case study (O'Reilly Radar, June 1, 2007) of how the OA edition of an O'Reilly title affected the sales of the print edition. Eric Von Hippel explained to an interviewer (MIT Libraries News, April 9, 2007) how the OA editions of two of his books increased sales of the print editions. Publishers of the novels nominated for this year's Man Booker Prize considered a proposal to publish OA editions of the nominated books." (http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0011.110)

More Information

The Top Ten Places to Get Free Books Online, at http://www.friedbeef.com/2007/04/02/top-10-best-places-to-get-free-books-part-1/

Top 20 Places for Free eBooks, http://gilygily.com/free-ebooks-online-20-best-websites-to-download-ebooks-for-free.html . See the comments where about 80 people add their own suggestions.

Sharing Directory for books & writing, http://meshing.it/categories/2-books-writing

More information also at:

50 places to read free books, [1]
The Online Books page, http://digital.library.upenn.edu/books/
Netlibrary, http://legacy.netlibrary.com/about_us/company_info/index.asp
Million Book Project, http://www.archive.org/texts/collection.php?collection=millionbooks&PHPSESSID=45464c8f5c3a66d010a78ff7efe0c5c8
Project Gutenberg, http://www.gutenberg.org/
Open Source Books, http://www.archive.org/texts/collection.php?collection=opensource
Open Library [2]

Check our entry on Book Swapping sites.
