First Monday:

The Google Books and Open Content Alliance (OCA) initiatives have become the poster children of the access digitization revolution. With their sights firmly set on creating digital copies of millions upon millions of books and making them available to the world for free, the two projects have captured the popular imagination. Yet, such scale comes at a price, and certain sacrifices must be made to achieve this volume. With its greater visibility, most studies have focused on Google Books, addressing limitations of its image and metadata quality. Yet, there has been surprisingly little comparative work of the two endeavors, exploring the relationship between these two peers and their deeper similarities, rather than their obvious surface differences. While the academic community has lauded OCA’s “open” model and condemned the proprietary Google, all is not always as it seems. Upon delving deeper into the underpinnings of both projects, we find Google achieves greater transparency in many regards, while OCA’s operational reality is more proprietary than often thought. Further, significant concerns are raised about the long–term sustainability of the OCA rights model, its metadata management, and its transparency that must be addressed as OCA moves forward.

Comments (1)

Mary Murrell on October 16, 2008 12:01 AM:

The First Monday article has many factual errors that may give readers an inaccurate impression of the Internet Archive and Open Content Alliance contributors. I am a Berkeley anthropology PhD student doing field study at the Internet Archive. I study the Internet Archive's book project carefully and am aware of much of the public information about Google's project. Whereas a piece to correct the errors is probably called for at some future date, I wanted to bring up just a few to illustrate that there is an overall problem here. Although Kalev Leetaru's article has appeared in a “peer-reviewed journal,” it does not appear to have been adequately researched or even fact-checked.

First off, Leetaru didn't bother to discover the difference between the Open Content Alliance members and the Internet Archive (they are not synonymous). The author notes that 100,000 books have been scanned by OCA; in fact, OCA has not scanned any books. The Internet Archive has scanned 400,000 (not 100,000, as Leetaru states) and hosts over 500,000, with the additional 100,000 drawn from the Million Books Project, the Gutenberg Project, and other donations by end users. Second, he misstates the technical processes he is analyzing in such a way that it appears he doesn't actually understand them, which creates a mistaken impression of the accessibility of the Archive’s scanned books. Third, he faults the OCA for not making more of its backend processes available, yet he shows no due diligence in seeking that information out, say, through a simple phone call to Archive director Brewster Kahle himself, who was not consulted for the article. It is impossible for me to believe that the Internet Archive is less willing to make information about its policies and processes available than Google, as Leetaru claims.

The Archive has built a variety of open source resources and tools, both hardware and software: storage computers known as “petaboxes,” a publicly viewable book scanner (the Scribe), and its website OpenLibrary.org, whose entire source code is available to the public. That Leetaru's article didn't even mention OpenLibrary can be taken as evidence that he didn't do the legwork required to accurately represent the herculean efforts of the Internet Archive and the Open Content Alliance contributors to build an open, searchable digital public library.

The Archive is a small non-profit with very limited resources. It lacks a communications department and PR staff, which perhaps would have made Leetaru’s research easier. But this does not mean that it is not an open and transparent operation. An honest and forthright inquiry into the organization would have shown otherwise.

The Archive hosts an open lunch every Friday where it discusses its business in front of whoever chooses to join. Perhaps Leetaru should visit one day and learn all that he does not know about the Internet Archive and the Open Content Alliance.

A book in progress by

Siva Vaidhyanathan

This blog, the result of a collaboration between myself and the Institute for the Future of the Book, is dedicated to exploring the process of writing a critical interpretation of the actions and intentions behind the cultural behemoth that is Google, Inc. The book will answer three key questions: What does the world look like through the lens of Google? How is Google's ubiquity affecting the production and dissemination of knowledge? And how has the corporation altered the rules and practices that govern other companies, institutions, and states?
