Over at Techdirt, Tim Lee assures us that most of the problems that Google Book Search critics have cited will be worked out through the magic of competition. After all, he reminds us, there are other services like the OCA, and the contracts with universities do not forbid other digitization efforts:
These are somewhat puzzling concerns to raise at all given that Google has historically been absolutely obsessive about improving the quality of its search results and archiving useful data. But it also ignores a more fundamental point: Michigan, and Google's other library projects, aren't granting Google exclusive access to anything. Under the terms of the Google-Michigan agreement, Google returns each book after scanning it, and Michigan is free to sign up with other scanning projects, including Google's competitors. It's true that Michigan has agreed not to share the Google-created digital files with others. But the important point here is that those files wouldn't exist at all if not for the agreement. It would hardly be reasonable to expect Google to spend tens of millions of dollars to create digital files that would immediately be available to Google's competitors.
In short, Google is anything but a monopoly. There are already competing book-scanning efforts under way, and if Google's project is a success we can expect more such efforts to be launched in the future. And because Google isn't a monopoly, it doesn't make sense for universities to treat it like one by trying to micromanage every aspect of the service it ultimately offers. In the unlikely event that Google Book Search turns out to be a lousy product, consumers will punish Google by switching to the competing offerings of Microsoft, Yahoo, or others. It's pointless to try to force Google to produce a high-quality product when its competitors already give it plenty of reasons to do so.
Vaidhyanathan also characterizes the Michigan scanning program as "massive corporate welfare," but this, again, doesn't make a lot of sense. The vast majority of the books Google is scanning spend most of their time sitting on shelves unread. In principle, Google is no different from any other library patron: it checks out books, reads them, and returns them. The only difference is that it's doing it on a much larger scale than a normal library patron would. But there's no evidence that Michigan has been playing favorites. If another company approaches Michigan seeking to scan its books on the same terms, and is turned down, then people would have strong grounds for criticism. But that doesn't appear to have happened. Google's just made the best offer so far. The "corporate welfare" label just doesn't fit.
Tim is a very smart guy. But I am afraid he is being way too optimistic about the market forces at work here -- both in terms of their positive and negative effects.
I have outlined my issues in various places far better suited for subtle arguments than the Web. And I will continue to do so. But let me just summarize some of these here.
Let's be clear here. Michigan and the other libraries control archives worth US$ billions. They don't need Google. Google needs the archives to enlarge its commercial venture. So it's clearly corporate welfare. The libraries are only in this for expediency -- not a core value of librarianship.
In addition, the Michigan contract is one of the few in which the university gets a complete archive of its collection. Most of the libraries in the project get only slivers of their collections back in digital form -- and only when Google decides to give the files to them. The most recent contracts stipulate that the libraries may not make the digital files available until Google gives an OK. Google has all the power, yet the libraries have all the resources. This is a very bad deal. All of these restrictions work against the ethics and ideology of public and academic libraries.
That's why other libraries are considering working with OCA. And it's why so many university librarians have real complaints about Google's heavy-handed approach to the project. Google operates in the dark. Libraries are about light.
And as far as exclusivity: there is exclusivity de jure and exclusivity de facto. The competition Tim mentions, such as the OCA, is very small compared to Google. It is not-for-profit. And it is non-exclusive in the best way: its files will be findable by any search engine, not just the market leader. OCA has very little cash to spend on this project. In addition, many if not most in-house and consortia-led library digitization projects, done with the care and comprehensiveness that libraries have demonstrated for 600 years, have all been stopped thanks to Google's "crowding out" effect. Saying OCA is real market competition for Google ignores scale. It ignores the specifics of Google itself.
So Google, by virtue of its size, status, and market power, has an effectively exclusive deal. Once a university has committed its staff and space to Google, it is far too expensive to consider a better service. Google has the wealth to make an offer that many libraries can't refuse. In the short term, and on paper, it seems great. But it's barter: goods for services. And the goods are worth so much more than the service. So in the long term, it's not so good.
Tim also cites what he considers a sterling record of quality improvement in Google's core services. He should recognize, however, that creating a PageRank-type index for hyperlinked text is easy compared to doing such a thing with documents that lack metadata on the text itself. Full-text search without good metadata is folly. Web search and book search are two completely different projects. Book search is much more challenging and so far a complete failure. Google has no incentive to improve it if no one else offers anything close.
Plus, consider this paradox that I will write about at length soon: Google's rankings themselves increasingly serve as a proxy for quality. That makes it harder to complain that Google's rankings are bad. How can you claim that a book on the fourth page of a Google search is more relevant or more factual than one on the first? After all, the mark of relevance is a high Google rank!
On the Web, at least, we can have faith in the wisdom of the crowd (if you choose to live that way). You can assume that bad pages will drop over time and good pages will rise. That's sort of true. But it's not really true. The myth persists, of course. But in the Book Search, the crowd has no way of expressing preferences. The web of scholarly citation does that for a small number of texts, but not in a way that Google can harness for its larger projects.
Complicate all this with the fact that such blind faith in Google's commitment to quality ignores the basic historical principle that past results do not mean anything to future ventures (see Hume, Popper, or the current real estate market), the clear sense on Wall Street and elsewhere that Google is overextending itself, the fact that Google no longer answers to some egalitarian muse but to shareholders, and the fact that Google does not work for us (and never did). This is about accountability.
Google is a company. It should be acting in its own interest. Libraries serve and answer to the public. Conflating these missions is a grave mistake. I have little patience for arguments that idealize Google's egalitarian mission in the world (which Tim does not do). I don't believe in "corporate responsibility." Corporations should be responsible to their shareholders -- end of list. That's why we need strong unions and appropriate anti-trust and regulatory structures to mitigate negative externalities (like pollution), correct for market failures, and temper the occasional maldistribution of resources.
I have more patience for people who trust that Google still answers to market pressures the way it did on the way up. In this project I hope to convince Tim and others that Google's success this decade has insulated it from much of the market feedback that made it such a brilliant company in the first place.
But more importantly, I hope to convince Tim and others that trusting invisible, proprietary technology to make good judgments about fundamentally human concerns (like which book about Rwanda tells the story best) is a very bad idea.
Consider all that in the light of the fact that Google cannot win in court against these publishers. Plus, snippets are useless research tools!
But to Tim's core argument: Google's market power creates a form of "soft exclusivity" that we should not ignore. Now, once again we could point to all the warnings of Microsoft's market power in the 1990s and Standard Oil's market power in the 1910s. The lesson from them is clear: without fervent criticism, activism, creativity, and legal intervention, the behavior of these market giants would have continued unchecked by themselves or others. Besides, the two richest companies in the world remain ExxonMobil (grandchild of Standard Oil) and Microsoft. Linux is great. But to claim it's real competition in a grand sense ignores reality.
Comments (4)
You say: "the Michigan contract is one of the few in which the university gets a complete archive of its collection. Most of the libraries in the project get only slivers of their collections back in digital form."
Are you saying that only part of their collection is being digitized, or that they are only receiving a portion of the digital files? And if only a portion of the collection is being digitized, do you know if that's a library choice or a Google decision? (I've heard that Google is only digitizing 10% of the U of Cal material, but I don't know how or why that decision was made.)
Here are my questions...
Has a university digitization project been stopped or altered as a result of joining the Google project? Has joining the Google project precluded joining other projects?
I've been told both yes and no to the first question from different people at the same university. I have my own ideas about that, but if I get a chance to look at the matter closer, I will. I suspect you have your own observations there. ^_^
Without commenting on whether people are correct or not, I don't believe that all librarians are joining because of expediency per se. They're joining because they believe it's the best way of increasing access to the material. In that sense, maybe, expediency could be considered a value. ^_^
It is expedient to participate, and expediency is a core value of librarianship. Our library strives to make information available to our community as quickly as possible. Google can make a large scale electronic text collection available more quickly than we can do it ourselves.
I can't think of any situations where joining the Google project might preclude a library from joining any other projects other than just not being able to take on more than one project. It's a labor-intensive and resource-intensive effort to participate that requires external or internal underwriting that isn't often available.
Participation in GBS can absolutely cause a Library to rethink its own book digitization strategies. We've been digitizing books for fifteen years at UVA. We're not stopping, but we are going to focus more on our rare and unique volumes. That's a collection development and resource issue -- if our general collections will be made accessible through GBS, we should focus our local efforts on our special collections and on materials where digitization plays a role in preservation.
I cannot deny that a large percentage of volumes are only available in snippet or metadata view. Not as useful as full-text, sure, but not useless. If I can find that a book has something potentially useful for my work through a snippet view where I did not even know that the book existed or that it has any relevant content, then that's more than I knew before. Then InterLibrary Services can get that physical book for me. What's useless about that?
Nice response, Siva. Here's a "magic market" assumption I can't pass up highlighting:
"It would hardly be reasonable to expect Google to spend tens of millions of dollars to create digital files that would immediately be available to Google's competitors."
Um, last I checked Google has no clear "right" to be making these copies. I'm happy they're making them. But maybe part of the price of granting that right should be their willingness to make them available, if not to competitors, at least to the type of public archive that should have been doing this thing in the first place.