This is too American in its voice and perspective. Anyone have advice for me?
4.1 The Googlization of Universities
The Tangled Relationship
The relationship between Google and the universities of the world is more than close. It is uncomfortably familial. In recent years Google has moved to establish, embellish, or replace many core university services such as library databases, search interfaces, and email servers. Google's server space and computing power have opened up new avenues for academic research. One experiment, Google Scholar, has allowed non-scholars to discover academic research they might never have stumbled upon. And Google Book Search has radically transformed both the vision and the daily practices of university libraries. Through its voracious efforts to include more of everything under its brand, Google has fostered a more seamless, democratized, global, cosmopolitan information ecosystem, yet it has also contributed to the steady commercialization of higher education and the erosion of standards of information quality.
All of this occurred at a time when cost pressures on universities and the students they serve have spiked and public support for universities has waned. Google has capitalized on a "public failure." In contrast to a "market failure," a "public failure" is the phenomenon in which an erosion or retreat of state commitment and resources to a public good or need reveals an opportunity for an ambitious firm to assume control of a service and fold it into a market advantage. Understandably, the ubiquity of Google on campus has generated both opportunity and anxiety. Unfortunately, universities have allowed Google to take the lead and set the terms of the relationship. This chapter argues for a reversal of that trend. Universities must assert their values and interests on Google as the company assumes greater control over many aspects of information distribution.
A Common Culture
Universities gave birth to Google. So there is a strong cultural affinity between Google's corporate culture and that of academia. Founders Sergey Brin and Larry Page met each other while pursuing Ph.D.s in computer science at Stanford University. Page did his undergraduate work at the University of Michigan and retains strong ties with that institution. Many of the most visionary Google employees, such as University of California at Berkeley economist Hal Varian, suspended successful academic careers to join the company. The foundational concept behind Google Web Search, PageRank, emerged from an academic paper that Brin and Page authored together and published in 1999. So it's not surprising that Google's corporate culture reflects much of the best of academic work life: unstructured work time, horizontal management structures, multidirectional information feedback flows, an altruistic sense of "mission," recreation and physical activity integrated centrally into the "campus," and an alarmingly relaxed dress code. For decades American universities have been instructed to "behave more like businesses." In the case of Google, we see a stunningly successful firm behaving as much like a university as it can afford to behave.
Peer review -- the notion that every idea, work, or proposition is contingent, incomplete, and subject to criticism and revision -- is the core value that Google incorporated from academia. This devotion to peer review is not particular to Google. All open source or free software projects and much of the proprietary software industry owe their creative successes and quality control systems to the practices of peer review. In fact, the entire Internet is built on technologies that emerged from peer-review processes. But Google, much more than the other major firms engaged in widespread and public distribution of software and information, owes its very existence to an explicit embrace of peer review.
Google owes its success to the dominance of its Web search engine and the ability of the company to run simple auctions to place paid advertising spots along the side of seemingly organically generated search results. When you type "shoe store" into a Google search box, Google's PageRank algorithm sorts through Web pages that contain the phrase "shoe store" and ranks them for you based on the number of other pages that link to those pages. PageRank weights some sources of incoming links higher than others. The result, which takes mere seconds, is a stark list of sources based on relative popularity. Popularity stands in for quality assessment. This is not merely a vulgar, market-based value at work. The same principle guides academic citation review systems. Google's founders were working on citation analysis projects when they came up with the idea of applying such a system to the chaos that was the World Wide Web.
Of course, the principle of "bibliometrics," or determining the value of a work by its echoes in others' citations, is a controversial and troublesome topic within academic culture. Bibliometrics has been widely used in the sciences for decades, but its expansion to measure the presumed "impact" or "value" of scholarship within the humanities has generated widespread criticism, as much of the best work in those fields is published in books rather than in a stable set of indexable journals.
Nonetheless, bibliometrics turned out to be a highly effective method of filtering and presenting Web search results. As Harvard Law professor Yochai Benkler explains, Google became the market leader among search engines by outsourcing editorial judgment to the larger collective of Web authors (or, as Benkler puts it, "peer producers"). It did so in the late 20th century, when every other search engine used some combination of embedded advertising (site owners paid for good placement within searches) and "expert" judgment (search engine staff determined whether a site was worthy of inclusion in the index):
The engine treats links from other websites pointing to a given website as votes of confidence. Whenever one person's page links to another page, that person has stated quite explicitly that the linked page is worth a visit. Google's search engine counts these links as votes of confidence in the quality of that page as compared to other pages that fit the basic search algorithm. Pages that themselves are heavily linked-to count as more important votes of confidence, so if a highly linked-to site links to a given page, that vote counts for more than if an obscure site links to it. By doing this, Google harnessed the distributed judgments of many users, with each judgment created as a by-product of making his or her own site useful, to produce a highly valuable relevance and accreditation algorithm.
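The link-as-vote ranking that Benkler describes can be sketched in a few lines of Python. This is only a simplified illustration of the published PageRank idea, not Google's production system. The page names and the tiny `web` graph are hypothetical, and the damping value of 0.85 and the fixed iteration count are conventional assumptions rather than anything stated in this chapter.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages from a dict mapping each page to the pages it links to.

    A simplified sketch: every page splits its current score evenly among
    the pages it links to, so a link from a high-scoring page is a
    weightier "vote" than a link from an obscure one.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Small baseline score that every page receives regardless of links.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Pass this page's score along its outgoing links.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for target in pages:
                    new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical miniature Web: each shoe store has exactly one incoming link,
# but one comes from an obscure blog and the other from a heavily linked portal.
web = {
    "obscure_blog": ["shoe_store_a"],
    "popular_portal": ["shoe_store_b", "news_site"],
    "news_site": ["popular_portal"],
    "fan_page": ["popular_portal"],
    "shoe_store_a": [],
    "shoe_store_b": [],
}

scores = pagerank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page:15s} {scores[page]:.3f}")
```

In this toy graph both shoe stores receive a single link, yet the store linked from the heavily linked portal ends up with the higher score, which is the "more important votes of confidence" effect in miniature.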
The inclusion of peer review in the corporate culture of Google need not have come directly from university life. It could have just as easily come from another field that shares a common ancestor with Google: the Free and Open Source software world. Applications that have emerged from widespread, multi-author, collaborative environments have reshaped every element of the information creation and dissemination process. Almost all email systems, most Web servers, and an increasing number of Web browsers and computer operating systems were built without proprietary claims or controls. Free and Open Source software projects and innovators have promoted an ideology of open flows, constant peer review, and general freedom within a commercial structure that allows for remuneration for services rendered rather than computer code delivered. The fact that many of the early innovators of Free and Open Source software emerged from academia as well explains the ideological continuity among academic computer science departments, many profitable software firms, powerful amateur communities that build and maintain the Internet and the World Wide Web, and Google itself.
The Googlization of Students
Paradoxically, the very reliance on the principles of peer review within Google's culture and in its PageRank algorithm has undermined an appreciation for distinctions among information sources - at least among university students. According to a summary of two user studies conducted among students in the United Kingdom, commercial Internet search services dominate students' information-seeking strategies. The studies found that 45 percent of students choose Google as their prime search technology. The university library catalogue was the first choice of only 10 percent of students. Students reported that "ease of use" was their chief justification for choosing a Web search engine over more stable and refined search technologies. But they also expressed satisfaction with the results of the searches done with Google and other major search engines. The studies' results are not surprising. But one particular conclusion should trouble anyone concerned about the influence of Google on the information skills of university students: "Students' use of [search engines] now influences their perception and expectations of other electronic resources." In other words, if higher-quality search resources and collections do not replicate the reductive simplicity and cleanliness of Google's interface, they are unlikely to attract students in the first place and are sure to frustrate those students who do stumble upon them.
A relatively early study from 2002 conducted for the Pew Internet and American Life project found that "Nearly three-quarters (73%) of college students said they use the Internet more than the library, while only 9% said they use the library more than the Internet for information searching." This is a confusing way to phrase and frame the question, however, because even at the turn of the century most academic libraries offered online access to library resources (especially journals) via "the Internet." So it sets up a false distinction. Since 2004, in fact, many libraries have offered direct links from Google Scholar to their library collections to facilitate access when connected to a university network. So the notions of "library" and "Internet" have merged significantly for university students in the United States.
The shift toward Google as a first and last stop in research may not be as universal as we assume. A contrasting set of results came from a study of student research behavior at St. Mary's College in California. This study, published in 2007, showed that "A majority of students began their research by consulting course readings or the library's Web site for online access to scholarly journals. To a lesser extent, students used Yahoo!, Google, and Wikipedia as first steps." In addition, the study found that students considered bibliographies and other aggregated or subject-based research resources the most fruitful places to start. Overall, students at St. Mary's were significantly challenged by research assignments, and considered themselves frustrated by unclear expectations and an inability to discriminate among sources for quality and relevance. "A majority of students were not as reliant on search engines, as prior research studies have suggested," wrote the study's author, Alison Head. "Only about one in 10 students in our survey reported using Yahoo! or Google first when conducting research. Only two in 10 students in our survey used search engines as a second step." What's clear from all of these studies is that students need a tremendous amount of guidance through the information ecosystem and universities are not yet providing them the tools. Whether students start from course materials, Wikipedia, or Google, they need to know where to go next and why.
In her substantial argument for better information literacy, The University of Google: Education in the (Post) Information Age, Tara Brabazon of the University of Brighton (UK) offers some stories of her students' research habits. "Google, and its naturalized mode of searching, encourages bad behavior," she writes. Brabazon explains that the seductive power of Google - its perceived comprehensiveness and authoritativeness - fools students into thinking that a clumsily crafted text search that yields a healthy number of results qualifies as sufficient research. Even if Google links students to millions of documents heretofore inaccessible, it does nothing to teach them how to use the information they discover or even how to distinguish the true from the false, the dependable from the sketchy, and the polemical from the analytical. Because simple Web searches favor simple (and well-established) Web sites, students are unlikely to discover peer-reviewed scholarship unless they actively click over to the obscure Google Scholar service. And even then, they must hope that they have the institutional affiliation to acquire the articles they find. Brabazon criticizes these practices as an expression of a particular form of literacy - "operational literacy," which encourages students to be "code breakers" of complex, multimedia works - that fails to consider other important modes of literacy such as "critical literacy," or the ability to judge and distinguish among pieces of information and assemble them as new coherent works. Brabazon concludes that universities should not embrace the ideology of "access" and "findability" so uncritically, but should supplement the ubiquitous power of Google with curricular changes that emphasize the skills of critical literacy. "Critical literacy remains an intervention, signaling more than a decoding of text or a compliant reading of an ideologue's rantings," Brabazon writes. "The aim is to create cycles of reflection."
The production of sound arguments, interpretations, and analyses has become more of a challenge in the age of constant connectivity and information torrents. There is no reason to believe that Google will recede in importance in students' lives. Nor is there any reason to celebrate Google's pervasive influence as an unadulterated boon to the process of learning. There is much work to be done both to understand what this new information menu offers students (and the rest of us) and to generate more effective strategies for living better in the new environment.
The Googlization of Scholarship
Google Scholar is an interesting side project for the company. Released in 2004, it serves as a broad but shallow access point to a range of academic work. Google convinced hundreds of suppliers of electronic scholarly resources to open their indexes up to Google's "spiders," so the articles could be scanned, copied, and included in Google's index. The publishers benefit from enhanced exposure of their articles to reading communities beyond academia (and within academia, where some institutions lack contracted access to the home-grown search engines). Google Scholar does something that no other search engine of academic resources does: It offers links to works in areas as diverse as materials science, biophysics, computer science, law, literature, and library science as results of the same keyword search (for instance, "Vaidhyanathan," as there are Vaidhyanathans publishing in all of these areas). However, it has been constructed with Google's usual high level of opacity and without serious consideration of the needs and opinions of academic librarians. The major criticisms that echo from the library community include the lack of transparency about how the engine ranks and sorts works, the fact that collections are uneven and results undependable, and the fact that the search interface lacks the granular detail that librarians and scholars often demand to find the precise article they need. As with most of Google's services, the greatest and most interesting strengths of the service - its breadth of coverage and ease of use - generate its greatest flaws - lack of depth and precision. So the service is clearly a boon to students and lay researchers but of limited utility to scholars.
One study of Google Scholar's collection and service discovered that the service was almost a full year behind in indexing works published in the leading PubMed collection and concluded that "no serious researcher interested in current medical information or practice excellence should rely on Google Scholar for up-to-date information." Because North American publishers have been most aggressive at including their works within Google Scholar (or, perhaps, because Google has been most aggressive at attracting North American publishers), many works in languages other than English fail to show up on the first few pages of Google Scholar searches. German literature and social science work, for instance, suffers greatly if one uses Google Scholar as the chief research tool.
As more journals move online, research and citation behavior changes as well. A study published in Science in 2008 demonstrated that as more journals came online between 1998 and 2005, scientific literature as a whole cited fewer and newer sources. In other words, forcing scientists to peruse bound volumes of old journals encouraged serendipity and a deeper acknowledgement of long-term debates within fields. With online searching, researchers are more likely to echo the prevailing consensus and to narrow the imagination on which research relies.
Google's enhancement of this phenomenon only intensifies the problem. The mystery of why one particular paper should appear above another paper in Google Scholar searches does not help. Google's "about Google Scholar" site explains that "Google Scholar aims to sort articles the way researchers do, weighing the full text of each article, the author, the publication in which the article appears, and how often the piece has been cited in other scholarly literature. The most relevant results will always appear on the first page." This fails to explain much. First, the principle at work certainly favors science and technology works over those in the social sciences and humanities, as the lattice of article citations makes up a more solid structure within the sciences than it does in the humanities (where much of the most influential work appears in books). Second, citation counts do not indicate absolute value, even in the sciences. A high number of citations might indicate that an article stands as prevailing wisdom or consensus within a field, and thus serves as foundational. Or, just as likely, a high citation count might indicate that an article is suspect and open to question. These are not equal values. Ranking such articles as if their citations were the result of the same intellectual process is troublesome. In addition, because Google Scholar operates by the use of full-text indexing and searching, results are likely to come from divergent collections and fields. A search for "whale oil" could yield results from agriculture journals, ecology journals, or an article about Herman Melville's Moby-Dick. A search for "human genome project" yields a large number of meta-scholarly articles: works that describe or analyze the human genome project from a variety of perspectives. They are all from major figures in the field, such as James Watson and Francis Collins. But the first page of results does not yield articles of actual science done using the human genome database. For that, one must search a specific term or gene.
While studies comparing Google Scholar to other commercially available search indexes for scholarly material consistently demonstrate the inadequacies of Google Scholar, it's clear that Google is not going anywhere but front-and-center among both faculty and students. This makes information assessment skills more important than ever. As Google Scholar ranks serve as proxies for citation analysis to assess the impact of scholars on their fields, Google might have a direct effect on the future employment of tenure-track researchers. Google Scholar therefore makes the role of the librarian central to and more visible within every part of the academic mission. Paradoxically, the more we use Google Scholar, the more we need librarians to help us stumble through the fog of data and scholarship that it offers.
The Googlization of Book Learning
Google Scholar is a clever experiment and a value-added feature that has helped democratize specialized information for a broader readership. But Google Book Search is a monster of a project that has radically altered the roles and scopes of both publishing and librarianship. Since 2004, Google has been scanning in millions of volumes of books from academic libraries around the world. To do this, Google chose to make copies of copyright-protected books without the permission of copyright holders - a potentially massive number of cases of willful infringement. In late 2008 Google reached a settlement in lawsuits brought by the Association of American Publishers and the Authors Guild. The terms of that settlement not only absolved Google of the potential liability for infringement but also gave the company a virtual monopoly on the electronic distribution of many millions of out-of-print yet in-copyright books from the 20th century.
The most important fallout of the Google Book Search settlement is that it leaves Google as the only viable player left in the book-scanning game. Since the 1980s academic libraries have been participating in ad-hoc efforts to scan, preserve, and open up their collections of books to a wider readership. Lately, Microsoft and Yahoo had been helping a not-for-profit venture, the Open Content Alliance, scan books from a small number of academic libraries (although Microsoft withdrew support in 2008). Since Google entered the race in 2004, with a financial commitment exceeded only by its ambition, it has been hard, if not impossible, to argue for a diverse array of participants. After the settlement, in which Google effectively set the price for royalty distribution to copyright holders for books downloaded from the system, Google stands alone.
The specific effects on universities are twofold: First, there is now no legal risk in permitting Google to scan in-copyright books in their collections; second, because Google has pledged to place designated "Google Book Search" terminals in public and university libraries across the United States, many libraries that never had the funds or space to build large collections of works can now enjoy access in electronic form. But the secondary effects of these changes could be significant as well. Many libraries could choose to remove books from their collections if they consider the electronic access via Google to be sufficient. Of greater concern is the fact that every library in the United States will soon have an electronic book vending machine, run by and for Google, operating in the midst of an otherwise non-commercial space. Every library will soon be a bookstore. The steady commercialization of academia is not a new story. But it remains a troubling one. The invitation of Google into the republican space of the library directly challenges the core purpose of a library: to act as an "information commons" for the community in which it operates.
The Googlization of Research
Google's major advantage over almost every other information firm in the world is the massive server space and computing power at the company's disposal. The scale of Google's infrastructure is a company secret. But its willingness to give each Gmail user two gigabytes of server space to store email archives is some indication that Google's server farms are formidable and likely of historic proportions. Google's remote storage space is large enough and its computing power fast enough to host and contribute to some massive research projects done in conjunction with academics. In October 2007 Google joined with IBM to establish a server farm devoted to research projects that demand both huge data sets and fast processors - expensive ventures for universities themselves. The computer science department at the University of Washington was the first to sign up to use the Google-IBM resources. Washington was soon joined by Carnegie Mellon University, the Massachusetts Institute of Technology, Stanford University, the University of California at Berkeley, and the University of Maryland. Researchers at Washington are using servers equipped with suites of open-source software to run complex analyses of Web-posting spam and geographical tagging. In March 2008, the National Science Foundation agreed to vet research proposals for projects that would employ the Google-IBM service.
The benefits to researchers and their universities are clear: no single university can afford the servers and processors it would take to do this sort of scientific analysis. By computing in "the cloud," a set of distant servers accessible through inexpensive personal computers connected through Internet-like networks, researchers from around the globe can collaborate and coordinate their efforts. More big science can be done faster and cheaper if Google, IBM, and universities can combine their brain and computing power.
The benefits to Google and IBM are clear as well: many of the computational problems academic researchers hope to solve happen to be the ones that these two companies would like to solve. This project gives them easy access to the body of knowledge researchers generate while using these systems. In keeping with Google's traditions and values, nothing about this project seems to indicate that Google claims exclusive rights to work done with its help. However, university officials who negotiate contracts with Google often must sign non-disclosure agreements to ensure that Google's competitors do not have too clear a picture of what the company is doing with its academic partnerships.
Computing in "the cloud" is both radically empowering and quite concerning. One downside is the tangle of rights claims that a widespread collaboration among individual researchers, university technology-transfer offices, and two or more major computer companies can generate. Such a confusing, complicated set of claims not only risks years of litigation among the parties but could attract significant anti-trust scrutiny as well.
Cloud computing and massive, distributed computation have already been declared the next great intellectual revolution by the magazine that generates such hyperbole with impressive regularity. Wired magazine editor Chris Anderson wrote in June 2008 that the ability to collect and analyze almost unimaginable collections of data renders the standard scientific process of hypothesis-data collection-testing-revision-publication-revision almost obsolete. Anderson wrote:
Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database. Now Google and like-minded companies are sifting through the most measured age in history, treating this massive corpus as a laboratory of the human condition. They are the children of the Petabyte Age. The Petabyte Age is different because more is different. Kilobytes were stored on floppy disks. Megabytes were stored on hard disks. Terabytes were stored in disk arrays. Petabytes are stored in the cloud. As we moved along that progression, we went from the folder analogy to the file cabinet analogy to the library analogy to -- well, at petabytes we ran out of organizational analogies. At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising -- it just assumed that better data, with better analytical tools, would win the day. And Google was right.
Needless to say, Anderson's techno-fundamentalist hyperbole betrays a vested interest he has in the narrative of the revolutionary and transformational power of computing. But here Anderson has stepped out even beyond the pop sociology and economics that usually dominate the magazine. Anderson claims that "correlation is enough." In other words, the entire process of generating scientific (or, for that matter, socially scientific) theories and modestly limiting claims to correlation sans causation is obsolete and quaint because, Anderson argues, given enough data and enough computing power, you can draw strong enough correlations to confidently claim you have discovered knowledge.
The risk here is more than one of intellectual hubris. The academy does not have a dearth of that. Given the passions and promotion of such computational models for science of all types, we run the risk of diverting precious research funding and initiatives away from the hard, expensive, plodding laboratory science that has worked so brilliantly for three centuries. Already, major university administrations are pushing to shift resources away from lab space and toward server space. The knowledge generated by massive servers and powerful computers will certainly be significant and valuable - potentially revolutionary. But it should not come at the expense of tried-and-true methods of discovery that lack the sexiness of support from Google and an endorsement from Wired.
How Should Universities Manage Google?
So far, Google has been calling the shots. Every few months, it seems, the company approaches universities with a new initiative that promises stunning returns for the academic equivalent of "no money down." Since 2006 Google has been competing with Microsoft and Yahoo to take over university email services, thus locking in students as lifetime email users and allowing the company to mine the content of emails for clues about consumer preferences and techniques for targeting advertisements. The prospect of relieving universities of the cost of running email servers, and of the need to limit their users' storage space to a few megabytes, is almost too attractive to pass up. But we should be wary. We should not let one rich, powerful company set the research and spending agenda for the academy at large simply because we - unlike Google - are strapped for cash. The long-term costs and benefits should dominate the conversation. We should not jump at the promise of quick returns or even quick relief. The story of Google's relationship with universities is not unlike the tragedy of Oedipus Rex. Since its birth Google, overflowing with pride, has been seducing its alma mater - the American Academy. If Google is the lens through which we see the world, we all might be cursed to wander the Earth, blinded by ambition.
Battelle, John. The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture. New York: Portfolio, 2005.
Benkler, Yochai. The Wealth of Networks: How Social Production Transforms Markets and Freedom. New Haven, CT: Yale University Press, 2006.
Brabazon, Tara. The University of Google: Education in the (Post) Information Age. Aldershot, Hampshire, England; Burlington, VT: Ashgate, 2007.
Bracha, Oren. "Standing Copyright Law on Its Head? The Googlization of Everything and the Many Faces of Property." Texas Law Review 85 (2007).
Stross, Randall E. Planet Google: One Company's Audacious Plan to Organize Everything We Know. 1st Free Press hbk. ed. New York: Free Press, 2008.
Weber, Steve. The Success of Open Source. Cambridge, MA: Harvard University Press, 2004.