«  Google, MS, and the fight for White Space in the Spectrum Main "... it's a Google."  »


Scan and Release: Digitizing the Boston Public Library | Everything is Miscellaneous



... Of this abundance, the digital group has so far scanned about 24,000 objects. When I point out to Maura Marx, the group’s head, that, given the library’s estimate that it has maybe 23 million objects, she’s looking at a 2,000 year project, she tells me that they’re just getting started. They’re going to bulk up, maybe do some offsite digitizing, and begin to make some serious progress. When I ask Thomas Blake, who does the actual digitizing, how he decides which stuff to do, he laughs a little and says, “What I think is cool.” And, since the public has an appetite for “choochoo trains, maps and postcards,” he’s done a bunch of them. The BPL is, after all, a public institution that both serves the public and relies upon the public’s support. ...


... The scanning is slow because it’s one guy who’s doing a careful job. The camera has a 22 megapixel chip, but they’ve been known to digitize at 88mps, creating files that are half a gig in size. Tom likes saving the RAW files to avoid unnecessary data loss. You never know what’s going to be useful. For example, he had been scanning postcards at 300 dpi, but a curator pointed out that then you couldn’t see the dotscreen pattern, which might be of interest to someone. So now Tom scans them at 600dpi. Overall, they have about 1.5 terabytes of stored images.

The metadata is a whole ‘nother issue. Chrissy Watkins, who has been there for four days — she had been at the JFK Presidential Library — is working on it. For now, Tom gives every item an arbitrary and unique ID number, the key piece of any metadata scheme. But the BPL is facing the inevitable conundrum: Maximize the metadata but slow the process, or do grave less metadata but go at a far faster clip. The group seems to be leaning toward the latter, which makes sense to me. They’ve been using what Tom calls the “Curator Core,” a reference to the Dublin Core metadata standard for books. Trying to capture everything that might be useful is a task beyond daunting. For example, Michael Klein points to “fore-edge paintings,” paintings done on the edges of a book that are revealed when you fan the book slightly. Does the BPL have to come up with a standard that includes whether you fan the book to the left or right? There are so many different types of objects that building a standard or an ontology that captures them all would absorb all of the team’s time. (”The special case is not as special as you’d think,” says Michael.) Instead, they need to scan scan scan, and capture some reasonable set of metadata, to which more metadata can accrete.

“We’re going from collect and hide to scan and release,” says Tom. And in so doing, they’re going not just from no value to some value. They are in fact radically multiplying the value of the Boston Public Library’s holdings. And as we the recipients of this gift incorporate the images, adding information to them, and contextualizing them, we are further enriching the holdings, far beyond what any small group, no matter how intrepid, could manage.

David offers some amazing details and photos in this post.

arrow

Post a comment

We had to crank up the spam filter so it may take a little while to appear. Thanks.

(If you haven't left a comment here before, you may need to be approved by the site owner before your comment will appear. Until then, it won't appear on the entry. Thanks for waiting.)

A book in progress by

Siva Vaidhyanathan

Siva Vaidhyanathan

This blog, the result of a collaboration between myself and the Institute for the Future of the Book, is dedicated to exploring the process of writing a critical interpretation of the actions and intentions behind the cultural behemoth that is Google, Inc. The book will answer three key questions: What does the world look like through the lens of Google?; How is Google's ubiquity affecting the production and dissemination of knowledge?; and how has the corporation altered the rules and practices that govern other companies, institutions, and states? [more]

» Send links, questions and ideas:
siva [at] googlizationofeverything [dot] com

» To reach me for a press query, please write to SIVAMEDIA ut POBOX dut COM

» To reach me for a speaking invitation, please write to SIVASPEAK ut POBOX dut COM

» Visit my main blog: SIVACRACY.NET

» More about me

Topics

Like the Mind of God (57 posts)

All the World's Information (75 posts)

What If Big Ads Don't Work (20 posts)

Don't Be Evil (16 posts)

Is Google a Library? (84 posts)

Challenging Big Media (46 posts)

The Dossier (49 posts)

Global Google (26 posts)

Google Earth (6 posts)

A Public Utility? (37 posts)

About this Book (28 posts)

RSS Feed icon  RSS Feed


Powered by Movable Type 3.35