Scan and Release: Digitizing the Boston Public Library | Everything is Miscellaneous
... Of this abundance, the digital group has so far scanned about 24,000 objects. When I point out to Maura Marx, the group’s head, that, given the library’s estimate that it has maybe 23 million objects, she’s looking at a 2,000-year project, she tells me that they’re just getting started. They’re going to bulk up, maybe do some offsite digitizing, and begin to make some serious progress. When I ask Thomas Blake, who does the actual digitizing, how he decides which stuff to do, he laughs a little and says, “What I think is cool.” And, since the public has an appetite for “choochoo trains, maps and postcards,” he’s done a bunch of them. The BPL is, after all, a public institution that both serves the public and relies upon the public’s support. ...
... The scanning is slow because it’s one guy who’s doing a careful job. The camera has a 22-megapixel chip, but they’ve been known to digitize at 88 megapixels, creating files that are half a gig in size. Tom likes saving the RAW files to avoid unnecessary data loss. You never know what’s going to be useful. For example, he had been scanning postcards at 300 dpi, but a curator pointed out that then you couldn’t see the dotscreen pattern, which might be of interest to someone. So now Tom scans them at 600 dpi. Overall, they have about 1.5 terabytes of stored images.
The metadata is a whole ’nother issue. Chrissy Watkins, who has been there for four days (she had been at the JFK Presidential Library), is working on it. For now, Tom gives every item an arbitrary and unique ID number, the key piece of any metadata scheme. But the BPL is facing the inevitable conundrum: Maximize the metadata but slow the process, or do much less metadata but go at a far faster clip. The group seems to be leaning toward the latter, which makes sense to me. They’ve been using what Tom calls the “Curator Core,” a reference to the Dublin Core metadata standard for books. Trying to capture everything that might be useful is a task beyond daunting. For example, Michael Klein points to “fore-edge paintings,” paintings done on the edges of a book that are revealed when you fan the book slightly. Does the BPL have to come up with a standard that includes whether you fan the book to the left or right? There are so many different types of objects that building a standard or an ontology that captures them all would absorb all of the team’s time. (“The special case is not as special as you’d think,” says Michael.) Instead, they need to scan scan scan, and capture some reasonable set of metadata, to which more metadata can accrete.
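One way to picture “a unique ID now, more metadata later” is a record that requires only the ID and lets everything else be added over time. What follows is a rough sketch, not the BPL’s actual system: the `ScanRecord` class, the `new_record` helper, and the use of a random UUID as the arbitrary ID are all assumptions made for illustration. The element names `type`, `format`, and `description` are borrowed from Dublin Core.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ScanRecord:
    """Hypothetical record: a required ID plus whatever metadata exists so far."""
    item_id: str
    # Every element is optional and may hold several values.
    metadata: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, element: str, value: str) -> None:
        """Attach one more value to an element, skipping exact duplicates."""
        values = self.metadata.setdefault(element, [])
        if value not in values:
            values.append(value)


def new_record() -> ScanRecord:
    """Create a record with an arbitrary, unique ID (a random UUID is assumed here)."""
    return ScanRecord(item_id=uuid.uuid4().hex)


# Scan first, describe later: the record is valid with no metadata at all.
postcard = new_record()
postcard.add("type", "postcard")
postcard.add("format", "600 dpi scan")
# Added later by someone else, without changing any schema.
postcard.add("description", "dotscreen pattern visible")
```

Because no field other than the ID is mandatory, an oddity like the fanning direction of a fore-edge painting could be recorded as one more value on the items that need it, rather than being designed into a standard up front.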
“We’re going from collect and hide to scan and release,” says Tom. And in so doing, they’re going not just from no value to some value. They are in fact radically multiplying the value of the Boston Public Library’s holdings. And as we the recipients of this gift incorporate the images, adding information to them, and contextualizing them, we are further enriching the holdings, far beyond what any small group, no matter how intrepid, could manage.
David offers some amazing details and photos in this post.



