Everything is Miscellaneous | [2b2k][everythingismisc]“Big data for books”: Harvard puts metadata for 12M library items into the public domain

April 24th, 2012 by davidw

(Here’s a version of the text of a submission I just made to BoingBoing through their “Submitterator.”)

Harvard University has today put into the public domain (CC0) full bibliographic information about virtually all the 12M works in its 73 libraries. This is (I believe) the largest and most comprehensive such contribution. The metadata, in the standard MARC21 format, is available for bulk download from Harvard. The University also provided the data to the Digital Public Library of America’s prototype platform for programmatic access via an API. The aim is to make rich data about this cultural heritage openly available to the Web ecosystem so that developers can innovate, and so that other sites can draw upon it.

This is part of Harvard’s new Open Metadata policy, which is VERY COOL.

Speaking for myself (see disclosure), I think this is a big deal. Library metadata has been jammed up by licenses and fear. Not only does this make accessible the metadata for a very high percentage of the most consulted library items, but I hope it will also help open the floodgates.

(Disclosures: 1. I work in the Harvard Library and have been a very minor player in this process. The credit goes to the Harvard Library’s leaders and the Office of Scholarly Communication, who made this happen. Also: Robin Wendler. (next day:) Also, John Palfrey, who initiated this entire thing. 2. I am the interim head of the DPLA prototype platform development team. So, yeah, I’m conflicted out the wazoo on this. But my wazoo and all the rest of me are very very happy today.)

Finally, note that Harvard asks that you respect community norms, including attributing the source of the metadata as appropriate. This holds as well for the data that comes from the OCLC, which is a valuable part of this collection.
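For developers wondering what working with the bulk MARC21 download might look like, here is a minimal sketch in Python using only the standard library. It reads the generic binary MARC21 (ISO 2709) layout: a 24-byte leader, a directory of 12-byte entries, then the field data. The function names are my own, and the sketch assumes the records are UTF-8 encoded, which I have not verified for this particular dataset. A maintained library would be the safer choice for real work.

```python
FIELD_END = b"\x1e"   # terminates the directory and each field
SUBFIELD = b"\x1f"    # precedes each subfield code


def iter_records(stream):
    """Yield raw records from a binary stream of concatenated MARC21 records."""
    while True:
        head = stream.read(5)              # first 5 bytes: total record length
        if len(head) < 5:
            return
        length = int(head.decode("ascii"))
        yield head + stream.read(length - 5)


def parse_record(raw):
    """Return {tag: [value, ...]} for one raw MARC21 record.

    Control fields (00X) become strings; data fields become lists of
    (subfield_code, text) tuples. Indicators are skipped for brevity.
    Assumption: text is UTF-8 (undecodable bytes are replaced).
    """
    base = int(raw[12:17].decode("ascii"))  # leader 12-16: where field data starts
    directory = raw[24:base - 1]            # drop the directory's terminator byte
    fields = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # The stated length includes the trailing field terminator.
        data = raw[base + start: base + start + length - 1]
        if tag.startswith("00"):
            value = data.decode("utf-8", errors="replace")
        else:
            chunks = data[2:].split(SUBFIELD)[1:]   # data[:2] holds the indicators
            value = [
                (c[:1].decode("ascii"), c[1:].decode("utf-8", errors="replace"))
                for c in chunks
            ]
        fields.setdefault(tag, []).append(value)
    return fields
```

To try it, open a downloaded file in binary mode (say, a hypothetical `harvard.mrc`), loop over `iter_records(f)`, and look at `parse_record(raw).get("245")`, which is the tag MARC21 uses for the title statement.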

One Response to “[2b2k][everythingismisc]“Big data for books”: Harvard puts metadata for 12M library items into the public domain”

1. Trident University, on 08 Dec 2012 at 9:24 am

It’s incredible that everyone can access Harvard’s library available to the public. That is a lot of knowledge and information for anyone who is seeking knowledge.

http://www.trident.edu/why-trident
