Free your metadata
Google Refine: Cleans up Metadata
General Introduction to metadata:
Metadata describes objects
Such as a narrative description of the content, tags, keywords (extraction can be automated with OpenCalais)
Cultural Heritage Institutions:
MARC (libraries), VRA/CDWA (museums), EAD (archives) …
cf. on standards: Elings/Waibel 2008
Ever since the web got popular, people have tried to create metadata to enhance search and retrieval.
But due to economic pressure –> spamming
Recent initiative –> schema.org (Google, Microsoft, etc.)
Why should you care about metadata?
–> Get more value out of your work
–> Let third parties re-use your metadata
–> Hook up with the linked data cloud
Quality is fundamentally relative
Meta refers to the notion of change
Hermeneutical process allowing a deeper understanding of your metadata!
Powerhouse Museum Collection
How Google Refine works:
It is software that is free to download and install.
You can get a tabular view of the metadata: object ID, object number. You can apply facets and get an overview of the most used concepts.
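To make the idea of a facet concrete: a text facet is essentially a count of how often each distinct value occurs in one column. The following is a minimal sketch in Python, not Google Refine's own code; the sample records, the column name "category" and the function name text_facet are made up for illustration.

```python
from collections import Counter

# Hypothetical sample rows; the column names are illustrative only.
records = [
    {"object_id": 1, "category": "Ceramics"},
    {"object_id": 2, "category": "Textiles"},
    {"object_id": 3, "category": "Ceramics"},
    {"object_id": 4, "category": "Photographs"},
    {"object_id": 5, "category": "Ceramics"},
    {"object_id": 6, "category": "Textiles"},
]


def text_facet(rows, column):
    """Count how often each value occurs in one column, most frequent first."""
    counts = Counter(row[column] for row in rows if row.get(column))
    return counts.most_common()


facet = text_facet(records, "category")
for value, count in facet:
    print(f"{value}: {count}")
```

Reading the list from the top gives the overview of the most used concepts; the long tail at the bottom is where typos and one-off variants tend to show up.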
The filters let you group values that look different to the computer but have the same meaning.
Example: affiliation names in publications. You can standardize variations of a value, e.g. “université libre de Brussels”.
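One common way to find such variants is key collision: each value is reduced to a normalized key, and values sharing a key are proposed as one cluster. The sketch below is a simplified approximation of that idea, not the tool's actual implementation; the function names fingerprint and cluster and the sample affiliations are assumptions for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Normalized key: lowercase, no accents or punctuation, unique tokens sorted."""
    text = value.strip().lower()
    # Decompose accented characters and drop the combining marks (é -> e).
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace ASCII punctuation with spaces so "Bruxelles," matches "Bruxelles".
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values by fingerprint; keep groups with more than one distinct variant."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical sample data.
affiliations = [
    "Université libre de Bruxelles",
    "universite libre de bruxelles",
    "Université Libre de Bruxelles ",
    "Bruxelles, Université libre de",
    "Ghent University",
]

clusters = cluster(affiliations)
for group in clusters:
    print(group)
```

A key of this kind only catches differences in case, accents, punctuation and word order. A translated variant such as "Brussels" for "Bruxelles" produces a different key and still has to be merged by hand or with another method.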
The RDF extension for Google Refine helps you standardize your vocabulary with the help of large, available controlled vocabularies (LCSH, Rameau, etc.)
Alignment projects between large vocabularies: MACS, STITCH
Using linked data allows you to enrich your own data.
The Powerhouse Museum database in Sydney aims to create an automatic match between LCSH and Rameau, supported by the BnF.
Huge dispersion of the keywords used to describe resources in the humanities.
Tables from Google Refine can be exported to Excel, where power distribution graphs can then be created.
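The table behind such a graph can also be computed directly from an exported keyword column. The sketch below assumes that the graphs meant here are rank-frequency plots (keywords ordered from most to least used); the keyword list and the function name rank_frequency are invented for illustration.

```python
from collections import Counter

# Hypothetical keyword column, one entry per keyword assignment.
keywords = ["history"] * 5 + ["war"] * 2 + ["trade", "banking", "treaty"]


def rank_frequency(values):
    """Return (rank, keyword, count, cumulative share) rows, most used first."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for rank, (keyword, count) in enumerate(counts.most_common(), start=1):
        cumulative += count
        rows.append((rank, keyword, count, cumulative / total))
    return rows


rows = rank_frequency(keywords)
for rank, keyword, count, share in rows:
    print(f"{rank}\t{keyword}\t{count}\t{share:.0%}")

# Keywords used only once make up the long tail of the distribution.
singletons = [keyword for _, keyword, count, _ in rows if count == 1]
```

Plotting rank against count, usually on logarithmic axes, makes the dispersion visible: a few keywords cover most assignments, while many are used only once.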
Demonstration of the value of rich metadata in dodis.ch (database of Diplomatic Documents of Switzerland = Diplomatische Dokumente der Schweiz)
search example: Bankgeheimnis
This is a database with rich, high-quality metadata.
Problems with people’s names in foreign languages.
Open metadata means that the metadata can be downloaded from the databases, then re-sorted and analyzed by other applications.
What is the use of replicas of databases?
Crowdsourcing: a way to enrich metadata?
You have to open up collections and databases to get people interacting with them.
LONSEA – Searching the Globe through the lenses of the League of Nations