Wednesday, 21 November 2012

Humanities Informatics #ndf2012

Humanities Informatics: <!-- Insert the eResearcher + Information Specialist Here -->
Ingrid Mason (@1n9r1d), Intersect Australia
The Humanities Networked Infrastructure project is a virtual laboratory project funded by the NeCTAR programme in Australia. The project has several significant scholarly humanities datasets to bring together and map across. The immediate goal is to enable researchers to explore and interpret the commonalities.
The initial design challenge is to select description schema and use linked data and controlled vocabularies for data to align the data. This approach tests the assumption that configuring and building on the knowledge of available schema, methods and datasets, will provide a standards based and curated foundation layer to support research requirements.
This 'prefabricated' approach has been the basis by which the digital humanities and GLAM sectors have provided access to data. Observing how researchers shape and use this prefabricated environment will inform the value of that approach and the architectural modelling, and inform next steps to building infrastructure where the 'researcher query' is the lens that defines the schema.

Anonymous quote: "Gah semantic web is frying my brain!"

Intersect Australia is an eResearch org. Working on a virtual lab project in the humanities informatics field. Talking and dreaming and living data... Have become conversant in RDF; even taking steps towards ontology development. Talking about linked data, data as graph. Interested in the overlap between humanities informatics and GLAM digital cultural heritage.

Wants to provoke thinking - datasharing across GLAMs and scholarly datasets? Who has authority, truth, encoding consensus or contradiction? Doing something with HuNI data? etc

Digital Humanities sits within eResearch (which has been dominated by science). HuNI (@hunivl - Humanities Networked Infrastructure) is a distributed project that wants to explore commonalities/divergences in data. Bringing together datasets means dealing with multiple standards, so they need to build an ontology. User-centred design.

Assumptions - they're "prefabricating" but talking to researchers all the way through. Building foundation layer. Fascinated by idea of a researcher query. Work to help researchers ask the questions they need.

Project to integrate 28 cultural datasets (using linked open data) into a virtual laboratory. Want to break down barriers between disciplines. Want it to be available to all but licensing comes into it.

Data - AusStage, bonza, CAARP, AustLit, CircusOz, Australian Dictionary of Biography, PARADISEC, Australian Women's Register...
Tools - eg Omeka, Neatline, LORE, OCCAMS, Heurist. Interoperability will be key.
ExSite9 - tool to help researchers who collect multimedia data (image, audio, video, GPS) in the field along with their own notes, which would otherwise go into a notebook. Needs to work without wifi available.
Outputs - data storage "Corbicula", aggregation, linked data service, RESTful web services, semantic mediation, discovery service, tool provision, collection level descriptions.

Definition of informatics from Wikipedia: "studying how to design a system that delivers the right information, to the right person in the right place and time, in the right way"

(Skimming through - slides will be online.)

"Data" an ineffective word to describe all the kinds of data there are.

Linked Data on Wikipedia.

RDF - resource description framework. Statements known as "triples" - subject, predicate, object. In different formats eg RDF/XML, RDF/JSON
SPARQL - query language
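
To make the triple idea concrete, here is a minimal sketch in plain Python that uses tuples rather than a real RDF library. The `ex:` identifiers are invented for illustration, and `match` is a hypothetical helper that mimics a single SPARQL-style triple pattern (with `None` as a wildcard); it is not SPARQL itself.

```python
# Each statement is a (subject, predicate, object) triple.
# The "ex:" identifiers are made up for this sketch.
triples = [
    ("ex:Ingrid", "ex:isA", "ex:Kiwi"),
    ("ex:Conal", "ex:isA", "ex:Kiwi"),
    ("ex:Ingrid", "ex:worksFor", "ex:IntersectAustralia"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# "Who is a Kiwi?" is roughly: SELECT ?who WHERE { ?who ex:isA ex:Kiwi }
kiwis = [subject for (subject, _, _) in match(triples, p="ex:isA", o="ex:Kiwi")]
print(kiwis)
```

The same data could be written out as RDF/XML or RDF/JSON; the tuple form just shows the subject, predicate, object shape that all of those formats share.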

"Ingrid is a Kiwi. Conal is a Kiwi. But what is a Kiwi?"

Ontologies have concepts, relations, instances, and axioms. A set of entities within a domain are related by a concept.
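
A rough sketch of those parts in plain Python, assuming for illustration that "Kiwi" means New Zealander. All names here are invented, and `types_of` is a hypothetical helper; a real ontology would be written in something like RDFS or OWL rather than dictionaries.

```python
# Concepts in the domain (invented names).
concepts = {"ex:Person", "ex:NewZealander"}

# Axiom: every NewZealander is a Person (a subclass relation between concepts).
subclass_of = {"ex:NewZealander": "ex:Person"}

# Instances, each asserted to belong to one concept.
instances = {"ex:Ingrid": "ex:NewZealander", "ex:Conal": "ex:NewZealander"}


def types_of(instance):
    """Return the asserted concept plus any concepts inferred from the axioms."""
    found = []
    concept = instances.get(instance)
    while concept is not None:
        found.append(concept)
        concept = subclass_of.get(concept)
    return found


print(types_of("ex:Ingrid"))
```

The point of the axiom is that nobody had to state that Ingrid is a Person; it follows from what "NewZealander" is defined to mean.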

Connections between people within Australian Biographies, and between a group of datasets.

  • Need to help researchers go from above the forest through the canopy into the trees and branches.
  • Unlock data, value in controlled vocabularies.