Post-Digital Humanities 2015 updates

Attending the Digital Humanities conference in Sydney this year was tremendously useful. I learned about several new DH linked open data projects that I might want to interface with in the future (or that are excellent models for the content that I ought to be producing on this site), and I also came away with a goal for next year: to have Visible Prices at a point where I can propose a pre-conference workshop for DH 2016 in Krakow. Right now, I anticipate the proposal looking a bit like the Jane-athon, but also (or alternatively) being a session in which, by working on and contributing to Visible Prices, participants can learn about working with linked open data for DH projects and take away a basic but solid understanding of:

  • how triples and graph databases work
  • major ontologies and vocabularies that DH projects would most likely utilize
  • what a specialized topic ontology looks like and involves
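
For anyone new to the first item, here is a minimal sketch of the idea in plain Python. A triple is a subject-predicate-object statement, and a graph is a set of them that can be queried by pattern. The `vp:` identifiers and the `match` helper are invented for illustration; they are not VP's actual vocabulary or code.

```python
# A tiny in-memory "graph": each triple is (subject, predicate, object).
# All "vp:" names below are hypothetical placeholders.
triples = {
    ("vp:price1", "vp:pricedObject", "vp:teapot"),
    ("vp:price1", "vp:amount", "3s. 6d."),
    ("vp:price1", "vp:source", "vp:advert42"),
    ("vp:price2", "vp:pricedObject", "vp:teapot"),
    ("vp:price2", "vp:amount", "5s."),
}


def match(pattern, store):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# "Which price records are about teapots?"
teapot_prices = match((None, "vp:pricedObject", "vp:teapot"), triples)
for subject, predicate, obj in teapot_prices:
    print(subject, predicate, obj)
```

A real graph database does the same kind of pattern matching at scale, with SPARQL as the query language.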

To that end, I’m pleased to note that Jon and I have an almost complete tool chain for VP. A Google Docs spreadsheet feeds into a BRAT site where a select group of users will be able to mark up the text (example; ETA: here’s a quick video of what that markup process looks like) in order to identify the object(s) being priced and the prices (sometimes there are many!). A script then transforms that markup into RDF and feeds it into a basic user interface where users will be able to query the data, add keywords, and help normalize the currency value (i.e., tell the computer that 3s. 6d. = 3 shillings 6 pence). More on both the keywords and the currency normalization soon, but they’re a vital component of this prototype, both for dealing with the complexity of how price expressions are written, and for expanding beyond Anglo-Saxon pounds to include other currencies.
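
To make the currency normalization concrete, here is a rough sketch in Python of turning an expression like "3s. 6d." into a single number of pence. It is not the VP script; the `to_pence` function and its regular expression are assumptions for illustration, and they only handle simple "£ / s. / d." forms (not notations such as "3/6", guineas, or fractions of a penny). It relies on the pre-decimal values of 12 pence to the shilling and 20 shillings to the pound.

```python
import re

PENCE_PER_SHILLING = 12
SHILLINGS_PER_POUND = 20

# Hypothetical pattern: optional pounds, shillings, and pence, in that order.
_PRICE = re.compile(
    r"^\s*(?:£\s*(?P<pounds>\d+)\.?\s*)?"
    r"(?:(?P<shillings>\d+)\s*s\.?\s*)?"
    r"(?:(?P<pence>\d+)\s*d\.?\s*)?$"
)


def to_pence(text):
    """Convert a simple pounds/shillings/pence expression to total pence."""
    m = _PRICE.match(text)
    if not m or not any(m.groupdict().values()):
        raise ValueError(f"unrecognized price expression: {text!r}")
    pounds = int(m.group("pounds") or 0)
    shillings = int(m.group("shillings") or 0)
    pence = int(m.group("pence") or 0)
    total_shillings = pounds * SHILLINGS_PER_POUND + shillings
    return total_shillings * PENCE_PER_SHILLING + pence


print(to_pence("3s. 6d."))     # 3 shillings + 6 pence = 42 pence
print(to_pence("£1 2s. 6d."))  # 22 shillings + 6 pence = 270 pence
```

Storing one normalized value alongside the original expression keeps the text as written while making prices comparable in queries.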

We’ll have several pre-baked SPARQL queries that people can run as they are or adapt; more experienced SPARQLers will of course be able to accomplish even more. I’ll also set up a page for sharing useful SPARQL queries, so that as people write them, others can use them.

Once we pass the initial testing of the set-up, I’ll link the Google spreadsheet to a survey that will allow outside users to contribute prices that they’ve found.

There will still be more work to do: for example, I only recently learned about PeriodO, a linked open data gazetteer of period assertions, and I absolutely plan to incorporate its data into VP. Thus far, I’ve kept it simple, relying only on the date of publication as a chronological marker. That isn’t ideal, however, and PeriodO looks like precisely what I need.

Another major area of work will be the task of acquiring data in bulk from large online archives. More on that later — but the coming tool chain will be a huge advance on either of the two prototypes I built previously, and I’m so pleased that after being on the back burner behind my dissertation, the Demystifying workshops, and then my teaching/consulting work here at the Sherman Centre, I’m finally able to bring VP to the foreground.


