February 2015 update: getting started with brat

Shortly after I wrote my last post here at the end of July, I was offered a post-doc at McMaster University, in the Sherman Centre for Digital Scholarship. Since then, I’ve started that position, and started working with Jon Crump, with funding from the EADH.

Here’s a tiny bit of what we’ve been up to so far:

1. Working on some sample data, and marking it up in the brat rapid annotation tool. Brat allows me to identify portions of text from the quotations (which are my basic raw data, since I’m not going to be encoding whole texts) as RDF subjects and objects, and connect them with predicates. The subjects, objects, and predicates are customized for Visible Prices, rather than predefined. (If you’re new to this project, and new to linked open data/RDF/semantic web stuff, then you might find my Scalar technical statement useful.)
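For anyone who'd like to see the subject/predicate/object idea in code, here is a minimal sketch in Python. It assumes brat's standoff (.ann) format, in which text-bound annotations (T lines) carry a label and character offsets, and relations (R lines) connect two of them. The sample sentence and the labels `Object`, `Price`, and `hasPrice` are invented for illustration and are not the actual Visible Prices ontology. The output is plain tuples, not serialized RDF.

```python
from typing import NamedTuple

# Hypothetical sample text and annotations, invented for illustration only.
TEXT = "A bottle of elixir, price 2s. 6d."
ANN = (
    "T1\tObject 2 18\tbottle of elixir\n"
    "T2\tPrice 26 33\t2s. 6d.\n"
    "R1\thasPrice Arg1:T1 Arg2:T2\n"
)


class Span(NamedTuple):
    label: str   # annotation type, e.g. "Object"
    start: int   # character offset where the span begins
    end: int     # character offset just past the end of the span
    text: str    # the annotated text itself


def parse_standoff(ann):
    """Split standoff lines into text-bound spans (T) and relations (R).

    Simplification: discontinuous spans (offsets joined by ';') and other
    annotation types (events, attributes, notes) are not handled.
    """
    spans = {}
    relations = []
    for line in ann.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            label, start, end = fields[1].split(" ")
            spans[ann_id] = Span(label, int(start), int(end), fields[2])
        elif ann_id.startswith("R"):
            predicate, arg1, arg2 = fields[1].split()
            # "Arg1:T1" -> "T1"
            relations.append((predicate, arg1.split(":")[1], arg2.split(":")[1]))
    return spans, relations


def to_triples(spans, relations):
    """Turn each relation into a (subject, predicate, object) tuple of strings."""
    return [(spans[s].text, p, spans[o].text) for p, s, o in relations]


spans, relations = parse_standoff(ANN)
for triple in to_triples(spans, relations):
    print(triple)
```

Turning these tuples into real RDF would still mean minting URIs for the subjects, objects, and predicates, which is where the project's own ontology comes in.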

Here’s what the annotation looks like, and how it’s progressing:

This is the first pass, from a few weeks ago. Notice that while I’ve identified the specific object being sold, and the price, the rest of the quotation is just hanging out, unmarked. That means that we haven’t defined the relationship of the rest of that information (the extraordinary Efficacy, etc.) to the ontology that we’re building. But that was just the first pass.

[Screenshot: VP Brat shot 1]


Here’s a more recent update, with new markup by Jon. In this instance, we’ve got the whole quotation included (though, to be fair, this is a simpler quotation than VP5 above). Part of my homework is to go through and see about applying this to the other samples. Note: we may end up changing these subjects, objects, and predicates even further.

[Screenshot: VP Brat shot 2]


Brat seems like a great tool for this project so far. It’s easy to work with, and will export the annotation in RDF format. It’s also occurred to me that it could work well for crowdsourcing annotations, should I decide to make use of that in the future. (I might, but it won’t be in the immediate future. Maybe within a year, I think? I may crowdsource something else, however…).

Besides getting started with brat, I now have my own GitHub repository for this project, which is very exciting, in part because I set it up myself using the command line, rather than a GUI. (I know my way around a command line, but I’m not too sophisticated to feel pleased when something works exactly as I wanted it to.) My facility with both the command line and with git is due in part to the tutorials at Team Treehouse, which I’ve found to be a) useful, b) wide-ranging in what they cover, and c) packaged in small sections, so that I can work on them when I have a spare few minutes.

Jon’s done more than get me set up with brat, but I’m going to save that for another post, so that I can get into the habit of writing about this more regularly (and get back in the blogging habit generally.) So: more soon!


