Paige Morgan

How to get a digital humanities project off the ground


At DHSI 2014, participants requested an unconference session on how to turn a digital humanities project from an idea into a reality, and I offered to lead it. Here, roughly, are the steps that I recommended. A few are relevant chiefly to graduate students; most are applicable to academics at nearly any level. Some apply more to large projects than to small ones, but many apply to both. My experience comes from my work with two ongoing projects: Visible Prices (VP), a database development project that I imagined in autumn 2009, and Demystifying Digital Humanities (DMDH), a DH curriculum development project that I began with my colleague Sarah Kremen-Hicks in 2012.

When I began VP, I was told that it wasn’t possible; this year, I received a small project grant from the European Association for Digital Humanities to keep developing it. DMDH has just been renewed by the UW Simpson Center for the Humanities for a third year, and Sarah and I have been joined by a third collaborator, Brian Gutierrez. (I could hardly ask for better team members to work with.) Everything below I have learned through working on both of these projects.

1. Don’t be too quick to make your project part of your dissertation. There are a number of reasons for this, and factors to consider when you’re deciding. I’ve written about these previously over at Demystifying Digital Humanities. Perhaps most important are the following:

  • It’s risky to set up completing a project (or even a prototype) as a requirement for graduation if you are new to working with computing technology and/or if you don’t have an absolutely clear path to acquiring the skills you need.
  • A digital project can be very useful because it’s something different from your dissertation, and an activity in which you can mess around — but making it an official part of your degree often turns it into something serious; one more thing to feel bad about because it’s on your to-do list.
  • Having a second project to talk about on your research agenda (and in interviews) as something that you’ll be working on in the future is a positive thing: a feature, not a bug.
  • Is there a compelling reason that your project must be a major part of your dissertation? A reason that it needs to be? If so, then you have things to think about; if not, don’t make your life more complicated than it already is.

2. Investigate the legal/copyright/IP issues early on. This applies to the status of materials that you want to work with (if you’re working with someone else’s texts), to your own output (if you’re developing curriculum materials, an app, etc.), and to the software or hardware that you plan to work with (learn what proprietary software is). It’s especially important if you’re developing your project with grant/fellowship funds provided by the university. I am not a lawyer, so I want to avoid offering anything that might look like legal advice, other than that you need to be proactive, rather than ignoring these questions. They won’t go away, and they’re less likely to become problems if you deal with them early. Be aware, too, that these issues are not static, and may change during the life of your project.

3. Scour the web to see if other similar projects exist. If someone has already done the thing that you want to do, you need to know about it — just as you make sure that the arguments you make in your essay haven’t already been made. This activity isn’t just due diligence. It’s also a chance for you to develop a rationale for why your project is needed in the form that you imagine it.

Here are three things that might happen:

  • You could find that someone else is doing (or has already done) the thing you imagine doing. In that case, just as when you find an essay that’s making an argument you hoped to make, you try to build on it, and think about what the next step might be.
  • You could find that someone else is doing something similar, but with a fundamentally different approach. (In my case, I found that economic historians had produced huge spreadsheets detailing the different prices of a single commodity, such as wheat, tallow, or oil. They were interested in purchasing power, but weren’t aggregating prices from wildly different sources.)
  • You could find that someone is using the same method that you want to use, but on a different set of texts. In that case, you have the opportunity to use the same method (asking and acknowledging, of course!) on your project. I would anticipate that you’ll still have to adjust the method, just because your text — your data — is probably different from the other project’s data. If you do end up piggy-backing on another project’s methodological approach, that isn’t a bad thing. You still have to do the labor involved in execution, and you will still have to do the work of interpreting and thinking about your results.

4. Talk to people about your project. Write about it. Apply to give lightning talks and/or conference papers about what you want to do. In some ways, though this is point #4, it’s the #1 answer to the question of “how do I get this project off the ground?” The more you find ways to communicate and interact with people regarding your project, the more comfortable you’ll be doing so, and the more that it will become a real thing. Don’t just talk about the things that are working well: talk about the challenges and the grunt work, too. Talking and/or writing about a project is a way of thinking about it, too; and thinking about it is as important as building or doing whatever it is.

Here are just a few of the things that you might talk/write about:

  • What it is you hope to do, in plain terms (both the technical and the critical)
  • What you expect your project to show
  • Why this project needs to be done
  • What made you interested in doing this project in the first place
  • Which people you hope will use it, and in what contexts they will do so
  • Technical challenges (reasons that this project hasn’t happened yet, or couldn’t have happened previously)

5. Figure out what the smallest version of your project is, and start by doing that. If you want to make an archive of images, digitize a single image, and add any keywords, commentary, metadata, etc. Notice the decisions that you have to make, some of which are almost certainly critical choices. (Because I had decided that my project was going to include prices from fictional and nonfictional texts, I had to decide which of those categories contained poetry.)

If you’re able to put together a single component without running into any trouble, scale up from there: what’s the next smallest exhibit you can build that you can imagine presenting to an audience? Three items? Five? Do it; and write a basic introduction and rationale for the way you’ve done it, just as though this were something that you were publishing. If there is an appropriate occasion or venue for you to display this project and write/talk about it, do so. If you are comfortable doing so, explain that it is a test version of a larger project that you are pursuing.

5.5. When you’re building these small prototype versions, be easy on yourself. Go for the low-hanging fruit. My Scalar prototype of Visible Prices (built for the DH From the Ground Up panel at MLA 2014) features Charles Dickens’ David Copperfield because Dickens novels contain lots of prices and are familiar to many people. I could have encoded prices from 10 separate (and probably less familiar) novels to make a similar point, but that would have taken time that I didn’t have, and required me to be more eloquent/more adept at explaining why I’d chosen them. My audience was familiar enough with Dickens that they didn’t need any further explanation, which left me free to talk about more interesting aspects of the project.

6. Know that the platform or tool with which you build your project may change. Don’t commit to one right away. Experiment. This is probably most true for database projects, or projects that involve any programming component that you are building yourself; however, it’s wise to keep this in mind for any project.

A platform is the environment that you build your project in: Omeka is a platform; so is MySQL. Every platform has limitations. Some of those limitations will be obvious at the start; others you should expect to discover through experimentation. Even if you think you find the perfect platform right away, keep an eye out for other candidates; and if you find any, play with them and see what you learn. (I built prototypes of VP in HTML, Fabular, TEI, MySQL, Simile Timeline, and Scalar before I figured out that linked open data was likely to be the best fit.)

Be wary of the newest, shiniest platforms. Be especially cautious, perhaps, if they are accompanied by amazing hype, and/or come from new companies without much history. Some digital formats are fairly transferable: they can be read in different programs, copied and pasted, etc.; others are not, and if the platform disappears, users may be out of luck.

7. For any platform you’re experimenting with, figure out how to back it up before you invest too much energy and data into it. The simplest way to do this is to look to see whether the platform will allow you to download your work into a file. Note: downloading a screenshot is not the same as downloading your work.

If the platform allows you to download a file, then you can google to see what other programs can read that file (try “programs that read <file format>”), and learn how durable it is.

If the platform doesn’t allow you to download any file, or any file that can be read with an alternative program, then use the platform with extreme caution, for experimental purposes only.
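One way to make this check a little more concrete is a small script that looks at a downloaded export and reports whether it parses as a common, openly readable format. This is only a sketch under stated assumptions: the function name `describe_export` is hypothetical, it recognizes just a few file extensions, and it is no substitute for actually opening the file in a second program.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def describe_export(filename, data):
    """Give a rough verdict on whether exported bytes look reusable.

    Hypothetical helper: it only checks that the bytes parse as JSON,
    XML, or CSV, based on the file extension.
    """
    if not data.strip():
        return "empty: nothing was exported"

    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Screenshots and proprietary binary formats usually end up here.
        return "binary or unknown encoding: check which programs can open it"

    suffix = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""

    if suffix == "json":
        try:
            json.loads(text)
            return "parses as JSON"
        except json.JSONDecodeError:
            return "named .json but does not parse"
    if suffix in ("xml", "rdf", "tei"):
        try:
            ET.fromstring(data)
            return "parses as XML"
        except ET.ParseError:
            return "named as XML but does not parse"
    if suffix == "csv":
        rows = list(csv.reader(io.StringIO(text)))
        return f"parses as CSV with {len(rows)} rows"
    return "plain text: readable, but structure is unknown"


# Made-up sample export, standing in for a file downloaded from a platform.
print(describe_export("export.csv", b"title,keyword\nsample item,test\n"))
```

A result such as "binary or unknown encoding" is not necessarily bad news, but it is a cue to find out which other programs can open the file before investing more work in the platform.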

8. Learn about your data, and don’t shy away from the messy parts. Some data/images/texts may fit very neatly into the categories you’ve established for them. Almost certainly, some won’t: the world doesn’t fit into neat categories today, and it didn’t fit any better in historical periods. It can be tempting to exclude data that doesn’t quite fit in order to make a “neat” demo; however, you do so at the risk of oversimplifying your project; not to mention the risk of ignoring what might turn out to be a fascinating complexity in whatever you’re working on.

Understanding how messy your data is can substantially impact your choice of platform. For example: when I started Visible Prices, I thought that I would have approximately two types of entries: one would be the things being sold (and those entries would look basically similar); and one would be the bibliographic metadata. I expected that both types of entries would have approximately the same shape. As it turns out, they don’t: my data were far more heterogeneous than homogeneous, and that was a deciding factor in choosing to focus on RDF and linked open data, rather than MySQL.

Note, however: getting acquainted with the messiness of your data doesn’t necessarily involve incorporating all of the messiness in your experimental prototypes. Trying to do so may be more exhausting than productive: you may need to try incorporating one (or a small few) types of irregular data into your project at a time; or consider whether the project must deal with all of them in order to be valid. Is incorporating them the best strategy? Would you be better served by writing an essay that discusses their significance?
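To illustrate why irregular entries can strain a single fixed table, here is a minimal sketch in Python. It uses plain tuples as subject/property/value statements, which is the general shape of RDF data, rather than any real RDF library. The entries, property names, and values are invented for illustration and are not taken from Visible Prices.

```python
from collections import defaultdict

# Each statement is (subject, property, value). Everything below is a
# hypothetical example, not real project data.
statements = [
    ("entry1", "item", "loaf of bread"),
    ("entry1", "price", "4d"),
    ("entry1", "source_type", "novel"),
    ("entry2", "item", "annual rent"),
    ("entry2", "price", "20 pounds"),
    ("entry2", "payment_period", "yearly"),
    ("entry2", "paid_by", "tenant"),
]


def group_by_subject(triples):
    """Collect each subject's properties into its own dictionary."""
    entries = defaultdict(dict)
    for subject, prop, value in triples:
        entries[subject][prop] = value
    return dict(entries)


def table_columns(entries):
    """List the columns one shared table would need to hold every entry."""
    return sorted({prop for props in entries.values() for prop in props})


def empty_cells(entries):
    """Count the cells left blank if every entry shared one table."""
    columns = table_columns(entries)
    return sum(
        1 for props in entries.values() for col in columns if col not in props
    )


entries = group_by_subject(statements)
print(table_columns(entries))
print(empty_cells(entries))
```

With only two entries the blank cells are a minor nuisance. The count grows with every new kind of entry, which is one way to get a feel for how heterogeneous your own data is before committing to a platform.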

9. Be able to consider your project from many angles. Whatever your project, however complex the programming, learn how to be comfortable discussing the critical question you’re investigating without discussing the technological side. This is important both as an exercise for thinking about your research question, and as a means of communicating with people without overwhelming them — whether or not they’re well-versed in digital humanities.

The reverse is true, too: learn how to be comfortable discussing the technological framework of your project without discussing the critical question. (Ex: I have two data sets containing approximately 10,000 words each, and I need to be able to see how frequently verbs are used throughout each, and then compare that frequency.) Being able to talk through your project’s technological actions can help you communicate with people who have more experience in tech, or help you recognize when you encounter a platform that might fulfill your needs. However, it’s also generally important, because developing any sort of project requires not just building it, but thinking about it — a lot.
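The verb-frequency example above can be sketched in a few lines of Python. This is a deliberately simplified illustration: it assumes a small hand-made verb list (`VERBS`) in place of a part-of-speech tagger, which a real project would almost certainly need, and the two sample sentences are invented.

```python
import re
from collections import Counter

# Hypothetical stand-in for a real part-of-speech tagger.
VERBS = {"buy", "buys", "bought", "sell", "sells", "sold", "pay", "pays", "paid"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def verb_counts(text):
    """Count how often each listed verb appears in the text."""
    return Counter(word for word in tokenize(text) if word in VERBS)


def verb_rate(text):
    """Return verbs per 1,000 words, so texts of different lengths compare."""
    words = tokenize(text)
    if not words:
        return 0.0
    return 1000 * sum(verb_counts(text).values()) / len(words)


# Invented sample texts standing in for the two data sets.
text_a = "She bought bread and paid the baker."
text_b = "He sold the horse."

print(verb_counts(text_a))
print(round(verb_rate(text_a), 1), round(verb_rate(text_b), 1))
```

Normalizing to a rate per 1,000 words is one design choice among several. It matters here because the two data sets are only approximately the same size, so raw counts would not be directly comparable.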

10. When you’re talking/writing about your project, make it intertextual: find ways of responding to prior criticism, and other projects. Pre-digital humanities criticism may well have relevant arguments; and your scholarly introduction and/or rationale are good places to engage with them. However, you can also engage in smaller ways, too. If you see potential for a discussion, write it up. Perhaps you’ll publish it on your blog; or perhaps it will become the basis of a journal article or book chapter. Depending on your audience, this is one area where #9, above, becomes important. I said this in the unconference session yesterday, and I’ll say it again here: if you tell me that no other criticism is in any way engaged with your research question, I will tell you that you haven’t looked hard enough. Responding to criticism isn’t just fulfilling the disciplinary conventions of intertextuality; it’s a vital part of the framing for your project.

11. Consider questions and problems that you encounter as potential essay topics for peer-reviewed journals like Digital Humanities Quarterly, Hybrid Pedagogy, the Journal of Digital Humanities, etc. Look around to see which journals/sites are interested in project development narratives; and read specifically to see what kind of essays people are writing about unfinished projects. Could you adapt their angle/form for your project?

12. Find ways of allowing faculty/staff/committee members to help you. This is especially important if you are working with people who are less experienced with digital humanities than you are; and even if your digital project is not an official part of your degree adjudicated by your institution, your committee members can play a variety of supportive roles. Simply having them feel able to mention your project to other academics they know can be important. But in order for them to do this, it’s helpful if they’re invested in your project, and feel that they can contribute to it. My experience is that there’s enough friction and obfuscation between “traditional” and “digital” humanities that it may not be obvious to your advisors how their experience and knowledge can be useful to you — but because digital humanities projects are rooted in both the digital, and the humanities, your advisors most likely do have insights that are worth hearing. (See also points 5, 9, and 10 above.)

13. Take notes, and keep records of all your decisions and experiments, and the reasons you choose one option or another. They are likely to be important as you go, even if they seem trivial — they’re fodder for future experiments/discussions/articles about your project. It’s possible that you should try to make them publicly available to some degree, but that’s a separate choice, and how much time you have to clean them up may be an issue (see below). But you need documentation — in part because it allows you to see how far you’ve come.

I am sometimes asked whether it is a good idea to publicize one’s project (and the technological work behind it), or to keep it hidden, so that no one else can steal it. I don’t have a single answer to this. When I came up with the idea for Visible Prices in fall of 2009, one of my technical advisors cautioned me to not talk too much about it until I knew how I would build the thing (i.e., until the project was close to being a reality). I understand why he made that recommendation: it had to do with the newness of student projects, and the state of the social media ecology (blogging was still rather controversial, academic Twitter was at best inchoate). Somewhere between then, and now, it became clear to me that the best way to establish that the project was mine was precisely to talk about it; and that if I didn’t find a way to talk about it, then it would never get built.

14. Reflect, reflect, reflect. Or in other words, find more paradigms for thinking about your project and its progress than “is it done yet?”, because that gives you only two options: “Yes! Hooray!” or “No; I’m a failure.” As scholarship, as a resource, as an object built with technology, as an experience through which you learned new skills and techniques, your project is a complex event (“event” really is the best word for it, I think, because the event is a separate phenomenon that overlaps with the project). Consider the emotional boost or drain that you get from pursuing the project, and how that boost/drain affects your work on your dissertation. Be aware of the impact of taking on a project on your identity and authority within your department and university.

When you are attending any sort of training (whether a DHSI-type event, or something else), you will want to reflect on the outcomes of the training for your project. This week at DHSI in DH for Chairs and Deans, we’ve spoken briefly about learning outcomes, and the fact that it’s important to make learning outcomes clear and transparent. Even so, because DH is a varied field characterized by especially dynamic practices, the official outcomes may not be the only outcomes, or even the most relevant outcomes for you. Thus, it is equally important for learners with projects to articulate independently what they are discovering and why it matters. I make this recommendation especially because in the last four years, there have been several occasions where I’ve spoken with people who have ended up in the trough of disillusionment (as described by Melissa Terras in her recent lecture, “A Decade in Digital Humanities”). Now, I don’t actually think that the trough of disillusionment can be avoided, but reflecting and developing your own outcomes is key to getting out of it, and to not spending more time there than necessary.

15. Know what has to be prioritized, and allow things to slide if necessary. If you go to my Visible Prices website, you’ll find it very sparse in some ways. Even the updates page is sparse and underwritten. To some degree, the same is true of DMDH. That is because in the past few years, my dissertation has absolutely positively had to come first. I have needed to do as much as I could on VP and DMDH without depleting the energy that I could use on the diss. In one sense, they are projects; in another sense, they are energy generators, and I have had to choose how to use that energy, and how not to use it. There are people for whom the trajectory of graduate school and DH leads to a shift into a different career, rather than to completing the dissertation (and that’s really a separate discussion), but I wanted to finish mine, and something had to be sacrificed/put on the back burner. Something always has to be sacrificed.

Now, I could have published a stream of messy notes, some of which would have been intelligible, many of which would have been opaque. But as a graduate student, that’s actually a tricky decision. There is a lot of pressure on graduate students to be tremendously articulate in order to not be dismissed as people who ought to be working on their dissertations, as opposed to faffing about on the internet. There’s an additional complexity: graduate students who may want to apply for alt-ac jobs with social media components will be judged based on their communication skills and articulacy; and it’s hard to predict whether readers will understand or accept the framing of a sandbox/rough-cut set of posts. Now that I’m done with the diss, I have more resources to devote to both VP and DMDH; and doing so will be important before job market season in the fall.

However you decide to navigate workload issues, almost certainly, at some point, something will slide. When it does, don’t punish yourself for it. In the same vein, I recommend celebrating Every. Tiny. Victory.

This post covers just about everything I recommended in Wednesday’s unconference session (and expands a couple of my recommendations slightly). It’s not complete.

In closing, if you found this post useful, then you may also find the curriculum that Brian & Sarah & I created this last winter for DMDH helpful. Those materials focus on seeing your subject area as data, and exploring what programming languages/platforms might be a good fit for your project.

DMDH Winter 2014 Workshop #1: Exploring Programming in the Digital Humanities

DMDH Winter 2014 Workshop #2: Programming on the Whiteboard

And finally, what I’ve written up here covers aspects of getting a project “off the ground” — but it does not cover project management, which is also vital. However, that will have to wait for another unconference session.