The Code Problem

The issue on my mind is whether digital humanists ought to code. This topic comes up all the time and appears to be an incredibly vexing issue among rising digital humanists who want to join the club. It is especially vexing for women, who can see right away that most coders are male and behave as if they were members of some kind of boys' club, in spite of their chummy invitations to join. Most recently Miriam Posner blogged about her problems with the Thou Shalt Code commandment that seems to pervade DH today. I won’t reproduce her arguments, with which I am sympathetic, except to say that I think the issue goes beyond gender and race–white males feel this anxiety too; they just don’t have the added burden of a culture that tells them this particular shortcoming is the fault of their essential nature. So they can blow it off and not feel too bad about it. Which is of course consistent with Miriam’s point (and with that of the XKCD cartoon she embeds). But I want to get at another issue beyond the background radiation of implicit bias. I want to get at the rationale for coding in DH in the first place, which is I think the real cause of most of this anxiety. Because implied by the exhortation to code are a bunch of unspecified assumptions about code, its nature, why it is important, and why you should learn it. Some of these assumptions are just wrong-headed and, as we are seeing, divisive.

Let me start by giving my reasons for why a person should learn to code if they want to fully participate in this DH thing that is happening right now and perhaps get hired in the field. The first is the reason Tim Berners-Lee gives when he exhorts journalists to learn–you need to know how to use tools to manipulate data because knowledge is increasingly produced as data, that is, in more or less structured forms all over the web. This is because the future of the humanities “lies with ~~journalists~~ humanists who know their CSV from their RDF, can throw together some quick MySQL queries for a PHP or Python output … and discover the story lurking in datasets released by ~~governments, local authorities, agencies,~~ digital archives, online libraries, academic centers, or any combination of them – even across national borders.” That is, you should know how to write scripts in a language that lets you pull and munge and mess around with data, including texts. In this view, you don’t have to write full-blown applications, and you don’t have to know a particular language–a huge source of anxiety, of which I say more below.

The second reason to learn to code is philosophical. You should be able to write code–not necessarily program or, God forbid, “develop”–so that you can understand how machines think. Play with simple algorithms, parse texts and create word lists, generate silly patterns a la 10 PRINT. Get a feel for what these so-called computer languages do. Get a feel for the proposition, to which I mostly assent, that text is a kind of code and code a kind of text (but with really important differences that you won’t discover or understand until you play around with code). This level of knowledge does not require any great mastery of a language in my view. It only requires a willingness to get one’s hands dirty, make mistakes, and accept the limitations of beginner’s knowledge. I personally believe that this second reason is as important as the first, or more so. It will connect you with the heart of DH–the encounter with the machine–that, in my reading, Willard McCarty recently described in his Busa Award acceptance speech. (I just wrote something about that here.)
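To make "parse texts and create word lists" concrete, here is a minimal sketch in Python; any simple scripting language would serve equally well. The function name word_list and the sample sentence are made up for illustration, and the tokenizing rule (runs of letters and apostrophes) is only one simple assumption about what counts as a word.

```python
import re
from collections import Counter


def word_list(text):
    """Return (word, count) pairs, most frequent first."""
    # Lowercase the text, then pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


# A made-up sample sentence; swap in any text you like.
sample = "Text is a kind of code and code is a kind of text."

for word, count in word_list(sample):
    print(word, count)
```

Changing the pattern inside re.findall and watching the list change is exactly the kind of messing around meant here: it shows quickly how much a "word" depends on the rule you hand the machine.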

I don’t mean these are the only reasons to learn code. I mean these are two very basic reasons that a digital humanist wanting to code should consider. There are of course lots of other reasons for people to go deeper down the rabbit hole. In my own case, I am turned on by data models and the relationship they have to the more general human practice of cultural modeling that defines our species. But that’s me.

Now, here’s the thing, the problem that I have with the culture of code in DH today: To get to this place with code, to be able to write simple scripts that are useful or interesting or both, you don’t need to do many of the things your coding brethren think you should do. First and foremost, you don’t need to learn a specific language unless there is a compelling local reason to do so, such as being in a class or on a project that uses the language. This is a really important point. Many will tell you that you HAVE to learn Python or Ruby or R. But this kind of talk is off-putting and counter-productive for those wanting to get into coding. Many of these languages are actually pretty hard. Ruby, for example, forces beginners to understand things like symbols versus strings as they are learning what variables and data types are. And R expects you to understand statistics. The language you choose should really depend on what you want to do, and if you want to do what I am describing above, then many languages will do. All other things being equal, I suggest choosing a simple language, one without strong opinions (like “Everything is an Object!”) and without the need to learn ten things before you can print “Hello, World!” Also choose one with good online documentation and a large user community. PHP, for example.

Second, you don’t need to be involved in writing a full-blown application to do DH-worthy coding. Applications are fine, and being on a collaborative project has huge benefits of its own, but know that application development is a huge time-suck and that applications are like restaurants–fun to set up but most likely to fail in the real world. Lots of DH coding projects in my experience are journeys, not destinations. People get involved with them to learn how to code and to collaborate and to have something cool to show. But real, useful applications–like Zotero or Neatline–are developed by full-time coders who know their business. And these developers are not writing dissertations in history or literature. So don’t use these kinds of projects as a measure for a general DH coding competence level. One other thing to mention about applications, as long as I’m ranting–applications can be DH works in their own right, but the best ones are just useful to those conducting research in the humanities per se, that is on culture and history and literature. In the sciences, people write applications and give them away so that others can use them too. They don’t make careers out of them. Digital humanists should do the same.

Third, there is no reason ever to be forced into using a specific editor or coding environment, especially if it is a difficult one that “real” coders use. To be more specific: don’t let anyone tell you that you have to use vim or emacs. These are great editors with good pedigrees, but forcing them on new coders is akin to hazing. To the new coder, the editor is just another thing to learn. New coders should use something simple that feels comfortable and does not require a manual. Having said this, default editors like Notepad and TextEdit are really bad and should not be used. I give a suggestion below.

Beyond these specific problems, though, there is a more fundamental issue about the culture of code that contributes to the condition that Miriam and others confront: in spite of the well-meaning desire by many coders to bring everyone into the coding fold, there is a countervailing force that prevents this from happening and which emanates from these same coders. This is the force of mystification. Mystification appears in many forms, including some of the things I just described–insisting on a difficult editor, dissing certain languages–but it more generally comes from treating code competence as a source of identity, whether it be personal or disciplinary. As long as digital humanists regard coding as a marker of prestige–and software as a token in the academic economy–and not as a means to other forms of prestige (such as making discoveries or writing books), then knowledge of coding will always be hedged in by taboos and rites of passage that will have the effect of pushing away newcomers.

I see the effects of the software-as-prestige-good model all the time, and it doesn’t just happen among DH coders. It happens when programmers use the epithet “script kiddie” to dismiss those who don’t write applications, or who are not formally trained in computer science or software engineering. It happens when HTML is not considered code because it lacks conditional logic, when in fact writing HTML is an excellent entree into coding for lots of reasons. And it happens any time “Introduction to Computing” is taught with Java as the language. For to begin with a language that forces data typing, declaration of variables, class definitions and instantiations, and all sorts of other encumbrances on a new user is not to teach, it is to filter.

I speak from some experience here. I teach code to about forty students, mostly women, every year, in a course sequence on the Digital Liberal Arts. My approach is very simple–do as little as possible to get between the student and the act of coding. I teach them PHP, because it is dirt simple and asks very little of them at first, and I have them use a basic cross-platform text editor (JEdit) that does not require them to learn keyboard patterns to get started. I tell them that coders are more like artists than engineers, failing early and often, messing around, throwing things away, and often surprised by their own results. I tell them that coding is like writing except when it’s not. And I teach them to do something useful with their knowledge of code, like grab a CSV file and convert it into a network structure that can be parsed by a network tool like ManyEyes, Gephi, or SHIVA, and dumped into an essay about something. I get good reviews and many of my students go on to post-graduate programs in related fields. What I am most surprised by, though, are not the Media Studies students who, without prior experience, end the course with the confidence to code, but the computer science students who tell me they feel liberated by the course, who feel like they’ve discovered the beauty of code for the first time. In demystifying code, I liberate students to engage with it more authentically. (Ironically, I believe that the demystification of code defamiliarizes it for CS students, who then see things differently.)
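The CSV-to-network exercise is small enough to sketch in full. The version below is in Python rather than the PHP used in the course, and it is only an illustration, not the course's actual assignment: the sample data, the column names sender and recipient, and the two function names are all made up. It also assumes that the network tool accepts an edge list with Source, Target, and Weight columns, which is a common convention for edge tables but should be checked against the tool you use.

```python
import csv
import io
from collections import Counter


def csv_to_edges(rows, source_col, target_col):
    """Count how often each (source, target) pair occurs in the rows."""
    edges = Counter()
    for row in rows:
        # Treat a missing or empty cell as "no endpoint" and skip the row.
        source = (row.get(source_col) or "").strip()
        target = (row.get(target_col) or "").strip()
        if source and target:
            edges[(source, target)] += 1
    return edges


def write_edge_list(edges, out):
    """Write the edges as CSV with Source, Target, Weight columns."""
    writer = csv.writer(out)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])


# Made-up sample data: who wrote a letter to whom, and when.
letters_csv = """sender,recipient,year
Ada,Charles,1843
Charles,Ada,1843
Ada,Charles,1844
Mary,Ada,1844
"""

reader = csv.DictReader(io.StringIO(letters_csv))
edges = csv_to_edges(reader, "sender", "recipient")

out = io.StringIO()
write_edge_list(edges, out)
print(out.getvalue())
```

To work from a real file, replace io.StringIO(letters_csv) with an opened file and write the result to disk instead of printing it. The point is how little code stands between a spreadsheet and a graph.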

So what I’m saying is this: DH coders should be doing everything possible to demystify coding for newcomers. I think doing this would dispel a ton of bad will and shift attention away from bias and toward the difficulties of talking to machines in the first place. And I don’t mean dumbing it down either. I mean beginning with the intelligible and proceeding to the less intelligible and indeed more mysterious. The thing to remember is that the code is the thing–not the language, not the development environment or the operating system or the application. If you are a new coder, anything that gets between you and the ability to make utterances in code is a problem.

July 25, 2013

Rafael Alvarado

Digital Humanities

2 responses to “The Code Problem”

5 Things Thursday: Adaptive Metadata, Coding, Archives | MOD LIBRARIAN says:

August 1, 2013 at 9:10 am

[…] and digital humanists learn coding? Check out this post on Cataloging Futures and its excellent linked post to consider. My take – a resounding yes, though not necessarily to do coding themselves, but […]

Lisa Spangenberg says:

August 12, 2013 at 2:03 am

While I would absolutely encourage everyone in every possible field to learn to code, I don’t think coding in the sense of a compiled language, or even scripting in the sense, say, of AppleScript or JavaScript or shell scripts, is necessary.

I think a sufficient facility with any language, that is, the ability to parse it at the levels of syntax and grammar as well as context, is a fair substitute. I would also accept a sufficient level of utility and understanding of HTML/XML and CSS, or even a sophisticated understanding of Boolean logic, as quite adequate.

I am a humanist. I also have been working in the software industry since 1989. You don’t have to write code to understand it, any more than you have to be able to write Medieval Latin or Modern German in order to understand it in written or even spoken forms.

What we want in humanists are patterns of thought and kinds of analytic skills used in coding to thrive right along with more textual and traditional Humanistic skills. The skills involved in parsing a human language and flipping back and forth from the micro level of morphology and grammar to the macro level of context and textual surface are strikingly similar to the code/UI or structure/content or even the Lanham variant of At/Through.

As someone hiring and working with software engineers and programmers, I have found that, in very broad terms, those engineers and programmers who were particularly skilled and joyful in their approach to code (and humanists) tended to have additional extraordinary levels of expertise in either human languages or music.

The Transducer
