Tag Archive | Digital

Twenty years of computer-assisted research

Our household has been in a bit of a spring cleaning vibe (new bookcases will do that), which inspired me to get rid of a bunch of old electronics dating from the Pleistocene. In addition to recycling some pocket electronics (an old digital recorder and an old Dell Digital Jukebox MP3 player – and where oh where did my old c. 2004 Dell Axim go?), we also are unloading one very old (486?) PC and a bevy of laptops, which made me briefly reminisce on all the laptops I’ve loved, and hated, before (sung with a Willie Nelson twang)…

What should historical research look like in an age of digital collaboration?

Historical research, as most of us know, has traditionally been a solitary practice. Even in this postmodern age of killa’ collabs and remixes with co-authors named feat., historians, by and large, are still a lonely bunch of recluses. Admittedly, one’s choice of subject has a lot to do with how crowded your subfield is. Unfortunately (or not?), I’ve rarely been in a position where I knew somebody else who was actively researching the same war as me (War of the Spanish Succession) and might want to look at the same sources. John Stapleton is the closest example from my grad school days, and he focuses on the war before “mine,” so we’ve given each other feedback and pointed each other to various sources from “our” respective wars over the years. In general, though, it’s been kinda lonely out here on the plains.

But the times they are a-changin’ and the prairie turf is being transformed into suburban subdivisions. The question is whether all these houses will follow a similar aesthetic, whether their architecture will reference each other, or whether the only communication between neighbors will consist of vague nods at the grocery store and heated arguments over how far their property line extends. (Thus far, subdivisions are still segregated into ethnic neighborhoods.)

If we look beyond the discipline of History, we’re told that it’s an age of collaboration (CEOs say they want their new employees to work effectively in teams) as well as the age of information overload (I believe that – my main Devonthink database has grown to 104,000 documents and 95 million words of text). Even the other kind of doctors are having a rethink. Now this whole Internet thing allows like-minded individuals to communicate and commiserate across the planet, and not just with their neighbor next door. “Global village” and all that. As a result, even historians have figured out that we can now find out if we’re alone in the universe or not – I assume everybody has Google Alerts set for their name and publication titles? This academic version of Google Street View has certainly expanded my worldview. My one semi-regret is that, thanks to online dissertations, conference proceedings and even blogs, I now find out I was in the archives 10-15 years too early, and there are currently a bunch of people both American and Euro looking into the period – and by “bunch” I mean maybe 6-12. Even more reasons for making connections. Hmmm, someone should create a blog that allows EMEMH scholars to communicate with each other…

So how should historical research work in this interconnected digital age, in this global, digital village? In an age when the moderately-well-heeled scholar can accumulate scans of thousands of rare books and hundreds of archival volumes? The combination of collaboration and digitization has opened up a spectrum of possibilities, and it’s up to us to decide which are worth exploring. Here are some possibilities I see, stretching along a spectrum from sharing general ideas to swapping concrete primary sources (Roy Rosenzweig undoubtedly predicted all this twenty years ago):

  • Topic Sharing. The way it’s traditionally been done, in grad school, or if people meet up in the archives or at a conference or on fellowship. You let people know the specific topics you’re working on, and let it progress from there: “Oh, you’re working on X. Do you know about …? Have you checked out Y? You should really look at Z.” This has two advantages: first, it allows participants to keep the details of their research close to the vest, and more fruitfully, it allows the historiography to develop into a conversation rather than separate ships passing each other in the night – it’s such a waste when something gets published that really should have looked at X, Y or Z, but nobody suggested it. Or, perhaps peers studying the same period/place offered comment, but other potential-peers studying the same theme didn’t (or vice versa). Sharing subjects also forces people to acknowledge that they might not be the only person writing on topic X, and encourages them to consider whether they might want to divvy up topics rather than writing in ignorance of what others will be publishing, or already have written. Say, hypothetically, when one thinks they want to write a chapter about how the French viewed battle in the War of the Spanish Succession, and then discovers that another scholar has already written about a thousand pages on the subject. So letting others know what you’re working on would be a start: type of history, subject (sieges? battles? operations? logistics?…), type of study (campaign narrative? commander biography? comparison of two different theaters?…), sides/countries (including languages of sources being used), and so on.
  • Feedback and advice. This requires longer and more sustained interaction, but is far more useful for all involved. I’m not convinced by the latest bestseller claiming that the crowd is always right, but crowdsourcing certainly gives a scholar a sense of how his/her ideas are being received, and what ideas a potential audience might like to read about in the first place.
  • Research assistance. Here, I would suggest, is where most historians are still living in the stone age, or more accurately, are on the cusp between the paper and digital ages. Most of our precious historical documents survive entombed within a single piece of paper(s), in an archive that may require significant costs and time to access. Depending on a government’s view of cultural patrimony and the opportunity for a marketable product, a subset of those documents have been transferred to the digital realm. But not many. This is where many historians need help, a topic which we’ve discussed many times before (as with this thread, which prompted the present post), and where collaboration and digitization offer potential solutions to the inaccessibility of so many primary sources.
    But there is a rather important catch: copyright. Archives and libraries (and publishers, of course) claim copyright over the documents under their care, and they frown upon the idea that information just wants to be free (ask Aaron Swartz):
    CAC copyright slip
    So this puts a bit of a kink in attempts to create a Napster-style primary source swap meet – though I am getting a little excited just imagining a primary-source orgy like Napster was back in the day.
    Fortunately there are steps short of scofflawery. Most of these revolve around the idea of improving the ‘finding aids’ historians use to target particular documents within the millions of possibilities. These range in scale from helping others plan a strategic bombing campaign, to serving as forward observer for a surgical strike:

    • A wish list of specific volumes/documents that somebody would like to look at. This could be as simple as having somebody who has the document(s) just check to see what it discusses, whether it’s worth consulting. This, of course, requires a bit more time and effort than simply sharing the PDF.
    • Or it might mean providing some metadata on the documents in a given volume. For example, I discovered in the archives that if the Blenheim Papers catalog says that Salisch’s letters to Marlborough in volume XYZ cover the period 1702-1711, and I’m studying the siege of Douai in 1710, it is a waste of one of my limited daily requests to discover that Salisch’s letters include one dated 1702, one from 1711, and the rest all from 1708. The ability to pinpoint specific documents would in itself be a boon: many archives have indexes and catalogs and inventories that give almost no idea of the individual documents. Not only would it save time, but it might also save money if you want to order copies of just a few documents rather than an entire volume.
    • Or, such assistance could be as involved as transcribing the meaty bits of a document. Useful for full text, though purists might harbor a lingering doubt about the fidelity of the transcription.
    • Or, it might mean running queries for others based off of your own database. I did that for a fellow scholar once, and if you’ve got something like Devonthink (or at least lots of full-text sources), it’s pretty easy and painless. Though if there are too many results, that starts to look a bit like doing someone else’s research for them.
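
The document-level metadata idea above can be sketched in a few lines of Python. This is only an illustration, not a real catalog format: the `Letter` record and its fields are hypothetical, "XYZ" is the placeholder volume label from the example, and the number of 1708 letters is invented (only the years 1702, 1708 and 1711 come from the example).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Letter:
    volume: str      # archive volume label (placeholder)
    author: str
    recipient: str
    year: int


# Hypothetical item-level listing for one volume.
# The catalog itself only reports the overall span, "1702-1711".
letters = [
    Letter("XYZ", "Salisch", "Marlborough", 1702),
    Letter("XYZ", "Salisch", "Marlborough", 1708),
    Letter("XYZ", "Salisch", "Marlborough", 1708),
    Letter("XYZ", "Salisch", "Marlborough", 1708),
    Letter("XYZ", "Salisch", "Marlborough", 1711),
]


def catalog_range(items):
    """Earliest and latest year: all that a volume-level catalog entry tells you."""
    years = [item.year for item in items]
    return min(years), max(years)


def letters_in_year(items, year):
    """Item-level lookup: the letters actually dated to a given year."""
    return [item for item in items if item.year == year]


print(catalog_range(letters))                # the span the catalog advertises
print(len(letters_in_year(letters, 1710)))   # letters relevant to Douai 1710
```

The point of the sketch is the gap between the two functions: the span looks promising for 1710, while the item-level list shows there is nothing there to request.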

Of course with all of these options, you have to worry about thunder being stolen, about trusting someone else to find what you are looking for, etc., etc. And there probably isn’t a good way to assuage that concern except through trust that develops over time. And trust is based on a sense of fairness: Andy’s questions about how to create a system of calculating non-monetary exchanges have bedeviled barter systems for a long time, I think.

As usual, I don’t have a clear answer. Simple sharing of documents is undoubtedly the easiest solution (cheapest, quickest, fewest number of eyes between the original source and your interpretation), but I don’t have a system for the mechanics. Nor am I clear on the ethical issues of massive sharing of sources – is “My thanks to X for this source” in a footnote enough? If some documents are acquired with grant funds, can they be freely given away? And the list goes on…

Thoughts?

Sieges as they were meant to be seen

New article in Social Science Computer Review using GIS to analyze the 1714 siege of Barcelona.

Rubio-Campillo, Xavier, Francesc Xavier Hernàndez Cardona, and Maria Yubero-Gómez. “The Spatiotemporal Model of an 18th-Century City Siege.” Social Science Computer Review, November 17, 2014, 0894439314558559. doi:10.1177/0894439314558559.
Abstract:
The importance of terrain in warfare has often encouraged an intense relation between military conflicts and the use of techniques designed to understand space. This is especially relevant since the modern era, where the engineers who built and assaulted city defenses recorded the events with diverse documentation, including reports, diagrams, and maps. A large number of these sources contain spatial and temporal information, but it is difficult to integrate them into a common research framework due to its heterogeneity. In this context, geographical information science provides the necessary tools to explore an interdisciplinary analysis of these military actions. This article proposes a new approach to the study of sieges using a spatiotemporal formal model capable of integrating cartography, archaeological, and textual primary sources and terrain information. Its main aim is to show how concrete research questions and hypotheses can be explored using a formal model of this type of historical events. The methodology is applied to a particular case study: the French–Spanish siege of Barcelona that occurred in 1714. The protagonists faithfully recorded the development of the action, providing essential information for the model. Besides, different authors depicted the event as the paradigm of a city siege. For this reason, the model is also used to explore why real actions deviated from theoretical guidelines, clearly defined in different manuals. We use this scenario to explore two issues: (a) why the attackers chose to assault a particular city sector and (b) the factors that explain the casualties of the besiegers. We conclude that we need methodological tools capable of integrating heterogeneous information to improve the understanding of siege warfare that affected not only military conflict but also the shape of European urban landscapes.
That article includes some interesting discussion and insightful maps of the attacks, siege casualties, etc. Now if only somebody did it for every siege! I’ve got dibs on Douai 1710, if I ever take the time to play around with GIS.
With other military historians finally catching up with the serious study of Louisquatorzian siegecraft, I may need to dust off a few ideas I had in dissertation version 0.5 (all done in AutoCAD):
Siege batteries, Douai 1710

Casualties by approach by day, Douai 1710

I also have the number of daily workers, so a casualty rate over the length of the siege could easily be calculated.
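
The casualty rate mentioned above is simple arithmetic: casualties divided by workers for each day, and total casualties over total worker-days for the siege as a whole. A minimal Python sketch follows, using invented placeholder figures (not the Douai data) and hypothetical function names:

```python
# Placeholder daily figures, NOT the actual Douai 1710 data.
# Each tuple: (day of siege, workers in the trenches, casualties among them)
daily = [
    (1, 2000, 40),
    (2, 2500, 75),
    (3, 1500, 30),
]


def daily_rates(records):
    """Return {day: casualties per 100 workers} for each day."""
    return {day: 100 * cas / workers for day, workers, cas in records}


def overall_rate(records):
    """Casualties per 100 worker-days across the whole period."""
    total_workers = sum(workers for _, workers, _ in records)
    total_cas = sum(cas for _, _, cas in records)
    return 100 * total_cas / total_workers


print(daily_rates(daily))
print(round(overall_rate(daily), 2))
```

Dividing by worker-days rather than averaging the daily percentages keeps a lightly manned day from counting as much as a heavily manned one.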

Douai 1710 trench work

And, finally, a colorful map that emphasizes the importance of musketry for the defense:

Garrison volume of musketfire (theoretical)

Now I remember why it took me so long to finish my dissertation – because I wrote 1.5 of them instead of just one.

Digital Dawdling

As academics on a semester system know, Thanksgiving break offers the false hope of a brief interlude before the final dash to the end of the semester. Thus I surfaced for air long enough to waste some time playing around with a few new-ish digital toys that might be of interest to others.

First, for those who use Pocket Informant’s calendar/task-management program, their recent update includes a macro-view (all the cool kids are doing Big Data these days) of your schedule, a heat map indicating how busy your days are over months. As you can tell from the screenshot, I follow the stereotypical academic’s schedule of attempting to keep my summers for my research.

PI heat map: I’ll let you figure out what my teaching schedule is…

More productively, I decided to waste some more time on mind mapping software. Devonthink is great for storing all my documents and notes, but I still find the need for meta-notes (or organizational cues, or trains of thought) that are extremely hierarchical, and which have to come in a very specific order even if I don’t know where exactly they should go in the overall argument – often these are a series of successive questions that I need to follow up on. You could put them in a group in DT, but that tends to lose the specific train of thought. So instead of pulling out my big sketchpad and writing out a mindmap of my battle book, as I did with my diss, I got a copy of Xmind software. This way I can have my mindmaps everywhere I am, and I can move things from one node to another without having to erase and rewrite. The resulting map for a smaller project (my honor in sieges book chapter) looks like this:

Mindmap of honor in sieges chapter

The map is fully searchable, you can add various ‘markers’ and icons, modify the formatting of each point, add images, create floating points (when you’re not yet sure where exactly they should go), and it automatically makes an outline that you can export (upper right in screenshot). I find it useful to see the big picture on a single page (scrolling and zooming in and out as necessary), and to quickly see the ‘shape’ of the argument and the relative amount of detail in each section, rather than flip between a dozen pages of outline and try to imagine how a subpoint would fit in a different spot.

Finally, my frequent reliance on timelines in my courses led me to take the plunge and explore timeline software. My über-efficient timecharts have their uses, but I don’t want to put that amount of effort into all sorts of chronologies in the dozen different courses I teach. Sorry, but the 20th century isn’t worth that much effort. And for my own research purposes, the more info in a given timeline, the greater the need to have the info quickly searchable.

Enter Aeon Timeline. Items are generally divided between Entities (people, institutions, technologies…) and time-defined Events. You can use different levels of precision for different Events, and you can place Events on various arcs, e.g. an operational timeline might include separate arcs for each theater of operations. Befitting the digital data, all entries and metadata are searchable, and the timelines are zoomable in both directions. You can add notes to each Entity and Event, and there are a few limited formatting options (with possibly more to come in future versions). So in the operational arcs I indicate the Allied sieges with a red font and the Bourbon sieges with a blue font; in the English politics arcs I use buff to indicate the Whigs and blue to indicate the Tories. You can import images, for example people’s portraits or even simplified maps of battles and sieges. You can also filter your results to show only a subset of the events and entities, based off of the metadata. You can also import massive quantities of data in csv or tab-delimited format, rather than use the individual event creation dialog box.

Aeon timeline: events on arcs
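
For the bulk import mentioned above, the events first have to sit in a delimited file. The Python sketch below writes one. The column names (`Title`, `Start`, `End`, `Arc`) are assumptions for illustration, not Aeon Timeline's documented import schema, so check the program's import dialog for the fields it actually expects. Dates are given to the year only, and the arcs stand in for theaters of operations.

```python
import csv
import io

# Illustrative events; column names are hypothetical, not Aeon's schema.
events = [
    {"Title": "Siege of Douai", "Start": "1710", "End": "1710", "Arc": "Flanders"},
    {"Title": "Siege of Barcelona", "Start": "1714", "End": "1714", "Arc": "Spain"},
]


def events_to_delimited(rows, delimiter=","):
    """Serialize event dicts to csv (default) or tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["Title", "Start", "End", "Arc"],
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(events_to_delimited(events))                   # comma-separated
print(events_to_delimited(events, delimiter="\t"))   # tab-delimited
```

Saving the returned text to a file gives something a spreadsheet can also open, which makes it easy to build the list there first and import it in one go.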

Further, you can define a Relationship between each Entity and each Event – e.g. an Entity might have one Event that was its birth, another its death, while another Event of that Entity (say, a person) might be that individual’s participation in a particular siege. This view is a bit messy in the Event (top) half of the window – you should primarily just look at the bottom half, in the Relationship view, which allows you to see all the events that each entity was involved with – and even how old the given Entity was, if you want. The developer promises to make this view more intuitive in future versions. And, if I were ever to make my own WordPress blog site (i.e. not use wordpress.com), I could export the timelines in SIMILE format and post interactive versions online.

Relationships between entities and events

So that’s how I spent my Thanksgiving week, when not eating turkey, that is.

Buddy, can you spare a scribe?

Interesting NY Times story on the increasing use of scribes by physicians – you know, those who claim to be “doctors.”

Three weeks of training gets you a scribe that follows you around with a laptop in hand and takes notes on your interactions with patients, with the scribe company charging $25 per hour ($8-$16 for the scribe). Sounds like something academics could use: there don’t seem to be nearly enough research assistants floating around. Only problem: that going rate is a bit high.

“Tell me where it hurts.”

Apparently all the computerization is one of the biggest complaints among physicians. A money quote from the article: “recent article in the journal Health Affairs concluded that two-thirds of a primary care physician’s day was spent on clerical work that could be done by someone else; among the recommended solutions was the hiring of scribes.”

From one doctor to another, I hear ya. Though History must be more challenging, because I’ve had limited success getting some of my department’s past office workers to do much more than photocopy.

Computerized medical records were supposed to make everything efficient, but I guess they forgot the lowly data-entry clerk. I didn’t. So now we’re going back to the days when secretaries actually did typing for doctors, at least the medical kind. Funny how technology sometimes takes you in circles.

Keeping tabs on the discipline

The new issue of the Journal of Military History is out. A year ago, I decided to switch from the print version to the digital online. Unfortunately I didn’t realize that I wouldn’t receive any kind of email reminder to check when the new issue was up, and I’d keep forgetting my password and have to search for it in my email. Yet another reason to use that repeating reminder on your calendar app – or at least look and see if the journal’s publisher has an email alert setup. (Speaking of reminders, sometime I’ll do a post on my Pocket Informant setup – it won’t be useful for those who are quite happy with their calendar/task system, but it might be of interest to others.)

So I searched my way to the SMH webpage, and was pleased to find a webpage for each issue back to 2007. A few disjointed thoughts came to mind, all revolving around the question of how we should be ‘doing’ history in this digital age.

1. These webpages conveniently include the titles and abstracts of all the articles, including the articles on modern military history that I don’t read, as well as a list of the books under review. The abstracts in and of themselves are a significant advance, since the JMilH, and History generally, was surprisingly late to the abstract party. It used to be that most history journals didn’t print the abstracts along with the articles – they were only to be found in abstracting services like Historical Abstracts and America: History and Life (subscription only, of course). The JMilH only started including abstracts within the past several years I think – at least I recall I had to write one for my 2000 article, but it wasn’t published along with the article.

I can’t say why the historical discipline was so late to appreciate abstracts, unless it was seen as smelling too much of the sciences, natural and social. (Published abstracts also raise the question of what the point is in assigning students to abstract journal articles if the author already created one, though that wouldn’t be the only way we should be changing how undergraduate history is taught. But I digress…)

2. Such abstracts have a broader effect beyond simply identifying the argument of each article. I’m far more likely to read a 75-100 word abstract of an article on World War II than read the 25-page article; presumably modern military historians feel the same way about pre-modern topics, Europeanists about Asianists, and so on. I’m that much more likely to read them if all of the abstracts are on a single page, so I don’t have to page through the journal issue to the first page of every article. It’s thus much easier to see connections (or lack thereof) between different periods and places, without having to wait for the occasional historiographical article to be published. Now if only people would start publishing their ideas in argument maps.

3. What the provision of these pages also means, of course, is that you can easily import them into a DTPO database, making their full text available for any searches you might perform. Again, ease of use (ease of reading, ease of copying) makes a huge difference for items that are of marginal importance – probably why Zotero libraries (one-click and it’s downloaded) tend to be much larger than bibliographic databases where you have to enter all the information in manually.

4. You can also get a quick glance at what the (sub)field is interested in with this info – we’re finally starting to get easy access to our disciplinary information, rather than having it locked behind subscription databases like EBSCO, JSTOR, etc. I’ll post the word cloud to the SMHBLOG for those interested.

5. One of the articles in this issue is Jon Sumida, “A Concordance of Selected Subjects in Carl von Clausewitz’s On War,” The Journal of Military History, 78:1 (January 2014): 271-331. Its abstract:

This concordance of the standard English translation of Carl von Clausewitz’s On War by Michael Howard and Peter Paret breaks new ground in two important respects. First, it indexes the text in unprecedented detail by listing references to every significant proposition and distinctive phrase under major subject headings. Second, information about the location of indexed items includes the book and chapter of On War, and page numbers in both current editions of the standard translation.

I don’t have access to the issue yet, but it would be interesting to compare Sumida’s results with the original index in Howard/Paret – is Sumida’s article an indictment of the original? It would also be interesting to compare Sumida’s article with what one could uncover just taking the full text and using various forms of text analysis – how much effort and specialist expertise was required to add that value, vs. what you can get from basic text mining? Perhaps Sumida even addresses this issue. The article also reminds me of a somewhat similar effort several years back, John Lynn’s “The Treatment of Military Subjects in Diderot’s Encyclopédie.” Which in turn prompts me to wonder to what extent such efforts will be needed once we have the full text of the documents directly available to us? Will reference works like concordances soon become irrelevant? Isn’t this yet another reason why we should have these sources in full text, so we can perform the same analysis on any number of sources?
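
As a point of comparison for the text-mining question above, the most basic machine-made concordance is a keyword-in-context (KWIC) listing. The Python sketch below is only that: the function name is hypothetical, the sample sentences are a loose paraphrase written for the example (not a quotation from the Howard/Paret translation), and it does none of the subject-level judgment a hand-built concordance involves.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, matched word, right context) for each hit.

    Matching is case-insensitive on whole words; `width` is the number
    of words of context kept on each side.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Paraphrased sample text, for illustration only.
sample = ("War is the continuation of policy by other means. "
          "Friction makes war in practice differ from war on paper.")

for left, word, right in kwic(sample, "war"):
    print(f"{left:>25} | {word} | {right}")
```

A listing like this finds every occurrence of a word, but it cannot group passages that discuss the same proposition in different words, which is presumably where the specialist's added value lies.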

So read up.

Download ’em while you got ’em

Those in the US know already about the federal government shutdown. Academic denizens of the Internet also know that this has led to the shuttering not only of the DC Zoo’s Panda Cam, but also of the Library of Congress’ website, among various other federal research facilities and websites. It sucks for people traveling to DC for the archives – though for European researchers, they now know what it feels like to be in a foreign country with limited time and budget and all of a sudden the mass transit workers/government employees go on strike. Or like trying to buy a bunch of microfilm and being told non. Just sayin’.

Inside Higher Ed story.

Do you need another reminder to download every source while it’s still available to you? I hope not.

But downloading everything and categorizing it takes time that you don’t always have. To give an example of how lazy I’ve become, I am increasingly taking screenshots of search results in Google Books or ECCO: one-off searches that aren’t of critical importance, but might be more useful in the future. So I download a PDF of the work and just drag them both into DTPO. Makes me feel kind of dirty and it doesn’t take advantage of full text, but at least DT allows me to keep the info together. For example:

Screenshot in DTPO