At least until the lights (or the internet) go down.
I’m preparing my appeal to you faithful skulkers to assist me in my quixotic quest to create a more robust and usable dataset on early modern European wars. I envision keeping it simple, at least at the start, posting a series of spreadsheets (possibly on Google Sheets) with information about various aspects of early modern warfare. We don’t want to start from scratch, so I’ve downloaded the basic information on the period’s wars and combats (“battles”) from Wikipedia, via Wikidata queries using SPARQL. And I’ve been learning about graph databases in the process, which someone might consider a bonus.
Wikipedia??? Well, the way I see it, they’ve already entered in a lot of basic information, and many of the factual details are probably correct, at least to a first order approximation. So it should speed up the process and allow us to refine and play around with the beta data (say that fast three times) before it’s “complete,” however that’s defined.
Once the data sheets are up online, we can clean that information, I can collate it, and then we can open it to the world to play with – analyze, map, chart, combine with other data, whatever one’s heart desires. If someone wants to deal with the Wikipedia bureaucracy, they can try to inject it back into The Source of All Knowledge.
In the meantime, if you’re curious as to what someone with some programming skills and an efficiency-oriented mindset can create, you should check out the following blog post, wherein a data scientist collects all of the wars listed in Wikipedia (Ancient to recent), and then explores their durations and a few other attributes. Very cool stuff, and you gotta love the graphics. Check it out at https://www.gokhan.io/post/scraping-wikipedia/. And just imagine what one could do with more granular data, and possibly more accurate data as well! Hopefully we’ll find out.
In the meantime, here’s a real simple map from a SPARQL query locating all of the “battles” listed in Wikidata (that have location information).
I’ll let you decide whether Europe and the eastern US really were that much more belligerent than the rest of the world. To the Methodology!
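For the curious, here is a minimal sketch in Python of the kind of Wikidata query that could sit behind a map like that one. It is not necessarily the exact query I ran. The assumptions: that Q178561 is the Wikidata item for "battle" (check before relying on it), that P31/P279/P625 are the usual instance-of/subclass-of/coordinate-location properties, and that the function names and the User-Agent string are made up for illustration. Nothing touches the network unless you call `fetch_battles` yourself.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_battle_query(limit=100):
    """Return a SPARQL query for battles that have coordinates.

    Assumption: wd:Q178561 is the item for "battle".
    """
    return f"""
SELECT ?battle ?battleLabel ?coord WHERE {{
  ?battle wdt:P31/wdt:P279* wd:Q178561 ;
          wdt:P625 ?coord .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def parse_wkt_point(wkt):
    """Turn a WKT literal like 'Point(12.49 41.90)' into (lat, lon).

    Wikidata writes longitude first, so the order is swapped here.
    """
    m = re.match(
        r"Point\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)",
        wkt,
        re.IGNORECASE,
    )
    if m is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    lon, lat = float(m.group(1)), float(m.group(2))
    return lat, lon


def fetch_battles(limit=100):
    """Run the query and return (label, lat, lon) tuples. Needs internet."""
    params = urllib.parse.urlencode(
        {"query": build_battle_query(limit), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": "ememh-sketch/0.1 (hypothetical example)"},
    )
    with urllib.request.urlopen(request) as response:
        rows = json.load(response)["results"]["bindings"]
    return [
        (row["battleLabel"]["value"], *parse_wkt_point(row["coord"]["value"]))
        for row in rows
    ]
```

The resulting label/latitude/longitude rows can be pasted straight into a spreadsheet, which is all the "dataset" needs to be at this stage.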
Where I’m at now, after reading more on GIS, historical and Quantum. Here we have the beginnings of my Low Countries theater map, for operational military history.
Features include rivers, the (modern) coastline, capital cities, fortifications (fortresses and forts) by side of garrison, a light tracing of the pré carré fortresses in northern France, and, for kicks, the woods of northern Belgium traced from the Austrian Ferraris maps, c. 1770s.
And more to trace, e.g. from the Pelet 1837 atlas:
Still lots of work to do, cleaning things up and adding additional features, like army marches and camps. Eventually, I’ll even work up to Print Composer and stop taking screenshots.
But in the meantime, progress moves forward.
So let’s say you’ve become obsessed with GIS (geographical information systems). And let’s also posit that you’re at a teaching institution, where you rotate teaching your twelve different courses plus senior seminars (three to four sections per semester) over multiple years, which makes it difficult to remember the ins-and-outs of all those historical narratives of European history from the 14th century (the Crusades, actually) up through Napoleon – let’s ignore the Western Civ since 1500 courses for now. And let’s further grant that you are particularly interested in early modern European military history, yet can only teach it every other year or so.
So what’s our hypothetical professor at a regional, undergraduate, public university to do? How can this professor possibly try to keep these various periods, places and topics straight, without burdening his (errr, I mean “one’s”) students with one damned fact after another? How to keep the view of the forest in mind, without getting lost among the tree trunks? More selfishly, how can one avoid spending way too much prep time rereading the same narrative accounts every few years?
Why, visualize, of course! I’ve posted various examples before (check out the graphics tag), but now that GIS makes large-scale mapping feasible (trust me, you don’t want to manually place every feature on a map in Adobe Illustrator), things are starting to fall in place. And, in the process, I – oops, I mean our hypothetical professor – ends up wondering what historical research should look like going forward, and what we should be teaching our students.
I’ll break my thoughts into two posts: first, the gritty details of mapping the Italian Wars in GIS (QGIS, to be precise); and then a second post on collecting the data for all this.
So let’s start with the eye-candy first – and focus our attention on a subject just covered in my European Warfare class: the Italian Wars of the early 16th century (aka Wars of Italy). I’ve already posted my souped-up timechart of the Italian Wars, but just to be redundant:
That’s great and all, but it really requires you to already have the geography in your head. And, I suppose, even to know what all those little icons mean.
Maps, though, actually show the space, and by extension the spatial relationships. If you use PowerPoint or other slides in your classes, hopefully you’re not reduced to re-using a map you’d digitized in AutoCAD twenty years earlier, covering a few centuries in the future:
Instead, you’ve undoubtedly found pre-made maps of the period/place online – either from textbooks, or from other historians’ works – Google Images is your friend. You could incorporate raster maps that you happen across:
Maybe you found some decent maps with more political detail:
Maybe you are lucky enough that part of your subject matter has been deemed important enough to merit its own custom map, like this digitized version of that old West Point historical atlas:
If you’re a bit more digitally-focused, you probably noticed a while back that Wikipedia editors have started posting vector-based maps, allowing you to open them in a program like Adobe Illustrator and then modify them yourself, choosing different fills and line styles, maybe even adding a few new features:
Now we’re getting somewhere!
But, ultimately, you realize that you really want to be your own boss. And you have far more questions than what your bare-bones map(s) can answer. Don’t get me wrong – you certainly appreciate those historical atlases that illustrate Renaissance Italy in its myriad economic, cultural and political aspects. And you also appreciate the potential of the vector-based (Adobe Illustrator) approach, which allows you to add symbols and styling of your own. You can even search for text labels. Yet they’re just not enough. Because you’re stuck with that map’s projection. Maybe you’re stuck with a map in a foreign language – ok for you, but maybe a bit confusing for your students. And what if you want to remove distracting features from a pre-existing map? What if you care about what happened after Charles VIII occupied Naples in early 1495? What if you want to significantly alter the drawn borders, or add new features? What if you want to add a LOT of new features? There are no geospatial coordinates in the vector maps that would allow you to accurately draw Charles VIII’s 1494-95 march down to Naples, except by scanning in another map with the route, twisting the image to match the vector map’s boundaries, and then eye-balling it. Or what if you want to locate where all of the sieges occurred, the dozens of sieges? You could, as some have done, add some basic features to Google Maps or Google Earth Pro, but you’re still stuck with the basemap provided, and, importantly, Google’s (or Microsoft’s, or whoever’s) willingness to continue their service in its current, open, form. The Graveyard of Digital History, so very young!, is already littered with great online tools that were born and then either died within a few short years, or slowly became obsolete and unusable as internet technology passed them by. 
Those online tools that survive for more than five years often do so by transforming into a proprietary, fee-based service, or by getting swallowed up by one of the big boys. And what if you want to conduct actual spatial analysis, looking for geospatial patterns among your data? Enter GIS.
So here’s my first draft of a map visualizing the major military operations in the Italian peninsula during the Italian Wars. Or, more accurately, locating and classifying (some of) the major combat operations from 1494 to 1530:
Pretty cool, if you ask me. And it’s just the beginning.
How did I do it? Well, the sausage-making process is a lot uglier than the final product. But we must have sausage. Henry V made the connection between war and sausage quite clear: “War without fire is like sausages without mustard.”
So to the technical details, for those who already understand the basics of GIS (QGIS in this case). If you don’t know anything about GIS, there are one or two websites on the subject.
- I’m using Euratlas’ 1500 boundaries shapefile, but I had to modify some of the owner attributes and alter the boundaries back to 1494, since things can change quickly, even in History. In 1500, the year Euratlas chose to trace the historical boundaries, France was technically ruling Milan and Naples. But, if you know your History, you know that this was a very recent change, and you also know that it didn’t last long, as Spain would come to dominate the peninsula sooner rather than later. So that requires some work fixing the boundaries to start at the beginning of the war in 1494. I should probably have shifted the borders from 1500 back to 1494 using a different technique (ideally in a SpatiaLite database where you could relate the sovereign_state table to the 2nd_level_divisions table), but I ended up doing it manually: merging some polygons, splitting other multi-polygons into single polygons, modifying existing polygons, and clipping yet other polygons. Unfortunately, these boundaries changed often enough that I foresee a lot of polygon modifications in my future…
- Notice my rotation of the Italian boot to a reclining angle – gotta mess with people’s conventional expectations. (Still haven’t played around with Print Composer yet, which would allow me to add a compass rose.) More important than being a cool rebel who blows people’s cartographic preconceptions, I think this non-standard orientation offers a couple of advantages. First, it allows you to zoom in a bit more, to fit the length of the boot along the width rather than height of the page. More subtly, it also reminds the reader that the Po river drains ‘down’ through Venice into the Adriatic. I’m sure I’m not the only one who has to explicitly remind myself that all those northern European rivers aren’t really flowing uphill into the Baltic. (You’re on your own to remember that the Tiber flows down into the Tyrrhenian Sea.) George “Mr. Metaphor” Lakoff would be proud.
- I converted all the layers to the Albers equal-area conic projection centered on Europe, for valid area calculations. In case you don’t know what I’m talking about, I’ll zoom out, and add graticules and Tissot’s indicatrices, which illustrate the nature of the projection’s distortions of shape, area and distance as you move away from the European center (i.e. the main focus of the projection):
And in case you wanted my opinion, projections are really annoying to work with. But there’s still room for improvement here: if I could get SpatiaLite to work in QGIS (damn shapefiles saved as SpatiaLite layers won’t retain the geometry), I would be able to re-project layers on the fly with a SQL statement, rather than saving them as separate shapefiles.
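If you want to see what the projection is actually doing to your coordinates, here is a rough sketch of the Albers equal-area conic forward formulas in plain Python. Loud assumptions: it uses the simpler spherical form rather than the ellipsoidal one a GIS would use, and the default parameters (central meridian 10°E, latitude of origin 30°N, standard parallels 43°N and 62°N) are the ones commonly quoted for a "Europe Albers" definition, not necessarily the exact settings in my project. The function name is made up.

```python
import math


def albers_equal_area(lat, lon, lat0=30.0, lon0=10.0,
                      lat1=43.0, lat2=62.0, radius=6371008.8):
    """Project (lat, lon) in degrees to (x, y) in metres.

    Spherical Albers equal-area conic; all defaults are assumptions.
    """
    phi, lam = math.radians(lat), math.radians(lon)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)

    # Cone constant and helper term from the two standard parallels.
    n = (math.sin(phi1) + math.sin(phi2)) / 2.0
    c = math.cos(phi1) ** 2 + 2.0 * n * math.sin(phi1)

    # Distance from the cone's apex for the point and for the origin.
    rho = radius * math.sqrt(c - 2.0 * n * math.sin(phi)) / n
    rho0 = radius * math.sqrt(c - 2.0 * n * math.sin(phi0)) / n

    # Angle away from the central meridian, squeezed by the cone constant.
    theta = n * (lam - lam0)

    x = rho * math.sin(theta)
    y = rho0 - rho * math.cos(theta)
    return x, y


# Naples, very roughly, for illustration only.
x, y = albers_equal_area(40.85, 14.27)
```

Points on the central meridian get x = 0, and everything else fans out from there, which is why shapes look increasingly sheared the further you drift from the middle of the map.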
- I’m still playing around with symbology, so I went with basic shape+color symbols to distinguish battles from sieges (rule-based styling). I did a little bit of customization with the labels – offsetting the labels and adding a shadow for greater contrast. Still plenty of room for improvement here, including figuring out how to make my timechart symbols (created in Illustrator) look good in QGIS.
After discovering the battle site symbol in the tourist folder of custom markers, it could look like this, if you have it randomly-color the major states, and include the 100 French battles that David Potter mentions in his Renaissance France at War, Appendix 1, plus the major combats of the Italian Wars and Valois-Habsburg Wars listed in Wikipedia:
Boy, there were a lot of battles in Milan and Venice, though I’d guess Potter’s appendix probably includes smaller combats involving hundreds of men. Haven’t had time to check.
- I used Euratlas’ topography layers, 200m, 500m, 1000m, 2000m, and 3500m of elevation, rather than use Natural Earth’s 1:10m raster geotiff (an image file with georeferenced coordinates). I wasn’t able to properly merge them onto a single layer (so I could do a proper categorical color ramp), so I grouped the separate layers together. For the mountain elevations I used the colors in a five-step yellow-to-red color ramp suggested by ColorBrewer 2.0.
- I saved the styles of some of the layers, e.g. the topo layer colors and combat symbols, as qml files, so I can easily apply them elsewhere if I have to make changes or start over.
- You can also illustrate the alliances for each year, or when they change, whichever happens more frequently – assuming you have the time to plot all those crazy Italian machinations. If you make them semi-transparent and turn several years’ alliances on at the same time, their overlap will allow you to see which countries switched sides (I’m looking at you, Florence and Rome), vs. which were consistent:
- Plotting the march routes is also a work in progress, starting by importing the camps as geocoded points, and then using the Points2One plugin to connect them up. With this version of Charles’ march down to Naples (did you catch that south-as-down metaphor?), I only had a few camps to mark, so the routes are direct lines, which means they might display as crossing water. More waypoints will fix that, though it’d be better if you could make the march routes follow roads, assuming they did. Which, needless to say, would require a road layer.
- Not to mention applying spatial analysis to the results. And animation. And…
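To make the march-route step a bit more concrete, here is a minimal sketch in Python of the connect-the-camps idea: sort the camp points by their order on the march, join them into a single line, and measure it. This is not the Points2One plugin's code, just the same idea in miniature. The three waypoints use rough modern city coordinates as placeholders, and the `seq` field and function names are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def build_route(camps):
    """Order camps by their (hypothetical) 'seq' field; return (lat, lon) pairs."""
    ordered = sorted(camps, key=lambda camp: camp["seq"])
    return [(camp["lat"], camp["lon"]) for camp in ordered]


def route_length_km(route):
    """Sum the straight-line legs between consecutive waypoints."""
    return sum(haversine_km(p, q) for p, q in zip(route, route[1:]))


def to_wkt_linestring(route):
    """Write the route as WKT (longitude first), which QGIS can import."""
    pairs = ", ".join(f"{lon} {lat}" for lat, lon in route)
    return f"LINESTRING ({pairs})"


# Placeholder waypoints with approximate coordinates, deliberately unsorted.
camps = [
    {"name": "Naples", "seq": 3, "lat": 40.85, "lon": 14.27},
    {"name": "Florence", "seq": 1, "lat": 43.77, "lon": 11.26},
    {"name": "Rome", "seq": 2, "lat": 41.90, "lon": 12.50},
]

route = build_route(camps)
total_km = route_length_km(route)
wkt = to_wkt_linestring(route)
```

With only three waypoints the legs are dead-straight, which is exactly the crossing-water problem described above; every extra camp you add bends the line closer to the real route.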
More to come, including the exciting, wild world of data collection.
At the end of 2017, I’m able to catch my breath and reflect back on the past year. It was a digital year, among other things.
Most concretely, our History department’s Digital History Lab was finally completed. Two long years of planning and grant-writing, and almost 800 emails later, my quixotic labor of love is (almost) done! A generous anonymous donor gave us enough money to secure a room one floor above our offices, and to stock it with PCs and iMacs, a Surface Hub touch-display, scanners (including a microfilm scanner and a ScannX book scanner), and a Surface Book tablet/laptop to pass around the seminar table and project to the Surface Hub. These tools will allow our undergraduate department to use the lab for a variety of projects: digital-centric history courses and digitally-inflected courses; independent studies and tutoring; faculty projects and internships; as well as public history projects with local museums. Not to mention the Skype-enabled Hub.
In the process of designing and overseeing the lab’s construction, I’ve learned a lot about institutional paranoia and the rules it necessitates, and how the digital humanities’ love of open-source software doesn’t play well with IT’s need for locked-down systems. So the lab had to forgo many of the open-source tools used by digital historians and humanists. But I did try to provide the computers in the lab with commercial programs with similar features. The software includes:
- ABBYY FineReader for OCRing texts
- the standard Microsoft Office suite (including Access for relational databases)
- the standard Adobe Creative Suite, including Illustrator
- statistics software (SPSS and Minitab)
- EndNote (because we can’t install Zotero)
- Aeon 2 timeline software (for semi-interactive timelines like this)
- mapping software, including Google Earth Pro, ArcGIS, QGIS, Centennia and Euratlas historical digital maps, and MAPublisher to tweak geospatial data in Illustrator.
- OutWit Hub for web scraping and tagged entity extraction
- online software, such as Google Fusion Tables, Palladio, Voyant, etc.
- the machines also have Python, but I’m not sure about how easy it will be to constantly install/update new libraries and the like, given the school’s security concerns
- the department also has a subscription to Omeka, for our planned public history projects.
And there’s more to come. The anonymous donor made an additional donation which will allow us to replace that retro chalkboard with a 90″ monitor display. As well as purchase a few other software packages, and even a reference book or two. All the tools you need to do some digital history. And build a digital history curriculum for our undergraduate majors.
The DHL will be the centerpiece of our department’s new foray into digital history. Since we’re an undergraduate institution, our goals are modest. Having just taught the first iteration of my Introduction to Digital History course, it’s pretty clear that having undergraduates mess with lots of open-source package installations – much less try to learn a programming language like Python – would’ve been a nightmare (especially since I’m just learning Python myself). So our textbook, Exploring Big Historical Data, didn’t get as much use as I’d initially planned. But we did spend some time looking at the broader picture before we dove into the weeds.
And to make sure the students understood the importance of kaizen and the “There’s gotta be a better way!!!” ethic, I beat them over the head with the automation staircase:
As a result, the students were introduced to, and hopefully even learned how to use at least a few features of, the following tools:
- Adobe Acrobat automation
- Excel (don’t assume today’s college students know how to use computers beyond games and social media)
- MS Access
- OCR (ABBYY FineReader and Adobe Acrobat Pro)
- Regular expressions
- Google Sheets and ezGeocode add-in
- Google Fusion Tables
- Stanford Named Entity Recognition
- OutWit Hub
A digital smorgasbord, I realize, but I tried to give them a sampling of relational databases, text mining, and mapping. Unfortunately, we proved again and again that 60%-80% of every digital project is acquiring and cleaning the data, which meant there wasn’t as much time for analysis as I would’ve liked. And, to boot, several of the tools were extremely limited without purchasing the full version (OutWit Hub), or installing the local server version on your own computer (Stanford NER) – did I mention students had problems installing software on their own machines? But, at the least, the students were exposed to these tools, saw what they can do, and know where to look to explore further, as their interests and needs dictate. I’d call that an Introduction to Digital History.
Fortunately, I was able to play around with a few more sophisticated tools in the process, relying on the Programming Historian, among other resources:
- Vard 2 and GATE (cleaning up OCRed texts)
- MALLET topic modeling
- Gephi network software (Palladio also has some basic network graphing features)
- VOS Viewer for bibliometrics – if only JSTOR/Academic Search Premier/Historical Abstracts had the bibliometric citation datasets that Web of Science does (yes, JSTOR’s Text Analyzer is a start, but still…)
- Edinburgh geoparser
- Python (also with the help of Automate the Boring Stuff with Python).
So now I’ve at least successfully used most of the tools I see digital historians mention, and have established a foundation to build future work upon.
So, what are my resolutions for 2018?
More of the same, but applied toward EMEMH!
More digitalia – adding a few more toys to Eastern’s Digital History Lab, training the other History faculty on some of its tools (Zotero and Omeka, for starters), and practicing a bit more with GIS. And figuring out a way to efficiently clean all those 18C primary source texts I’ve got in PDFs. And, just as mind-numbing, creating shapefiles of the boundaries of early modern European states.
More militaria – I’m teaching my European Warfare, 1337-1815 course again this Spring, and will try to figure out a way to have the students’ projects contribute towards an EMEMH dataset that will eventually go online.
And did I mention a year-long sabbatical in 2018-19, so I can finish the big book of battles, and start the next project, a GIS-driven operational analysis of Louis XIV’s campaigns? Yeehaa!
So here’s to wishing your 2018 might be a bit more digital too.
So now I have to add another letter to the abbreviation – Early Modern European Military Digital Historian. We are approaching LGBTQIA territory here – except narrowing instead of broadening.
And who leads the pack in this exciting sub-sub-sub-subfield? For my money, it would be Spanish scholar Xavier Rubio-Campillo, who’s already published an article using GIS for early modern siege reconstruction (Barcelona 1714), which I highlighted here several years back.
Now he’s applying computer modeling to early modern field battle tactics, during the War of the Spanish Succession, ‘natch: “The development of new infantry tactics during the early eighteenth century: a computer simulation approach to modern military history.” To reproduce his abstract from Academia.edu:
Computational models have been extensively used in military operations research, but they are rarely seen in military history studies. The introduction of this technique has potential benefits for the study of past conflicts. This paper presents an agent-based model (ABM) designed to help understand European military tactics during the eighteenth century, in particular during the War of the Spanish Succession. We use a computer simulation to evaluate the main variables that affect infantry performance in the battlefield, according to primary sources. The results show that the choice of a particular firing system was not as important as most historians state. In particular, it cannot be the only explanation for the superiority of Allied armies. The final discussion shows how ABM can be used to interpret historical data, and explores under which conditions the hypotheses generated from the study of primary accounts could be valid.
Link at https://www.academia.edu/2474571/The_development_of_new_infantry_tactics_during_the_early_eighteenth_century_a_computer_simulation_approach_to_modern_military_history?auto=download&campaign=weekly_digest. Though it may require a subscription.
Maybe someday we military historians will collectively set our sights a little higher than tactics (note the military metaphor), and a little lower than grand strategy? Though, admittedly, that’ll require a lot of hard work at the operational level of war. And maybe even a better sense of what we call these different levels.
Seriously though. I’ve known about the concept of ‘regular expressions’ for years, but for some reason I never took the plunge. And now that I have, my mind is absolutely blown away. Remember all those months in grad school (c. 1998-2000) when I was OCRing, proofing and manually parsing thousands of letters into my Access database? Well I sure do.
Twenty years later, I now discover that I could’ve shaved literally months off that work, if only I’d adopted the regex way of manipulating text. I’ll blame it on the fact that “digital humanities” wasn’t even a thing back then – check out Google Ngram Viewer if you don’t believe me.
So let’s start at the beginning. Entry-level text editing is easy enough: you undoubtedly learned long ago that in a text program like Microsoft Word you can find all the dates in a document – say 3/15/1702 and 3/7/1703 and 7/3/1704 – using a wildcard search like 170^#, where ^# is the wildcard for any digit (number). That kind of search will return 1701 and 1702 and 1703… But you’ve also undoubtedly been annoyed when you next learn that you can’t actually modify all those dates, because the wildcard character will be replaced in your basic find-replace with a single character. So, for example, you could easily convert all the forward slashes into periods, because you simply replace every slash with a period. But you can’t turn a variety of dates (text strings, mind you, not actual date data types) from MM/DD/YYYY into YYYY.MM.DD, because you need wildcards to find all the digit variations (3/15/1702, 6/7/1703…), but you can’t keep those values found by wildcards when you try to move them into a different order. In the above example, trying to replace 170^# with 1704 will convert every year to 1704, even if it’s 1701 or 1702. So you can cycle through each year and each month, like I did, but that takes a fair amount of time as the number of texts grows. This inability to do smart find-replace is a cryin’ shame, and I’ve gnashed many a tooth over this quandary.
Enter regular expressions, aka regex or grep. I won’t bore you with the basics of regex (there’s a website or two on that), but will simply describe it as a way to search for patterns in text, not just specific characters. Not only can you find patterns in text, but with features called back references and look-aheads/look-backs (collectively: “lookarounds”), you can retain those wildcard characters and manipulate the entire text string without losing the characters found by the wildcards. It’s actually pretty easy:
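A minimal sketch of the date example in Python's built-in `re` module (any regex-capable text editor would do the same job with slightly different syntax). The pattern and function names are my own inventions for illustration. The parentheses capture the month, day and year; the replacement then puts those captured pieces back in a new order. The last lines show a look-behind, which checks what comes before a match without including it.

```python
import re

# Three capture groups: month, day, four-digit year.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def reorder_simple(text):
    """MM/DD/YYYY -> YYYY.MM.DD using plain back references (\\1, \\2, \\3)."""
    return DATE_PATTERN.sub(r"\3.\1.\2", text)


def reorder_padded(text):
    """Same reordering, but zero-pad month and day so the dates sort properly."""
    def rebuild(match):
        month, day, year = match.groups()
        return f"{year}.{int(month):02d}.{int(day):02d}"
    return DATE_PATTERN.sub(rebuild, text)


sample = "Letters of 3/15/1702 and 3/7/1703 and 7/3/1704."
simple = reorder_simple(sample)
padded = reorder_padded(sample)

# Look-behind: find four-digit years only when a slash comes right before
# them, so a stray number such as "1704 livres" is left alone.
years = re.findall(r"(?<=/)\d{4}\b", "3/15/1702 cost 1704 livres")
```

Unlike the Word wildcard search, every year keeps its own value: 1702 stays 1702 and 1703 stays 1703, they just move to the front of the string.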
Yep, it’s been a computational summer. Composed mostly of reading up on all things digital humanities. (Battle book? What battle book?) Most concretely, that’s meant setting up a modest Digital History Lab for our department (six computers, book-microfilm-photo scanners, a Microsoft Surface Hub touch display, and various software), and preparing for a brand new Intro to Digital History course, slated to kick off in a few weeks.
I’ve always been computer-curious, but it wasn’t until this summer that I fully committed to my inner nerdiness, and dove into the recent shenanigans of “digital humanities.” Primarily this meant finally committing to GIS, followed by lots of textual analysis tools, and brushing up on my database skills. But I’ve even started learning Python and a bit more AppleScript, if you can believe it.
So, in future posts, I’ll talk a little less about Devonthink and a bit more about other tools that will allow me to explore early modern European military history in a whole new way.