The theory behind historical visualizations

A comment by a reader on a previous post (contrasting my Ottoman timechart with Minard’s famous map of Russia 1812) merits further discussion. Warning: theoretical discussion of the visual display of (not just) quantitative information follows.

Napoleon's march into and out of Russia, 1812 (Minard)

Minard’s map is considered successful because it makes a *very* simple and focused argument using high-information variables. Let me explain.

The map only includes two explicit variables: spatial location and army size (three if you throw in his temperature graph). It provides only a general sense of time, not delineated in any way: we don’t know how long Nappy’s army was at any one place, e.g. did they take the same amount of time to march to Moscow as back? Can’t tell from the map. Thus I really wouldn’t call time an explicit variable. Frankly, any visualization of two variables should be easy to read.

The Minard map is also easy to understand because its two main variables are the most easily visualized (i.e. graphable) type of data: ratio data of numerical quantity. Size, in number of men, is a continuous variable that can be graphed along a line (or by width of a line in this case). The same is true for geographical location (think in terms of latitude-longitude if you like) – a route (line) drawn on a map. Ratio data is easily graphed because each value has a specific numerical ratio to every other possible value: 10,000 men are half the size of 20,000 men, and five times as large as 2,000 men. Thus you can conclude that a segment on a line that is twice as long (or wide) as another segment of line is twice the value. Temperature is also continuous, providing a third, easy-to-read, visualization.
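The ratio property described above can be made concrete with a small sketch. It assumes a linear scale through zero; the `width_for` helper and the scale of one millimetre of line width per 10,000 men are chosen for illustration and are not measured from the map.

```python
def width_for(men: int, men_per_mm: float = 10_000) -> float:
    """Line width in millimetres, proportional to army size (hypothetical scale)."""
    return men / men_per_mm


sizes = [2_000, 10_000, 20_000]
widths = {n: width_for(n) for n in sizes}

# Ratios between widths equal ratios between the underlying troop counts.
ratio_20k_to_10k = widths[20_000] / widths[10_000]
ratio_10k_to_2k = widths[10_000] / widths[2_000]

for n, w in widths.items():
    print(f"{n:>6} men -> {w:.1f} mm")
```

Because the scale passes through zero, a reader can compare widths directly: a band twice as wide stands for twice as many men. That is exactly what nominal data (discussed further down) cannot offer.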

Even more, the map is constructed to be simple because it is intentionally constrained to one campaign, one theater, one single army. It doesn’t pretend to show what happened in 1811, or in Spain, or anything else. It doesn’t even tell us whether this type of attrition was normal or abnormal for a Napoleonic campaign. Minard is useful for 1812 Russia regarding the Grande Armée’s size and march route, but that’s it. And its success comes as much from the careful choice of case study as from its design: 1812 Russia is an unrepresentative example to serve as a model for army size maps. Minard can get away with showing so little data because practically every reader who views it knows much else about that campaign and can fill in the missing context. But imagine if I created a map with the exact same information (army size across space), but dealing with the 1705 campaign in Italy… How much ‘sense’ would that make for most readers, unable to fill in the missing info on the map with their own knowledge of the campaign? Not nearly as effective, because 1812 was so anomalous.

It also really helps if the trend you are illustrating follows a simple plot arc: marching there and back. Napoleon’s Grande Armée marches into Russia, losing troops along the way, makes a few minor detachments, captures Moscow, retreats from Moscow losing further troops, returning (conveniently for the designer, along a different route, so you don’t worry about the lines overlapping) with but a fraction of his original force. This is a very simple narrative, contrasted with many other military campaigns where the initiative swings back and forth, territories are won and then lost and then won again, army sizes grow and shrink according to losses and reinforcements, multiple detachments and reinforcements depart and arrive… (see my attempt to illustrate this broad territorial trend for Spain in the WSS). Much of this usually occurs on exactly the same terrain across multiple years, making it very difficult to avoid piling symbols on top of each other (hence the cartographer’s beloved callout). All that complicates a visualization. On the other hand, Minard’s goal was very narrow and he chose a very simple story to tell, so the illustration could be (somewhat) clear, precise, and still easy to interpret at a glance.

In short, Minard’s map is “effective” because it’s incredibly simple and only tells us a very basic narrative that, frankly, most viewers already knew from their past knowledge of 1812: Napoleon’s massive army was destroyed in Russia, and it was really cold. You don’t need to read the text (aside from place names, army sizes, and temperatures) because no other information is presented. Any visualization has to be effective and successful for a particular purpose. If you have a very narrow goal (show the consistently-shrinking size of a single army tracked along its march route over a single campaign in a single country), you can have a simple visualization like Minard’s.

Analytical maps vs. Reference timecharts

Making a specific argument visually is not the point of a (reference) timeline/chart – the relevant comparison is with a text timeline you find in most history textbooks. Its purpose is to serve as a reference tool, not to make a specific argument (although you may be able to do so from the data it presents). Like a general road atlas or country base map, a reference timeline/chart should include information for different questions a user might ask it – which means it includes a lot of information (i.e. each datapoint doesn’t necessarily relate to every other datapoint), and it doesn’t necessarily make an explicit argument (beyond the designer’s decision to include and exclude certain things).

I use my time charts to recreate the historical narrative across decades, as well as to refer back to (i.e. for reference) as I’m reading about the topic, etc. The timechart idea began when I started prepping for teaching about the same wars (complicated wars without a clear trend) in multiple semesters, but didn’t want to have to reread the same 100+ pages of narrative (more accurately: read through the five books that give pieces of that narrative) every semester. I then realized I could display it as a PowerPoint slide in class and narrate off of it, as well as give students a copy for their own use. I will never remember all the events on my own, so why not put them on a cheatsheet? Thus it not only provides evidence that can be used to make an argument (evidence that was difficult to extract in the first place), it also forces students to appreciate that history is about generalizing and forgetting lots of information when constructing a narrative.

All that being said, the symbolism I use is largely fixed, but the macro level is still evolving; I’d like to figure out ways to simplify these for an overview.

Ottoman Wars timechart

So the Ottoman timechart is a very different type of visualization than Minard’s chart – it has a different purpose. It doesn’t do what Minard’s does, but Minard’s map doesn’t do a lot of things that my visualization does, and frankly Minard doesn’t do some of the things that it should do (I’ll get to that in a bit).

For a start, note that my timechart covers six main geographical regions instead of one, showing the (almost) full geographical context of the Ottoman wars. To contrast this with Minard, one might ask whether anything else was going on elsewhere in Europe in 1812. If Napoleon had a million men elsewhere, for example, losing 400,000 wouldn’t have been as big a deal. Or if Napoleon could easily raise another million men, the map’s message is diminished. In other words, Minard doesn’t provide any larger context for his map. As mentioned earlier, it doesn’t need to because the viewer likely knows the context already and because that’s not its purpose. But that also means Minard has an easier task so his map should be easy to understand: it’s a simple story.

My timechart covers 42 years instead of one. One might ask of Minard, was 1812 the beginning and end of Napoleon’s reign and campaigns? Were these losses that much more significant than in other campaigns? We can find the answers elsewhere, but Minard’s map won’t help us. That’s not Minard’s fault, because his purpose was very focused on a specific argument about where (not when, not why) the Grande Armée saw changes in its size.

Since the timechart is for reference, and since it covers a far broader geographical and chronological scope, it must necessarily work at a much more general level than Minard’s map: the grand strategic over decades rather than operational level in 1812 Russia.

Another significant difference between the two is the number and types of data visualized. Note that my timechart has six variables instead of two or three:
1. Specific year.
2. General spatial location (middle columns), and specific location of events (text name of place).
3. Who the Turks are fighting (color pattern of cell, and summary column next to years).
4. Type of event (battle, siege, raid, naval, revolt, invasion, treaty…).
5. Who won each combat (orientation of battle and siege icons).
6. For sieges: who the garrison was (inside fortress color) and who the attacker was (outside fortress color).
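The six variables above can be sketched as a record type. This is only one possible way to model a timechart cell: the class and field names, the "Central Europe" region label, and the colour palette are all hypothetical and are not taken from the chart itself. Mohács (1526) and Vienna (1529) are used because they are mentioned later in the post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    # Nominal categories: they have names, but no order and no magnitude.
    BATTLE = "battle"
    SIEGE = "siege"
    RAID = "raid"
    NAVAL = "naval"
    REVOLT = "revolt"
    INVASION = "invasion"
    TREATY = "treaty"


@dataclass(frozen=True)
class TimechartEvent:
    year: int                       # 1. specific year
    region: str                     # 2. general location (chart column)
    place: str                      # 2. specific location (text label)
    opponent: str                   # 3. who the Turks are fighting
    event_type: EventType           # 4. type of event
    winner: Optional[str] = None    # 5. who won (battles and sieges)
    garrison: Optional[str] = None  # 6. sieges only: defender
    attacker: Optional[str] = None  # 6. sieges only: besieger


# Hypothetical palette: nominal values get a lookup table, not a numeric scale.
OPPONENT_COLOR = {"Hungary": "green", "Austria": "yellow"}

mohacs = TimechartEvent(1526, "Central Europe", "Mohács", "Hungary",
                        EventType.BATTLE, winner="Ottomans")
vienna = TimechartEvent(1529, "Central Europe", "Vienna", "Austria",
                        EventType.SIEGE, winner="Austria",
                        garrison="Austria", attacker="Ottomans")

for ev in (mohacs, vienna):
    print(ev.year, ev.place, ev.event_type.value, OPPONENT_COLOR[ev.opponent])
```

Only `year` is a number you could plot along an axis. Every other field is a label, which is why the chart has to fall back on colour, shape and orientation.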

Here it’s also important to understand conventions for the visual display of information – what types of information can be effectively presented with which types of symbols. (In addition to Tufte, check out Jacques Bertin, and Mark Monmonier’s Mapping It Out.) We need particularly to appreciate that almost all of the time chart data (with the exception of year) is nominal data, the least information-rich kind. Specific countries cannot be ‘graphed’ along a line or given a greater area (is Spain twice the value of Hungary?), nor can type of event (is a battle worth 5 points and a siege worth 1?). Thus linear symbols (say the width of a line or shape of a line on a graph) and area symbols cannot be used. Hence my reliance on icons’ location, color, shape, orientation… Nominal data means that there is no continuous trend to easily display and read: the data is not comparable in a quantitative sense. It’s a function of the nature of the data. This blog post has a good chart that shows what each symbol type – point, line, area – is best able to portray.

In other words, you could fit one of Minard’s maps into a single cell of my time chart, because they are operating at very different scales, with vastly different amounts (and types) of information. It’s not a surprise that my timechart narrative is much more complicated and not as easy to read.

Ottoman Wars timechart, 1505-1547

That being said, there are some things you can learn from the timechart at a glance. First, you can find out when a particular event happened (made easier if you know which theater that event took place in). That’s the main goal of any timeline/chart. And it’s certainly more effective than the common two-column timeline that lumps all events (regardless of their location) together in a single column – lots more text to read there. Plus you don’t have to do much reading, since the only text is the place names. I’d guess Minard’s map probably has the same ratio of text to info as my timechart; the difference is that he has less data and his continuous data can be simply graphed.

Let’s assume you know how to read my time chart on a macro level – it is by its nature more complicated than Minard’s, but that also means there’s more info in it (‘data dense’ as Tufte would say). You could then notice at a glance that the Turks were fighting almost every year, that this warfare became constant from the 1530s on, and that these later wars were also fought against multiple foes at the same time (all this indicated by the colored boxes in the column next to the years). For simplicity sometimes I just collapse the visualization down to three columns: year, who they are fighting against, major events in that year – it makes it easier to put on a PowerPoint slide with other images, text… You also notice on the time chart that in the North Africa column Algiers was a major prize (attacked 7 different times). You notice that there were a bunch of short campaigns against Hungary in the early 1520s, whose successes (notably Mohács) then shifted into an attack on Austria (Vienna 1529 being the most famous example). The colors of the combat icons (e.g. the outside color of the star fortress siege icons, i.e. the besieger) generally tell you who’s on the offensive: the Turks in the Med in the 1530s; the Turks in 1543 in the Balkans; the various Portuguese raids in India and later in the Red Sea. You notice that the Turks had to deal with revolts (explosions in Middle East column), and that sometimes these revolts were supported by their Persian foes (Persian crown right after the revolt symbol). You also notice that some of these revolts (mid-1520s) broke out just when the Turks were off fighting in Hungary – a coincidence? You notice how, particularly in the Mediterranean, alliance treaties (hands shaking) quickly lead to more fighting. So you can notice lots of interesting things.

There is a lot that the timechart doesn’t tell you, but it’s intended as a general reference, an overview, and its fundamental organizational metaphor is time (not space, as in a map, although it does somewhat combine the two).
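The three-column collapse mentioned above (year, opponents, major events) amounts to a group-by on year that drops the regional columns. A minimal sketch, assuming events are stored as simple tuples: the `collapse` helper is hypothetical, and the four rows are well-known events picked for illustration rather than transcribed from the chart.

```python
from collections import defaultdict

# Illustrative rows (year, opponent, place); not transcribed from the chart.
events = [
    (1529, "Austria", "Vienna"),
    (1538, "Holy League", "Preveza"),
    (1526, "Hungary", "Mohács"),
    (1538, "Portugal", "Diu"),
]


def collapse(rows):
    """Group events by year, dropping the regional columns."""
    by_year = defaultdict(lambda: (set(), []))
    for year, opponent, place in rows:
        opponents, places = by_year[year]
        opponents.add(opponent)
        places.append(place)
    return [(year, sorted(by_year[year][0]), by_year[year][1])
            for year in sorted(by_year)]


summary = collapse(events)
for year, opponents, places in summary:
    print(year, " + ".join(opponents), "|", ", ".join(places))
```

A year with more than one opponent in its middle column is the tabular equivalent of the multi-colored boxes next to the years.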

History, especially over any length of time, is complicated. And, going back to the idea of the Whig interpretation of History, it’s important that historians constantly remind their readers of the uncertainty faced by contemporaries living through the events, of its complexity, its simultaneity. (Not to mention our own uncertainty about what actually happened.) Historians talk about all this in terms of ‘problematizing’ the past. So in part my visualization strategy is a philosophical choice: since the history of these wars is not simple, the visualization shouldn’t pretend that it was. (As it is, my timechart ignores a large amount of information – it’s always about selection.) If you’ve ever had a student ask “But why couldn’t contemporaries just foresee X and adjust?”, you know the need to show why historical figures lived in their own ‘fog of history’ (to steal Clausewitz’s phrase). Simplification of history requires eliminating most of the details, and eliminates any sense of how contemporaries experienced it, which is bad for comprehension. It doesn’t help us understand what really happened, much less why it happened. Such oversimplification further encourages us to look at the present in simplistic terms, which isn’t good either. Simplicity equals ease of reading. Simplicity does not necessarily equal understanding.

So if you wanted to explicitly and efficiently visualize any one of these specific points about the Ottoman Wars, you could create a separate illustration dedicated to visualizing that point, a Minard map of 1529 if you will. But then you’d need dozens of visualizations (which I’m not going to create), each telling a small piece of the whole. This timechart is about the whole. I use other PowerPoint visualizations for narrowing points, but I’m not yet at the stage where I can/want to create such visuals.

Minard Critiques

It’s worth mentioning that there’s been much discussion, and even revisions, of Minard’s map over the years. Unfortunately Minard’s map doesn’t really do a very good job at all of explaining why the losses happened, which is odd since that’s seemingly the whole point of the map. The temperature chart implies that the winter weather was the most important feature of the campaign, yet eyeballing the line widths shows that the main losses (from 400,000 down to 100,000 men) occurred before the French even arrived at Moscow, yet our attention is drawn to the decline from 100,000 to 10,000, after the campaign is already lost. What’s more important and interesting, losing 90,000 men in the last half of an unspecified timeframe (since the map doesn’t include chronology), or losing 300,000 men in the first half?
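The arithmetic behind that question is short enough to spell out. The sketch uses only the round figures quoted above (400,000 at the start, 100,000 at Moscow, 10,000 at the end), which are eyeballed from the line widths rather than exact counts; the variable names are mine.

```python
# Round figures quoted above, eyeballed from the line widths.
start, at_moscow, at_end = 400_000, 100_000, 10_000

advance_loss = start - at_moscow    # lost on the way to Moscow
retreat_loss = at_moscow - at_end   # lost on the way back

total_loss = advance_loss + retreat_loss
advance_share = advance_loss / total_loss
retreat_share = retreat_loss / total_loss

print(f"advance: {advance_loss:,} men ({advance_share:.0%} of all losses)")
print(f"retreat: {retreat_loss:,} men ({retreat_share:.0%} of all losses)")
```

On these figures roughly three quarters of the losses fall before Moscow, which is the part of the story the temperature graph says nothing about.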

Because it is a very simple map focused on a single issue, there are many other questions not answered: why did Napoleon choose the route he did? Were the resultant losses because or in spite of Napoleon’s selection of the route? Was it terrain? Enemy fortifications? Logistical concerns? Why were those detachments made – in response to another (unidentified) threat? <Crickets chirping>

And where was the Russian army in all of this? They’re totally absent! They had nothing to do with these losses? Minard’s map is not your standard operational map which gives us some idea of why the actors did what they did, what conditions shaped their decisions. Minard’s map doesn’t even explain why the losses took on the pattern they did, except for implying that it was the cold winter in the latter half of the campaign, and even there the coldest temperatures don’t seem to match up exactly with the greatest losses. Minard’s map is fundamentally emphasizing temperature and (with line color) the difference between the pre- and post-Moscow phases. The enemy doesn’t even merit an appearance, unless it’s Moscow itself. Perhaps that’s a comforting narrative for a French cartographer and his 19C audience, but we can do better.

Minard’s simplified map tells us nothing about any of this context. It’s easy to read because it says so little, and relies on our background knowledge to create a narrative. That’s a lot of effort spent to create a map whose whole point (a very simple one at that: the Grande Armée marched in, froze, marched back out) can be digested in a glance, and then it has nothing left to offer. I realize scientists use simplification a lot, but humans are a lot messier than nature. So was the 1812 campaign.

Personally (in addition to small multiples or animation, and layers that could be turned on and off, which of course is where computer visualizations are at now), I would tweak Minard’s map to:

  1. Show more precise chronology: show the march rates, e.g. white gaps on the army lines to indicate each week passed.
  2. Show combat icons to indicate when and where exactly the various battles were fought. More fundamentally, each time the army gets smaller, what causes those losses? Is it a battle, constant skirmishing, a massive cold snap, a brutal forced march, a food shortage, unhealthy camp conditions, a disease outbreak, what exactly?
  3. Have a way to indicate scorched earth and Cossack attacks (unless these were so constant that it wouldn’t make sense to illustrate them all – but if that’s the case why did the losses slow down after Smolensk?).
  4. Include the Russian forces more generally.

In short, I want the map to visually indicate why the losses occurred when and where they did.

In Conclusion

But if anyone has a suggestion for a more (or equally) efficient way to summarize decades of conflict on such scales, I truly would be interested in seeing it. Unfortunately my sense is that you would have to either massively simplify the information you provide (a possibility for some uses), or make dozens of different, specialized visualizations, one for each point you want to make. And you still have to deal with difficult-to-visualize nominal data.

I don’t know what Tufte would say (nor would I nominate him as the arbiter of all visual information), but looking at Tufte’s reimagining of medical charts is informative in more ways than one. It shows how simplistic Minard’s map is. Not only does Tufte’s streamlined patient chart include a lot of quantified (continuous) info, but it isn’t nearly as simple to read as Minard, because it’s telling a more complicated story.

Analysis requires having data to analyze. Reference time charts help construct that basic narrative.

3 responses to “The theory behind historical visualizations”

  1. Erik Lund says :

    I would add another problem with the Minard representation. There is no way in hell that 400,000 individual, named men crossed the Niemen into Russia on 23 June 1812, and that 40,000 men of a subset of that group walked out on 14 December, leaving the remaining subset of 390,000 dead or prisoners in Russia.

    Any serious study of the campaign will undermine this simplification in radical ways. Without Schwarzenberg and MacDonald, there are not the nuclei of the Austrian and Prussian armies of 1813. Napoleon’s departure on 6 December was justified by the fig leaf of a replacement division coming up from the rear. Two more such divisions followed.

    It strikes me, then, as a mistake to say that Minard’s graphic lacks a thesis. It has a thesis, all the more powerful for being impossible to engage as an argument. (I’ve said the same with respect to representative cartographers.)
    That thesis is that it was the fault of weather and distance. The Russian role is suppressed. We also cannot see the failure of the Grand Armee as coming from the bottom up, a conclusion to which we would be forced if the graphic represented straggling to the exclusion of distance, weather and fighting.

    In the end, the choice of things to represent allows agency to be attributed to Napoleon alone. Both his soldiers and the Russians are objects to be operated on by the world-soul.

    The more general problem is that some theses are represented in modes that are very difficult to engage in an argument. Critical theory probably allows us an approach, but I think there’s a long way to go before “History: The GUI” is ready for prime time.

  2. Averrones says :

    IIRC the temperature was added to the map not at first publication but afterwards, and it was criticized for inconsistency with contemporary notes. That winter was relatively warm, and serious cold appeared mostly when the army reached Smolensk, followed by a rise in temperature. Really harsh winter came into force when the 10,000 remaining soldiers crossed the Berezina river. That’s why nowadays that map is often used only for numbers of men – to demonstrate that winter had little to do with the defeat.
