Archive Lessons Learned
Back from my month in Paris, almost twenty years after my last trip to the SHD war archives at Vincennes. Over the past month I spent 17 days in the archives – four potential research days were wasted because I overestimated how long it would take me to go through the volumes – damn me and my efficiency! Those interested in what I learned (and relearned) can read the Little and Big Pictures below.
The Little Picture
Before the trip, I had organized Devonthink so that it would be easy to keep track of which volumes I needed, which I’d looked at, etc. I’d already set up separate groups (tags, technically) for each volume, and had parsed the OCRed inventory description of each (as described in an earlier post) as well. I’d also gone through and highlighted (set label color) for the volumes I’d ordered, and also for specific documents that I wanted to look at. If I knew how many pieces (or folios) were in a volume, I made placeholder RTF records for each individual document. [Process: in Excel I used autofill to create a separate row for each document number (e.g. 1706 A1 1834 #1 for row 1, 1706 A1 1834 #2 for row 2… – technically, I made a separate column for the document number and concatenated and then Paste Special as Value). I saved the resulting list as a text file, dragged it into the appropriate DTPO tag, then converted it to RTF, and then ran the Applescript macro Explode by Paragraph, which split each row into its own RTF file, placing the title in the Spotlight Comment and the main body… All this took a few minutes per volume.]
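For anyone who would rather skip the Excel autofill step, the placeholder list can also be generated with a few lines of script. This is only a minimal sketch, not the method described above: the function names and the output file name are hypothetical, and it assumes each title follows the "year series volume #piece" pattern from the example (1706 A1 1834 #1).

```python
def placeholder_titles(year, series, volume, piece_count):
    """Return one title per document, e.g. '1706 A1 1834 #1'.

    Assumes pieces are numbered consecutively from 1 to piece_count.
    """
    return [f"{year} {series} {volume} #{n}" for n in range(1, piece_count + 1)]


def write_placeholder_list(path, titles):
    """Write one title per line, ready to be dragged into a tag and split by paragraph."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(titles) + "\n")


if __name__ == "__main__":
    # Hypothetical volume with 5 pieces; the real count comes from the inventory.
    titles = placeholder_titles(1706, "A1", 1834, 5)
    write_placeholder_list("1706_A1_1834.txt", titles)
    print(titles[0], "...", titles[-1])
```

The resulting text file plays the same role as the list saved from Excel: one line per document, which can then be converted to RTF and split up as before.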
Then I marked (labeled with a color) any specific documents that I wanted to focus on, even describing them in the file name, based on the table of contents at the beginning of each volume if I had photos of that. [This was after checking published sources to make sure that a lot of the documents hadn’t already been transcribed.]
This, then, gave me an indication of which documents in a volume to photograph, or whether to just photograph the whole volume (which I only did for a few, mostly because it gets really boring, and takes up an entire day). On the left, you can see the old-school master list of volumes I used as well.
I also created a couple of smart groups so I can keep track of which documents I still want to look at in the future (To Get), as well as those which I have, but now need to transcribe. I’d usually have both of these windows open at the same time, so I could jump back and forth as needed. Note as well the easy-to-access Favorites group in the far left pane.
(Confession: By the last week, I just wrote down the document numbers to photograph on a pad of paper for a few of the volumes.)
Thus armed, I went into the archives with camera (extra battery), laptop and pad of paper in tow. Overall, I managed to look at 22 volumes (one was missing), and took 12,000 photos. This averages out to about 700 pictures per day, though in the future I’ll try to spread the distribution out more evenly. My peak photo days included one day of 1300 photos, and the last day I somehow managed 1900 exposures (reminder to self: always top off your battery, even if you don’t think you need to!). My camera: a 14.1 megapixel Panasonic DMC-FH20 with a small sensor (1/2.33″ CCD); the two SD cards (one 8 GB, the other 2 GB) are rated at 20 MB/s. I shot moderately-sized 3 MP JPGs at ISO 800 with anti-shake on, using one of the macro settings (forced flash off, of course). Each image was about 1.5 MB in size, and after each volume was completed I’d copy the images from the SD card to my MacBook Air, into a separate folder for each volume.
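For the curious, here is the back-of-the-envelope arithmetic behind those numbers. It is a rough sketch only: it assumes decimal units (1 GB = 1000 MB), treats 1.5 MB as the size of every image, and ignores file system overhead on the cards. The function names are my own.

```python
TOTAL_PHOTOS = 12_000
ARCHIVE_DAYS = 17
MB_PER_PHOTO = 1.5          # approximate size of one 3 MP JPG
CARD_SIZES_GB = (8, 2)      # the two SD cards


def photos_per_day(total_photos, days):
    """Average number of photos taken per archive day."""
    return total_photos / days


def total_size_gb(total_photos, mb_per_photo):
    """Total storage needed, in decimal gigabytes (assumes 1 GB = 1000 MB)."""
    return total_photos * mb_per_photo / 1000


def photos_per_card(card_gb, mb_per_photo):
    """Whole photos that fit on one card, ignoring file system overhead."""
    return int(card_gb * 1000 / mb_per_photo)


if __name__ == "__main__":
    print(f"Average per day: {photos_per_day(TOTAL_PHOTOS, ARCHIVE_DAYS):.0f}")
    print(f"Total size: {total_size_gb(TOTAL_PHOTOS, MB_PER_PHOTO):.1f} GB")
    for card in CARD_SIZES_GB:
        print(f"{card} GB card holds about {photos_per_card(card, MB_PER_PHOTO)} photos")
```

The whole trip comes to roughly 18 GB, more than the two cards hold together, which is one more reason to copy each volume to the laptop as soon as it is finished.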
The physical photographing process consisted of first zooming the microfilm lens in on the image to make it as large as possible (while making sure that you wouldn’t have to waste time zooming back out for a larger document). Then framing the image in the LCD screen while bracing against the side of the microfilm reader – manually raising or lowering the camera depending on how large the original page was (rather than messing with the microfilm reader’s zoom and focus for each image), and trying (unsuccessfully) to avoid the big glaring spot from the projector bulb. Then I auto-focused on the microfilm image, and waited several additional seconds for the exposure and the JPG to save to the SD card. Advance to the next image (one page per photo so I can zoom in as needed, and otherwise keep the zoom setting at the same level when paging from image to image). Rinse. Repeat. Always repeat.
I probably could have taken even more photos overall, but with my 5-year-old point-and-shoot camera I only managed at most 6-7 shots per minute. Some other researchers seemed to be much faster: they tended to shoot either with an SLR or their smartphone (mine was filled with music files), and were usually sitting down, while I was standing up to take shots from above. I tried to shift my awkward posture every now and again – a swiveling LCD screen would’ve helped. Conceivably I could have alternated where I positioned the image on the screen, sometimes on the left side of the reader screen and other times on the right. But the light from the windows and the overhead lights was always brighter on the right side of the screen, plus I had a wall on the left that I could lean my shoulder against to steady the camera.
According to my camera-savvy wife, my results were mediocre not only because I lacked the latest equipment, but also because my microfilm (unlike that of the others around me) was negative instead of positive, which meant that the image was relatively dark, so the shutter had to stay open longer to collect enough light for an image. And, of course, the longer the shutter is open, the more chance for blur (no tripods allowed). Hence my photos were rarely as clear as the microfilm image I was looking at, much less what an original document would look like. To compare two images, the first from the only original document I looked at, and then from one of the microfilm images:
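To put that shooting rate in perspective, here is a quick estimate of how much pure camera time the two peak days represent. It is a sketch under a generous assumption: that the peak rate of 6-7 shots per minute could be held all day, with no time lost to changing reels, reading, or resting. The function name is my own.

```python
def shooting_hours(photo_count, shots_per_minute):
    """Hours of continuous shooting needed at a constant rate."""
    return photo_count / shots_per_minute / 60


if __name__ == "__main__":
    for photos in (1300, 1900):
        fastest = shooting_hours(photos, 7)   # best case: 7 shots per minute
        slowest = shooting_hours(photos, 6)   # 6 shots per minute
        print(f"{photos} photos: {fastest:.1f} to {slowest:.1f} hours")
```

Even at the best rate, the 1900-photo day works out to more than four and a half hours of nothing but pressing the shutter.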
And then a negative microfilm sample:
Both are readable, but the microfilm photos require a bit more effort, which is only compounded with the challenging handwriting of some authors – I’m looking at you two, Chamlay and Chamillart!
I’ve just about finished rotating the images – I automated almost all of it by creating a Photoshop Action to batch rotate. And it turns out it’s a good thing I didn’t alternate the position of my shots after all. Since almost all of the images were taken from the left with the camera held in the same orientation, I can just set the Action to rotate each image +90° and then change the few incorrect ones when I come across them. I probably won’t take the time to further clean up the images (crop, invert, sharpen…) except on an individual basis as needed, since the image sizes and margins vary from photo to photo, i.e. there’s no way to determine which edits are needed without looking at each of the 12,000 images.
After I combine the volume photos into PDF files (possibly parsed by month), the result is:
Import the PDF into the appropriate DTPO source tag, add the appropriate metadata, use the ‘New RTF from selected text with link back to PDF page’ Applescript to create linked transcriptions with topical group tags, and Bob’s your uncle.
The Big Picture
More generally, in the process of researching at SHD I relearned the broader lesson that there are the official rules, then there are the unofficial rules. Depending on the institution, national culture, and individuals, the gap between the two may be wider or narrower. The British Library and the Bibliothèque nationale, for example, have figured out how to monetize and streamline their document delivery systems. Not as much for a smaller archive like the SHD. So the biggest take-aways were things that one already knows if they’ve studied (or worked) in a bureaucracy, but can be surprising (even frustrating) to those who’ve acclimated to the anonymous, hyper-efficient, profit-driven and customer-centric digital culture increasingly prevalent in the Anglo world:
- Since the SHD is on a military base, which also happens to be a popular tourist location (the donjon right next to the 19th century Pavillon du Roi was the royal palace back in the 14th century and later imprisoned people like the Marquis de Sade), you need to figure out which “Interdit au public” signs to obey and which to ignore. Especially now that France is vigilant for pirates (Vigipirate!), you are greeted by security at the entrance gate, who require you to show your reader’s card or an email about the archives for entrance. Then walk past the donjon and venture through the arch (far left in photo) that says “Interdit au public” – because that sign is intended for the touristing public, not the archiving public – and head to the Pavillon du Roi, where you once again obey all the Interdit signs. It is a military base (sort of), after all.
- It really helps to know people, people on the “inside” in particular. So, hypothetically speaking, if you’ve had trouble getting microfilm ordered in the past, it’s amazing how fast the process can go if you know somebody to help it along – a “no” answer that may have taken a month or more in the past can be easily answered “yes” within a week if you have an intercessor. So looks like I’ll be getting some microfilm after all. I’m told it apparently is even more helpful in a military-run archive if you have high-ranking foreign military officers writing on your behalf. And providing gifts and taking people out to lunch/coffee isn’t unheard of either.
- Explicit rules might only be guidelines, or have important caveats, though the precise internal workings will probably remain enigmatic to the “end-user” (I’ll avoid the customer metaphor). Sometimes you might get requested volumes earlier than the expected date, sometimes not. The explicit 5 volumes per day limit, I was informed during my last week there, only applies to the original documents; there is no theoretical limit to the number of microfilm volumes you can order each day. Less important examples: the staff didn’t seem to care that some researchers kept their camera <beep> sound turned on (despite the website clearly telling them to turn it off), or that one inconsiderate researcher even took a phone call. (And if you’re easily distracted, you won’t like the microfilm readers being located in the library section, which also hosts occasional staff chit-chat and reference questions being fielded.)
- Some information that would help the end-user still requires personal contact. Knowing that there is no set limit to the number of microfilm volumes you can order in a day is very helpful – especially if you spread out your requests across every day, rather than group them like I did. But only if you know which volumes are microfilm when you order them several weeks in advance. How do you find that information out? Is there some convenient place on the website, or some finding aid that will tell you? No, you have to ask. As a helpful staff member replied when I queried how one was to know which volumes were which:
“Why that’s easy! Which sous-série?”
“Oh, they’re practically all on microfilm, probably 95% or more.”
So now we all know.
Other minor, procedural matters worth keeping in mind:
- Ordering a second time. Pay attention to the confirmation email about each order. If the email says that some microfilm documents are available “sans réservation préalable,” this really means that when you first go in you need to fill out a paper form with the président de salle before you can pick them up. The email doesn’t say which volumes require that form, so I just included all of the volumes for that day. It might take just a few minutes of waiting before the volumes are ready, or it might take half-an-hour or more. After you’ve done that once, you can then place the remaining items on reserve for the next day as you normally would. [Though now I wonder if that email means you don’t need to wait 2 weeks to order those volumes? Hmmm….]
- Placeholding. If you think the microfilm machines will be in demand, you might want to place some of your materials at one of the machines before you queue up at the guichet window to get your items. This might be particularly useful if you need to wait for your items.
- Keep whites separate from colors. As is the case just about everywhere, you need to look at any original documents you’ve ordered first, and only when you’re done with those can you then look at the microfilm.
- Give me a break! With the exception of one day, I generally skipped lunch and just had one or two short 5-10 minute breaks to snack on something near the locker area. (Sidenote: the bathrooms have a fancy Dyson Blade hand-drier; too bad they don’t have a microfilm scanner. Second side note: if you find yourself accumulating lots of coins, the vending machines are a good way to get rid of your excess change, 5 cent(ime) pieces and up.) What one is to do with one’s archive materials while on break (or “pause”) wasn’t totally clear, so I just left my microfilm at my microfilm desk (with one on the reel to make it clear it was in use) and left a pad of paper with my place/seat card there (the seat number card will set off the alarm if you try to leave with it in hand). This seemed to work fine: even when I was at lunch for 90 minutes, there wasn’t a problem when I came back. I did, however, always take my MacBook Air and camera with me, rather than trust my fellow archive rats. Some of the desk staff would want to check your transparent plastic bag every time you left; others didn’t care. Whether you were supposed to wait to get checked out when running to the little historian’s room was another one of those procedures that seemed to vary depending on who was sitting at the desk. Generally, once they know you, the rules seem to relax a bit.
I’m not sure if you’d need to return the original documents (vs. microfilm) to the guichet window when you pause or not. Yet another one of those questions that there don’t seem to be solid answers to – or the answer may vary depending on whom you ask.
As you can probably tell, my inherent preference for explicit, universally-applied rules was disappointed again and again. Though I did overhear an amusing argument between the président de salle and a researcher over how the researcher was to fill the form out in pen when pens weren’t allowed in the reading room.
- Backup, backup, backup. It was really nice to have wifi in my flat, which let me not only email and surf the web (and even listen to a few podcasts), but also allowed me to upload each day’s photos to Dropbox. Of course uploading hundreds of photos took hours and hours, but I didn’t have to do any of the lifting.
So that’s what I did last month. Now I just need to read through all 12,000 photos. And a few others.