Digitize your history, man

Still going strong on notes, largely because I’ve got a lot of research/teaching projects I’m working on for the next month plus. But I will return to EMEMH content soon.

That being said, I might as well summarize why I decided to scan all my photocopies of secondary sources as PDFs. It’s the cool thing to do these days, at least among the whippersnappers. Of course it also helps if you have student workers, or at least a scanner/copier with an automatic document feeder.

Why PDFs? It’s not that I’m in love with PDFs, but they are the standard: it’s how JSTOR et al. deliver journal articles, and it’s the usual format for interlibrary-loaned articles and book chapters. Many digital humanists prefer plain text files as the most transparent file format, but ya gotta start somewhere, and most often that means starting with ol’ Adobe Acrobat.

For EMEMHians, it’s a bit more challenging than simply downloading journal articles from JSTOR and other websites, since there are relatively few born-digital works on the subject. A few articles scattered here and there certainly, but far more book chapters and books, neither of which tend to be digitized already. So that means you ideally have access to a new-fangled photocopier that also works as a scanner (handy because it scans at photocopier speed, rather than the slower scanning speed of your average desktop flatbed scanner). Did I mention these scanners tend to render their output as PDFs?

But is it worth it? I’d say so. The advantages of having all your secondary sources as PDFs are manifold (especially if you have Adobe Acrobat Pro):

  1. Backup. Especially if you store them off-site or in the cloud, you don’t have to worry about your work going up in smoke, or washing out to sea, or whatever regional natural disaster metaphor you can think of. It also allowed me to reclaim floor space in my home office as I moved the three big file cabinets out into the garage, my off-site storage facility if you will.
  2. Portability. Put them on Dropbox or a cloud drive and you can access your secondary sources anywhere with an Internet connection. You can also load the PDFs of your choice on an iPad, laptop, or e-reading device for couch surfing, conference travel, or if you’re one of those types who likes to hang out in coffeehouses.
  3. Full-text searchability. This was the most immediate reason why I started the scanning process during my sabbatical last year. This works best for secondary sources, and obviously requires that they be OCRed. Full-text search, baby.
  4. Future plan I. Analyze the various full texts with software: word frequencies, clouds, keywords-in-context, collocation…
  5. Future plan II. Analyze ‘networks’: who cites whom, which works and authors use which terms, which sources, etc.
  6. Future plan III. In the next post I’ll describe how all these PDFs should also make it much easier for me to take old-fashioned notes on all those secondary sources.
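
For the curious, here is a rough sketch of what Future plan I might look like in Python. It assumes the OCRed text has already been pulled out of a PDF into a plain string; the function names and the sample sentence are made up for illustration, and this is not a description of any particular tool I use. It counts word frequencies and produces a simple keyword-in-context (KWIC) listing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most common words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(top_n)


def keyword_in_context(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    width is the number of words kept on each side of the keyword.
    """
    tokens = tokenize(text)
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Hypothetical sample text standing in for an OCRed secondary source.
sample = (
    "The siege of Lille was long. A siege demands patience, "
    "and the siege train was slow."
)

print(word_frequencies(sample, top_n=3))
for left, word, right in keyword_in_context(sample, "siege", width=2):
    print(f"{left:>12} [{word}] {right}")
```

A real corpus would want stop-word filtering and some tolerance for OCR errors, but the basic idea is no more complicated than this.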

Other advantages/disadvantages?

