Digitize those Sources!

I was just scanning in some old photocopies of journal articles and book chapters (bowl season is just around the corner) and converting them into OCRed-text PDFs. This is useful not just for back-up purposes, but for searchability as well. You can get relatively cheap office printer/copier/fax combos that come with an Automatic Document Feeder (ADF) to feed in multiple pages automatically (like the name says). Assuming you have a recent version of Adobe Acrobat Pro (your university may have a cheap or even free license available), you can search across any number of PDF files for full words or ‘stems,’ and they say it’ll even do proximity searches. Best of all, it can save the search results to a single PDF file, showing the context of the term with links to the original!

I’ve been wanting to do this kind of thing for 15 years. I was actually OCRing 1,000s of pages of primary sources back then (I don’t want to think about how many days of work have been replaced by Google Books…), and I almost spent $300 on an ADF for my eBay-purchased scanner. Now, with the confluence of digital text and search, an easy way is finally here. Awesome. Real digital humanists use open-source software that usually requires some programming skills, but I’ll go with what I’ve got for now, in addition to the database that I’ve had for a dozen years or more.
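For anyone curious about the do-it-yourself route, here is a rough sketch in Python of what a stem-plus-proximity search with context snippets might look like. It is not how Acrobat does it, just an illustration. It assumes the OCRed text has already been extracted from the PDFs into plain strings, treats a "stem" as a simple word prefix (not a real linguistic stemmer), and all of the function names (`tokenize`, `stem_hits`, `proximity_search`, `search_documents`) are made up for this example.

```python
import re


def tokenize(text):
    """Return (lowercased word, start offset, end offset) for each word in text."""
    return [(m.group(0).lower(), m.start(), m.end())
            for m in re.finditer(r"[A-Za-z0-9']+", text)]


def stem_hits(tokens, stem):
    """Indices of tokens that begin with `stem` (a crude prefix match)."""
    stem = stem.lower()
    return [i for i, (word, _, _) in enumerate(tokens) if word.startswith(stem)]


def proximity_search(text, stem_a, stem_b, max_gap=5, context=30):
    """Find spots where a stem_a word lies within max_gap words of a stem_b word.

    Returns a list of snippets: the matched span plus `context` characters
    on either side, with line breaks flattened to spaces.
    """
    tokens = tokenize(text)
    snippets = []
    for i in stem_hits(tokens, stem_a):
        for j in stem_hits(tokens, stem_b):
            if i != j and abs(i - j) <= max_gap:
                first, last = min(i, j), max(i, j)
                start = max(0, tokens[first][1] - context)
                end = min(len(text), tokens[last][2] + context)
                snippets.append(text[start:end].replace("\n", " "))
    return snippets


def search_documents(docs, stem_a, stem_b, max_gap=5):
    """docs maps a document name to its text; returns (name, snippet) pairs."""
    results = []
    for name, text in docs.items():
        for snippet in proximity_search(text, stem_a, stem_b, max_gap):
            results.append((name, snippet))
    return results


if __name__ == "__main__":
    # Hypothetical sample texts standing in for OCRed sources.
    sample = {
        "chapter1.txt": "The besiegers opened trenches before the fortress in June.",
        "article2.txt": "Nothing about that topic appears in this text.",
    }
    for name, snippet in search_documents(sample, "besieg", "fortress"):
        print(f"{name}: ...{snippet}...")
```

The (name, snippet) pairs play roughly the same role as the saved search results: each hit shows its surrounding context and points back to the file it came from.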

[Image: Adobe Acrobat search results sample]

I was also downloading PDF articles from the library’s databases, and discovered that 1) War in History has had a good run of early modern articles over the past several years, and 2) Brill is the new publisher for the International Commission of Military History’s Bibliography, and currently has the 2011 online edition for free viewing/download. Check it out.



3 responses to “Digitize those Sources!”

  1. jostwald says :

    Another handy idea I just had is to add your notes, reviews, etc. as extra pages at the end of the PDF – recent versions of Word allow you to save the files as PDFs that can then be attached as pages. That way the source and its commentary are all together in one place, and searchable.

  2. Gene Hughson says :


    You might also want to check out Evernote (no connection, other than being a satisfied customer). I wrote up a mini-review on my blog. The same features that make it useful for me would likely help you as well. Depending on how much you’re cataloging, you might be able to get away with a free account. Even the paid one is extremely reasonable given the features.

    • jostwald says :

      Thanks for the tip. I actually have been using Evernote (also no connection!) for the year+ that I’ve had my iPad; love the syncing ability with my desktop. Sometime I should write up an overview of my note-taking process, so people can compare notes. The only thing I don’t like about Evernote (for the iPad) is that it doesn’t work offline, and I don’t always have the 3G subscription activated. Every so often I export all of my Evernote notes into a PDF format that can be searched by my iPad’s GoodReader app.
