Taking notes on secondary sources
I know I’m late to the party, but I’ve been playing around with taking notes on secondary sources. Not notes in the margins of the books/photocopies, not notes on notebook paper, not notebook notes that are then photographed and loaded into Evernote, not even notes on notecards, but real notes that are digitized (full-text) and fully searchable. In one respect secondary sources are a lot easier than primary sources because most of the secondary sources are either already in digital form, or are at least printed in easily-OCRed fonts (see my previous post if you’ve been living under an analog rock this past decade). So we need to take advantage of all those little zeroes and ones.
[I’ll leave aside the philosophical question of whether one remembers more from the physical practice of handwriting or typing. Handwriting probably does aid memory recall, yet it’s also moot when dealing with hundreds of secondary sources and tens of thousands of primary sources, particularly if your research project is spread out over years, and if you have significant teaching and service obligations to distract your mind as well. Even more, sometimes you’ll need to go back to the originals to look for terms and examples you hadn’t considered earlier (or exact phrasings that are only important later). These examples are, by definition, things you didn’t take notes on or even pay attention to at the time. Memory won’t help you there. Externalizing your knowledge is the best medicine.]
So the key, then, is to minimize the effort, while at the same time making the resulting notes easy to search. Ideally I want a combination of summary, paraphrase, transcription, keywords, and my commentary for each source. And I might as well put them (the notes at least) in the same place I already have all my primary source notes and transcripts, so I can do single searches with my familiar keywords and search terms, instead of multiple searches of varying granularity across various platforms.
Given how much I use my Access database (and its various keyword options), it would probably be best for me to keep the full-text of the secondary sources in the PDFs and only export the quotes and my notes into Access. The guiding principles for me are:
- To process and condense the information in note form rather than just rely on brute-force text string searches or simply paste an entire book’s text into my database. I can already do full-text search with the PDFs, and can open any PDF from within the database as needed (hyperlink field). Adding short keywords is the simplest way to categorize the information, but the real goal is to summarize the contents, so as to avoid rereading the same letter or passage each time I come across it.
- To get the processed textual info out of the 1,300 PDFs and into my Access database, where I can use the bazillion keyword fields that I already use for my primary source notes, and where I can combine all my various primary and secondary source notes into a single report if need be. After I’ve taken notes on/in them, the PDFs of the secondary sources should serve primarily as backups, or if I need to do a full-text search for unforeseen topics and terms. If need be, I can start with specific full-text searches of the PDFs and take notes on only those sections of the secondary sources, leaving a more systematic note-taking of each source for later on.
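The "single search across primary and secondary notes" idea can be sketched with any relational database. The example below uses Python's built-in sqlite3 module purely as a stand-in for Access. The `notes` table, its column names, and the sample rows are all invented for illustration and do not reflect my actual database design.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table holding both kinds of notes (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        "id INTEGER PRIMARY KEY, source_type TEXT, citation TEXT, "
        "keyword TEXT, summary TEXT)"
    )
    # Invented sample rows, only there to show the shape of the data.
    rows = [
        ("primary", "Sample letter", "logistics", "Complains about late convoys."),
        ("secondary", "Sample monograph", "logistics", "Argues supply set the pace."),
        ("secondary", "Sample article", "finance", "Surveys wartime borrowing."),
    ]
    conn.executemany(
        "INSERT INTO notes (source_type, citation, keyword, summary) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def search_keyword(conn, keyword):
    """Return notes of every source type that carry the given keyword."""
    cursor = conn.execute(
        "SELECT source_type, citation, summary FROM notes "
        "WHERE keyword = ? ORDER BY source_type, citation",
        (keyword,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # One query returns the primary and the secondary note together.
    for row in search_keyword(conn, "logistics"):
        print(row)
```

The point of the single table is that one keyword query pulls up the letter and the monograph side by side, instead of one search per platform.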
Too often online discussion of annotating PDFs assumes that you’ll keep going back to the PDF – I want to avoid that. And drawing squiggly lines and exclamation points in the margins of your text is great, unless you want to find that graffiti later on. You could of course abandon the PDF format altogether: save all PDFs as Word documents and then just copy and paste the various quotes and notes into your database. But since that would take a long time, and since you’ll continue to acquire new PDFs anyway, it seems we’re stuck with them.
So here are my thoughts on taking notes with PDFs. The following includes a lot of details for my own benefit, but they might be beneficial to you as well. In any case, you’ve been warned.
My tentative system to read and annotate PDFs on the computer in Adobe Acrobat 11 Pro (assuming the PDFs have a text layer and aren’t just images) is to simply highlight and annotate in the PDF and then export those annotations to my Access database. In more detail:
- Quotes from author: Underline and highlight sections of the text. Since this is a text layer, you can choose specific words.
- My comments/thoughts on what I’m reading: You can either Add note to text (Comment on Text) or add free-form notes (Sticky Notes in Acrobat parlance). These notes can be searched within the PDF, along with the normal text.
- The above notes and annotated text are automatically saved to comments, if you first make sure to set Preferences > Commenting > Making Comments > Copy selected text into Highlight comment pop-ups. (This isn’t retroactive, since the PDF text and comment data are apparently independent layers.)
- These comments can be viewed through the Comments List pane for any PDF (right-hand column in image above) – note how each comment has the text that was highlighted/underlined. You can search, sort and filter these comments as well.
- The highlighted and underlined passages, as well as the sticky notes, can also be exported to a separate PDF file, using Create Comments Summary. Choose to export Comments only – this will extract only the passages and comments you have made to the new PDF. Another option is to print the whole PDF with your comments included – but printing out the PDF again would be defeating the purpose.
- The text of this Comments Summary PDF can also be exported (via Save As… Word, or else by simply selecting the text, Copy-Paste) to other programs, such as my Access database. If you do that, make sure that you distinguish transcript from notes somehow, perhaps by formatting them differently or pasting them into separate fields (e.g. my Summary field vs. Comment field vs. Transcript field). I could also search the Comments Summary PDFs in addition to (or instead of) the original PDFs when searching for conceptual topics. Or maybe I’d split my comments from the quotes into separate PDF files. More to think about.
- If there is an Index in the PDF (and I hope there always will be), export that as text and put it in Access as well, probably even before taking notes on the book. I’ll need to think more about how this will work on the database side – should the Index info eventually be converted into keyword fields?
- If there is a Table of Contents in the PDF, I’ll add those chapters to the ToC field in the Secondary (i.e. bibliographic) section of my database (which is linked to the Notes section).
- Unless an abstract is available, I’ll have to do the high-level summary myself the old-fashioned way. Presumably I would want to simply type the summary into my Notes form in Access (Summary field), skimming through the Comments Summary as a guide. I’m thinking I should copy Google Books’ Commonly Used Terms word clouds into the database as well.
The above would provide an easy way to take notes, and especially to transcribe quotes by simply highlighting the text. I haven’t figured out yet the easiest way to export these annotations into my Notes database, beyond cutting and pasting – I’ll work on that.
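One possible route beyond cutting and pasting, sketched in Python under some loud assumptions: the annotations have already been reduced to simple records, and the goal is a CSV file with quotes and comments in separate columns, which Access can import as a table. The `Annotation` class, its field names, and the column headings are hypothetical. They are not Acrobat's export format.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Annotation:
    """One exported comment. Field names are hypothetical, not Acrobat's."""
    source: str    # short label for the secondary source
    pdf_page: int  # page position within the PDF file
    kind: str      # e.g. "Highlight", "Underline", "Sticky Note"
    text: str      # highlighted passage or typed note


# Assumption: highlights and underlines are quotes from the author;
# every other kind of annotation is my own comment.
QUOTE_KINDS = {"Highlight", "Underline"}


def annotations_to_csv(annotations):
    """Return CSV text with quotes and comments kept in separate columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "PdfPage", "Transcript", "Comment"])
    for a in annotations:
        if a.kind in QUOTE_KINDS:
            writer.writerow([a.source, a.pdf_page, a.text, ""])
        else:
            writer.writerow([a.source, a.pdf_page, "", a.text])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        Annotation("Sample source", 4, "Highlight", "A quoted sentence, with a comma."),
        Annotation("Sample source", 4, "Sticky Note", "Compare with the primary letters."),
    ]
    print(annotations_to_csv(sample))
```

Keeping Transcript and Comment as separate columns enforces the transcript-versus-notes distinction at import time rather than relying on formatting.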
Some other considerations:
- Acrobat offers plenty of drawing possibilities, but since the drawings aren’t text, they’re generally only useful if you’re consulting the original PDF. The point, again, is to make everything searchable without opening up each PDF. There is, however, a Preference setting to include the text inside a drawn box in the comments.
- The highlighting/underlining strategy above requires you to highlight enough text to be comprehensible on its own, since you will not be able to see the context in the separate Comments Summary PDF. So don’t highlight less than full sentences when you’re annotating quotes (if you prefer, you can simply summarize them instead with a Comment on Text annotation), nor should you merge together highlighted chunks from multiple sentences, like I sometimes do. Alternatively, you can double-click on the comment in the Comments List pane to edit the comment text. This would allow you to add ellipses (…), brackets, page numbers, etc.
- A Comments Summary PDF of your notes can include the page number, but it will number the pages according to their order in the PDF, which might be different from the actual page numbers of the source. For example, if you have a PDF of a 20-page article and the article originally starts on p153 in the journal, the first page (153) will be numbered page 1 by Acrobat. If you need page numbers for quotes, you might need to add that info by double-clicking on the comment in the Comments List pane and adding it manually.
- You could develop a system where the type of comment (highlight, underline, maybe even color…) would indicate the type of text, although this would primarily be useful if you use the notes in the PDF, or you could sort your Comments Summary to ease transferring the annotations into the database. One possible system: use highlight for author’s thesis, underline for reasons under a specific claim, etc. If you want more categories than three (Author, Subject, Type), you could also use different types of annotations to organize your summary more generally. You can also sort your Comments Summary report by the Author, Page, Date and Type fields, which means, for example, if you assigned each country or person mentioned in a source to a unique type of comment, you could make all of the source’s discussion of country X appear together, then country Y, regardless of which pages they appeared on. Or you could select individual comments (in the Comments List pane) that you want to categorize and change their Author and Subject fields to whatever tag you want. Note that the Subject field apparently isn’t a sortable field, although it does display for each comment in the Comments Summary PDF. So maybe you could use that to indicate where exactly you should paste it into your database.
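The page-number mismatch and the sort-by-category idea in the list above both come down to simple bookkeeping. Here is a minimal Python sketch, assuming each comment has been reduced to a (PDF page, category, text) tuple. The function names and the tuple layout are made up for illustration and are not part of Acrobat.

```python
from collections import defaultdict


def source_page(pdf_page, first_source_page):
    """Convert Acrobat's 1-based page position to the printed page number.

    Assumes the PDF pages run consecutively, with no inserted cover sheet.
    """
    return first_source_page + pdf_page - 1


def group_by_category(comments):
    """Group (pdf_page, category, text) tuples by category, in page order."""
    grouped = defaultdict(list)
    for pdf_page, category, text in sorted(comments, key=lambda c: c[0]):
        grouped[category].append((pdf_page, text))
    return dict(grouped)


if __name__ == "__main__":
    # A 20-page article that starts on p. 153 of the journal.
    print(source_page(1, 153))   # 153
    print(source_page(20, 153))  # 172

    # Hypothetical comments tagged by country.
    comments = [
        (7, "Country Y", "passage on Y"),
        (2, "Country X", "passage on X"),
        (9, "Country X", "a later passage on X"),
    ]
    for category, items in group_by_category(comments).items():
        print(category, items)
```

With the grouping applied, everything said about country X appears together regardless of which pages it came from, and the printed page number can be attached to each quote without editing comments by hand.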
In short, you can use the following ‘attributes’ in Adobe Acrobat to take notes:
- Four types of annotations: Comment on Text, Sticky Note, Highlight, Underline.
- Less powerfully, you can also use different colors for highlighting and underlining. I think this would only be of use within the PDF.
- Two fields within the comments (via the comment Properties): Author, Subject.
- You can also assign different icons to the comment display icon if you wish. Again, only of use within the PDF.
- If you’re really cutesy, you can include other icons (‘stamps’). Not sure about the point of this.
- You can always draw freehand on the PDF, but that won’t necessarily be searchable.
Whether you want to use all these features will depend on how much you want to consult the document in PDF form, vs. in a database.
So my overall note-taking system might look something like this combination of PDFs and Access:
| |Primary Source|Secondary Source|
|---|---|---|
|Keywords|Multiple keyword fields in Notes database (Access).|Multiple keyword fields in Notes database (Access).|
|Summary|Summary field in Notes database (Access).|Type into Summary field in Notes database (Access).|
|Paraphrase|Paraphrase field in Notes database (Access).|Copy all of Comments Summary text (Acrobat) into Paraphrase field of Notes database (Access). Sort Comments Summary as needed if using different categories.|
|Quote|Quote checkbox in Notes database (Access) and bold part of letter to be quoted in Notes field. Eventually parse quote into separate Notes_sub record.|Copy quotes (highlighted text) from Comments Summary (Acrobat) into Notes field in Notes database (Access).|
|Transcription|Notes field in Notes database (Access).|PDF (full-text). Read and annotate in Adobe Acrobat.|
Making the system portable means, for me, reading and annotating PDFs on an iPad. I’m a mixed-breed (iPad + Windows desktop), so if you are a Mac cultist, there are plenty of blogs out there with details on Mac-iPad synergy. Generally, I do my serious work on my desktop machine, so for me the iPad primarily serves as a more comfortable and portable PDF reading device (among a billion other things).
There are several options (GoodReader, PDF Export…), but I think I’ll probably use iAnnotate PDF. Its main advantage, as far as I can see, is that it only requires 3 steps to highlight text (a hold-tap, tap Highlight, then drag to select text), and then you can email/export the annotated text and notes – works even better with a stylus. GoodReader, which I use a lot, requires 6 steps to do the same highlight and copy-highlighted-text maneuver, and you have to drag those God-awful blue bars to select more than one word. I hate those. Speed of annotation is key for me here, otherwise I just won’t use it. So here’s what it looks like on the iPad:
You can view the highlighted text within iAnnotate PDF (like Acrobat’s Comments List pane, but on the left side), export those annotations via email, and save the PDF and transfer via Dropbox. Oddly, opening a PDF annotated by iAnnotate PDF in GoodReader (on the iPad) allows you to see the highlighted text in the comments, but if you open it in Acrobat on the desktop, the highlighting is there, but the annotations (i.e. Adobe’s Comments in the Comments List pane) don’t include the text highlighted. They do, however, include any separate notes (think sticky note) you’ve created in iAnnotate. This isn’t a major concern as long as you email the annotations out of iAnnotate, but if you need all of your PDFs to have the highlighted text in the comments and want to switch back and forth between the iPad and your computer, you’ll probably need to figure something else out. That’s one reason why I’d rather not rely on the PDFs once I’ve extracted the annotations.
It will obviously take forever to implement this system across all of my secondary sources, so I’ll implement it piecemeal as I go along. But with classes starting up next week, I’m going to first try this PDF annotation thing with my course readings, hopefully to work out the bugs and reinforce the habit.
A higher-level approach would be to auto-extract the various ‘key terms’ from the full-text of the documents. A few applications do this (e.g. Qiqqa), but I’m not yet sure what that would require in my setup. More fundamentally, I’m not sure how useful it would be. As with any full-text processing, it would necessarily be limited by the vocabulary used by the authors. Such automated procedures are perfectly tailored for scientists, who appear to be really good at using technical jargon precisely and consistently, who rely heavily on born-digital articles that come pre-tagged with metadata (unlike the self-scanned book chapters common in EMEMH), and whose article titles are always quite specific as to their content. History, not so much.
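To make the vocabulary limitation concrete, here is a crude Python sketch of what automated key-term extraction amounts to at its very simplest: counting the most frequent non-trivial words. This is plain word counting, not how Qiqqa or any other application actually works, and the stop-word list and sample sentence are invented for the example.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "was", "by", "on"}


def key_terms(text, top_n=3):
    """Return the top_n most frequent words that are not stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = (
        "The siege of the town dragged on. The investment of the town "
        "was slowed by rain, and the town held out."
    )
    # 'siege' and 'investment' describe the same operation here, but a
    # word counter treats them as unrelated terms.
    print(key_terms(sample))
```

A counter like this only ever sees the author's own word choices, so two historians describing the same thing in different terms would produce different "key terms".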
To boil it all down to a nice summary note: condense the author’s ideas into your own summaries (using PDFs as the starting point), export those comments as text, and consolidate them into a single repository. Summarize and keyword to taste.
How do you do it?