Devonthink revisited

A reader asked for a bit more detail about how I use Devonthink, and after several months of use, I have a more straightforward description. Straightforward, however, does not mean short. I don’t know the meaning of the word.

The way I use DTPO (Devonthink Pro Office), there is no significant difference between assigning tags and assigning groups. Groups and tags are two separate hierarchical lists in the left-hand pane, but other than that they are almost identical. You can assign documents to either a tag or a group by simply dragging them onto the appropriate icon. If the target is a group, dragging is essentially the Move To command, which removes the document from its current group – that may be the problem some people have in assigning a single document to multiple groups. Of course, if you’re moving it from the Inbox that’s not really a problem, since you probably want the Inbox empty anyway.

In any case, there are easy ways to create duplicates/replicates, which is what you need in order to put a document in multiple groups.

1. For those mouse-draggers, you can Opt-Cmd-Drag a document to another group and it will create a replicate there. Option-drag creates a copy (duplicate) in a new group. Oddly, however, I’ve noticed that on the Magic Trackpad (my pointer of choice) the Opt-Cmd three-fingered drag motion MOVES the document rather than replicating it. If you use the more traditional click-and-drag with a single finger on the trackpad, however, it works. Whatever.

2. Using the Magic Hat drawer is a bit trickier, since Move To does exactly what it says. So you can Move To a specific group (or shift-select multiple groups) and then use method 1, 3 or 4 to send to groups that aren’t listed in the Magic Hat list. (I think there may be an AppleScript somewhere to automate this more by creating new groups as well.)

3. But I try to be keyboard-focused, so when working with a specific document, I assign the groups by jumping to the Tag bar of the document (there’s a key shortcut) and just typing in the tag or group, as many of each as I want. This is particularly helpful if you have nested hierarchies and don’t remember the parent group name (to navigate to in the list) but do remember the specific subgroup, e.g. I remember the subgroup ‘England-Netherlands’ even if I don’t remember that it’s in the Diplomacy-Alliances parent group. The popup in the Tag bar will offer to autocomplete any tag/group I’ve already created. This way, putting the same document in multiple groups is automatic: type multiple groups into the Tag bar of an individual document and you don’t have to do anything else. It will create replicates as needed. All this assumes you’ve unchecked the Exclude Group from Tagging option (right-click the database name in the left-most pane, choose Database-Properties and uncheck).

4. Or you can manually replicate/duplicate a specific document (key commands) and then drag the copy to another group (or right-click and do the same with the pop-up menu). If you are concerned about database size, use a replicate, which will create multiple instances of a single copy (red title if you have that setting checked). If you will be further modifying the document depending on which group it belongs to, choose duplicate (title in blue), which will keep the original separate from any copies.
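If the move/replicate/duplicate distinction is still fuzzy, here is a minimal Python sketch of how I picture it. To be clear, this is a toy model of my own and not how DT actually stores anything: the `Document` and `Group` classes and the three functions are hypothetical names invented for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Document:
    """Toy stand-in for a database item (hypothetical, not DT's data model)."""
    title: str
    text: str


@dataclass
class Group:
    name: str
    items: list = field(default_factory=list)


def move(doc, source, target):
    """Plain drag / Move To: the document leaves its old group."""
    source.items.remove(doc)
    target.items.append(doc)


def replicate(doc, target):
    """Replicate: the very same object is listed in one more group."""
    target.items.append(doc)
    return doc


def duplicate(doc, target):
    """Duplicate: an independent copy that is free to diverge."""
    clone = copy.deepcopy(doc)
    target.items.append(clone)
    return clone


inbox = Group("Inbox")
provenance = Group("Dryden")
logistics = Group("Logistics")
drafts = Group("Drafts")

note = Document("Fodder shortage", "Postscript on the shortage of fodder.")
inbox.items.append(note)

move(note, inbox, provenance)     # Inbox is now empty
rep = replicate(note, logistics)  # same note, second group
dup = duplicate(note, drafts)     # separate copy

# Editing the original changes every replicate, but not the duplicate.
note.text += " Checked against the original."
print(logistics.items[0].text)
print(drafts.items[0].text)
```

The replicate costs nothing extra because it is the same note seen from a second group, while the duplicate is a second note that takes up its own space and has to be edited separately.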

So my basic workflow consists of creating provenance tags and then moving the original PDF/rtf documents directly to the provenance tag, then taking notes (‘one thought, one note’) in separate rtf documents (within that same provenance tag) with a wikilink back to those originals. Then I assign topical groups to those note rtf’s in the note document Tag bar, which keeps a copy with the original in the provenance tag and creates a replicate of the note document in all the topic groups I’ve typed. If you need more detail:

1. Drag a copy of an original source PDF/rtf into DTPO – to the Inbox if I don’t have a provenance tag already created, or directly into the provenance tag if it already exists. Of course I could just create a blank provenance tag and then drag the PDF there if I have the time. Generally I try to do so, so I avoid step 2.
2. If the PDF went to the Inbox, I then create a blank provenance tag and drag the PDF there. But ideally you skip the Inbox, because the Inbox is actually its own group, i.e. you need to assign a document to another group to get it out of the Inbox. AFTER sending the document to a tag, it still has the ‘Inbox’ group assigned (if you had moved it to a group instead, the new group would have replaced ‘Inbox’). For those instances, I created a sub-group under Inbox called “Untagged”: I drag the PDF from the Inbox into it, then delete the gray ‘Untagged’ tag in the document’s Tag bar. (You can’t see a gray Inbox tag in the Tag bar of documents in the Inbox, but moving a document to Untagged essentially replaces the invisible Inbox tag with a visible Untagged tag.) This removes the document from the Inbox, so it now lives only in a provenance tag. (Now you see why I prefer to create the provenance tags first and skip the Inbox.)
3. Once the original PDF is in the provenance tag, I can then assign the entire document to any number of groups via the Tag bar. But since I have lots of image PDFs and long treatises in full-text that cover a variety of topics, I rarely assign an entire 17C manual in PDF to a specific topical group. Instead I usually excerpt parts (or type up notes/thoughts) in a separate rtf document within the provenance tag, and link (Copy Page Link) to the original PDF, to the specific PDF page even. Then I can assign one or more groups to that note document, that may be different from other notes on the same source.

Using groups for provenance will work, but you lose DT’s ability to auto-suggest based off of the groupings – unless of course all of your groups (400 folios of archive volume 43254 for example) relate to one specific topic.

That being said, I do use Tags for other purposes – in part because DT isn’t a relational database that would allow me, for example, to create a lookup table so I could identify person Nottingham as a Tory… But that’s a function of the limited metadata available in DT. Given the granularity at which I want to tag/group my data, I have to use groups and tags in several different ways.

BROADER DISAGREEMENT?

But there might be broader issues of divergence here. I’m not sure if the group/tag difference comes from people wanting to put an entire original document into a single topic (whether group or tag). In general I don’t think that’s a very good idea because you are most likely analyzing subsets of that document. Perhaps I’m unique, but I work with a level of granularity that divides a single one-page letter up into several parts – one note for the first sentence talking about receiving news about the siege of Turin, then the next paragraph talking about politics back in England, then a final postscript about the shortage of fodder. Or a treatise on the art of war that spends a chapter on the infantry, then one on the horse, then one on siegecraft (with subtopics within each chapter). If the goal is ‘one thought, one note’, which it should be, ideally you shouldn’t have very many note documents with multiple topics – if you do, your search results won’t be very granular. Although that depends on your vocabulary: I distinguish a topic from an author from a place from a level of war from a branch of service from a date… Unfortunately DT’s limited metadata requires you to condense these thirty different pieces of metadata into just a handful of dimensions (groups, tags, names, limited metadata). Say I have a single sentence on a siege (EventID=Douai1710), that talks about the food supply (Level=Logistics) of the Allied besiegers (Army=Besiegers): it’s hard for DT to make room for all of those (in addition to all the provenance info) without having either the group or tags do double-duty. So I have a group for the siege of Douai 1710, another for Logistics, another for… Presumably just what people do with tags, but with groups. I’d still really like some mass-editable custom metadata fields in DT – that would truly make it a killer app for historical research.
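To make the double-duty problem concrete, here is a rough Python sketch of what I am effectively doing by hand: taking the fields I would like to have and flattening each one into a group. Everything here is an assumption for illustration. The `Note` class, the `Field/Value` naming scheme and the sample note are mine, and DT has no such structure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """Hypothetical 'one thought, one note' record."""
    title: str
    provenance: str
    facets: tuple  # (field name, value) pairs I wish were real metadata


def group_names(note):
    """Flatten each wished-for metadata field into one group name."""
    return [f"{name}/{value}" for name, value in note.facets]


def build_group_index(notes):
    """Map every group name to the titles of the notes replicated into it."""
    index = defaultdict(list)
    for note in notes:
        for group in group_names(note):
            index[group].append(note.title)
    return dict(index)


douai = Note(
    title="Food supply at Douai",
    provenance="archive volume 43254",
    facets=(
        ("EventID", "Douai1710"),
        ("Level", "Logistics"),
        ("Army", "Besiegers"),
    ),
)

index = build_group_index([douai])
print(group_names(douai))
print(index["Level/Logistics"])
```

One note ends up in three groups, one per field, which is exactly the work that custom metadata fields would do for me if DT had them.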

Another possible divergence I’ve noticed comes from how much you want to use the AI. Yes, you can assign multiple tags to a document, but you can’t use the AI on those tags. If you have a limited number of documents so that you can fully process all of them, and your analysis will be relatively focused in time and scope, this may be fine. But that’s not my situation. You can use the AI’s See Also on any specific document without grouping by topic, which will definitely help you find other documents with similar content.

But I’m not sure how robust these results will be. For example, will See Also connect my single document consisting of a paragraph on English politics to other discussions of politics outside of England? I don’t know – it would be great if someone tested it empirically. But given that it looks for unique-ish words, I’d think proper nouns tied to England in that paragraph (London, Parliament, Godolphin…) would outweigh other political-y words that might be shared by more general discussions of politics (elect, secretary, vote, debate). If you create separate groups based on politics regardless of country, in theory the AI should more easily see the patterns among these shared political-y words. And what if you want to look for other documents that talk about Parliament (without mentioning its name), or want to do a separate search on documents that talk about Godolphin? I’m not sure that you can force a simple See Also search to ignore some of the content of the document, whereas I think you can at least demote its importance by grouping it with other documents that talk about the topic you want to focus on, e.g. documents that use phrases like Lord Treasurer, the Queen’s servant and Whitehall when you’re looking for Godolphin documents. The Devonthinkers don’t want to divulge how exactly the AI works, so we users are in the dark until somebody does some real tests. My assumption has been that Classify is a more powerful and more flexible use of the AI since you can create an infinite number of forced semantic groupings of your own choice.
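Since nobody outside Devon knows the real algorithm, the best I can offer is a toy illustration of the "unique-ish words" worry using plain TF-IDF weighting and cosine similarity. This is a stand-in technique that I am assuming behaves roughly like that description, not DT's actual method. The six keyword-only notes are invented, and the smoothed IDF formula is just one common variant.

```python
import math
from collections import Counter

# Invented notes, already reduced to keywords for the sake of the example.
notes = {
    "english_politics": "godolphin london vote debate secretary",
    "english_finance": "godolphin london fleet money harvest",
    "dutch_politics": "hague states vote debate secretary",
    "french_politics": "versailles council vote debate secretary",
    "imperial_politics": "vienna court vote debate secretary",
    "savoy_politics": "turin duke vote debate secretary",
}


def raw_vectors(docs):
    """Plain word counts: every word matters equally."""
    return {name: Counter(text.lower().split()) for name, text in docs.items()}


def tfidf_vectors(docs):
    """Weight each count by a smoothed inverse document frequency."""
    counts = raw_vectors(docs)
    n_docs = len(counts)
    doc_freq = Counter()
    for words in counts.values():
        doc_freq.update(words.keys())
    return {
        name: {
            word: tf * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, tf in words.items()
        }
        for name, words in counts.items()
    }


def cosine(a, b):
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def rank(vectors, query):
    """Other notes, most similar to the query note first."""
    scores = {
        name: cosine(vectors[query], vec)
        for name, vec in vectors.items()
        if name != query
    }
    return sorted(scores, key=scores.get, reverse=True)


raw_order = rank(raw_vectors(notes), "english_politics")
tfidf_order = rank(tfidf_vectors(notes), "english_politics")
print("raw counts:", raw_order)
print("tf-idf:    ", tfidf_order)
```

With raw counts the English politics note sits closest to the other politics notes, because it shares three words with each of them. Once rare words are weighted up, the two shared proper nouns win and the English finance note jumps to the top. That is the behavior I am afraid of with See Also, and the reason I suspect topical groups help, although only a real test inside DT would settle it.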

I don’t have all of my documents (24,000 after all) fully grouped and parsed, and only 8,000 of them are text, but the results from the screenshot example below give me a bit more faith in Classify by groups than in See Also by individual documents.

[Screenshot: Classify vs See Also]

The selected document’s content is in the second-from-right pane. The Tag bar at the bottom shows the specific group I had assigned it to on my own – Fear of Casualties, a subset of the Laws groups. The top of the right-most pane shows the AI’s Classify recommendations. Ideally you would explore each of these groups and their documents in turn, but my initial reaction is that the AvoidCasualties and Casualties groups are spot on – maybe I even need to consider combining the Fear of Casualties and AvoidCasualties groups for even more robust results.

More interesting are some of the other suggestions. I’m not really sure about French criminal – maybe ‘bloody’ and ‘killing’ prompted the connection? The Defense of Great Captains is interesting – possibly the prescriptive tone is being noticed by the AI? The Deception group could be focusing on the ‘least bloody’ language. Again, speculation, but the AI has offered some interesting thematic connections that I hadn’t initially thought of. Even if the AI is wrong (i.e. its similarities aren’t based on what I thought), my speculation might itself merit further exploration, or at least serve as brainstorming. So if I create a group with all the prescriptive literature combined together, what other connections will I find? And so on.

On the other hand, I’m not as impressed with the See Also (bottom of right-most pane). A few individual documents seem quite appropriate (and probably help suggest why French criminality and Deception were suggested groups). There are some definite suggestions to explore, but they seem rather limited, or at least they don’t immediately suggest the conceptual connections automatically suggested by Classify – and any topical tags won’t appear in Classify. That alone makes Classify and topical groups worth it in my eyes. Further, there were a number of documents in the suggested groups that do not appear in the See Also list. Not sure what this means, but it might be worth exploring as well. The experiment continues.

[Bonus content! In the screenshot above, my wikilink back to the original is the blue hyperlink ### at the top. I use ### because I found that if you include citation info in the document text, e.g. Dryden Fables ancient and modern, that will interfere with the AI. It will link all the notes on Dryden together because they all share the text ‘Dryden Fables ancient and modern’ – thanks for telling me the obvious! But since the provenance tag (blue Dryden tag in the Tag bar, lower right) already indicates its provenance, that’s wasting the AI. You want the AI to base the connections off of the content of that document, not off of the citation text, and to have the AI suggest documents that aren’t obvious. Since DT doesn’t interpret non-alphanumeric characters, it doesn’t include ### in its analysis. It’s a bit gimmicky, but it seems to work.]
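Here is a small Python sketch of why the ### trick should work, assuming (as I do above) that the indexer only keeps runs of letters and digits. The `index_terms` function is my own guess at that behavior, not DT's actual tokenizer, and the sample sentences are invented.

```python
import re


def index_terms(text):
    """Keep only runs of letters and digits, lowercased (assumed behavior)."""
    return re.findall(r"[a-z0-9]+", text.lower())


with_citation = "Dryden Fables ancient and modern: the least bloody victory is best"
with_marker = "### the least bloody victory is best"

citation_terms = index_terms(with_citation)
marker_terms = index_terms(with_marker)

# Words that would tie every Dryden note together regardless of content.
citation_only = [word for word in citation_terms if word not in marker_terms]
print(citation_only)
print(marker_terms)
```

The citation adds five words that every note from that source would share, while the ### link text adds nothing at all, so only the content of the note is left for the AI to compare.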


One response to “Devonthink revisited”

  1. jostwald says :

    Minor edit: I just realized that one of the bundled Applescripts – Tags-Remove tags from Selection – will allow you to delete the same tag(s) from multiple documents in the Untagged group, even if the docs don’t have all the same tags. Select the docs, run the script and type in “Untagged”. So that saves a little bit of time at least.
