How I Capture Reading Notes in Obsidian

In addition to automating my daily notes with Obsidian, it quickly became clear to me that Obsidian’s note-linking capabilities would allow me to capture my reading notes in a really useful way. Moreover, because of that linking capability, it occurred to me that my Obsidian vault could serve as a database for my reading. Describing how I have managed to do this (so far) in a step-by-step manner will require a little history first.

A Brief History of My Reading List

I began keeping a list of every book I finished reading back on January 1, 1996. Although I am no longer certain of why I started keeping the list (was it part of a New Year’s resolution?), I am fairly certain that I was influenced by an early reading list I found on the Internet, Eric W. Leuliette’s “What I Have Read Since 1974”.

As a developer (even back then), I decided I would build an elaborate relational database to store my reading list. Over the years, it went through many iterations, and forms. When time became short, I moved the list out of the database and into Excel, or Google Sheets. Finally, several years ago, I settled on a plain text file using Markdown format, and that is how I’ve kept my list ever since.

But I’ve been bothered by shortcomings in this list. There are redundancies I don’t like about it. I have no easy way of referring to books or authors separately from the list. There are things I’d like to automate about it, but the format makes that tricky.

A Brief History of My Reading Notes

With all of the reading I do, I have trouble remembering important details of what I read about. So I started keeping notes on my reading. This evolved out of how I kept notes on my reading back in college, and has continued to evolve over the decades since. It was in college that I first decided it was okay for me to write in my books. After all, if I was spending so much money on them, I might as well make them my own, right?

These days, I highlight books and write in the margins, and with e-books, I highlight and make short notes on my Kindle devices and apps. But I still have no good way of aggregating these notes into useful groups or categories, and certainly no way of readily searching them.

As I started using Obsidian, and began to see how I could better organize my books and reading lists in its vault structure, I began to get a hint of ways that I might start to link my reading notes back to the books they are associated with, my reading, and other notes.

Enter Zettelkasten

I’d never heard of Zettelkasten before I started using Obsidian. Zettelkasten was originally invented as a way to link paper notes together so that it is easy to create connections (links) between them. While it was workable on paper, such a process could be greatly improved with hypertext tools, and it so happens that Obsidian’s note-linking capability is ideal for this.

One important idea from Zettelkasten is that a note should contain a single thought or piece of information (say, a passage highlighted in a book). That note is given a unique identifier. In addition to the passage, one would add their own thoughts to the note, and perhaps further link that note to other notes and ideas that are related to it. Zettelkasten has its own unique numbering system for “naming” the notes. Obsidian has a plug-in for creating a “Zettelkasten number” for this purpose that is based on the date/time the note is created. I wasn’t particularly fond of that identifier because it duplicates information already contained in the note itself. After all, the note is just a file in the file system, and has its own creation and modification date/times as part of the file. A good identifier doesn’t embed real data. It’s just an identifier.

I also struggled a bit to figure out how this would work for my reading notes. I originally imagined that if I had a note for each book I read, I could simply add my highlights and annotations to that note. Zettelkasten, however, suggested that rather than adding that highlight to the book note, I’d create a separate note for just the highlight or annotation, and then link it to the book note, as well as any other notes it might make sense to link it to. This took a while for me to process, and I thought about it a lot as I built out my reading library in Obsidian.

My Obsidian Library

So how did I decide to structure my reading notes in Obsidian? I’ll try to go through the step-by-step process I have for putting this all together, in case someone is interested in reproducing this.

Step 1: Establishing the structure

I decided that because of Obsidian’s great linking capability, I could use the file system itself as a relational database. In deciding this, I further decided that there were 3 main “objects” I wanted to be able to capture at a kind of atomic level. That is, three things that make up the structure of my reading library:

  1. Things I read, e.g., books, articles, stories, etc.
  2. Authors: the people who write the things in #1.
  3. My notes as they relate to #1 and #2.

From this, I established the following structure of folders within my Obsidian vault:

My folder structure for Reading notes in Obsidian.
  • Commonplace Book contains all of my reading notes.
  • Library contains all of the “atomic” notes that make up my reading library:
    • Authors: a single note for each unique author in my library
    • Articles: a single note for each unique article (often not tied to a book) in my library.
    • Book: a single note for each unique book in my library
    • Essays: a single note for each unique essay in my library; these are often related to books.
    • Stories: a single note for each unique story in my library.
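
For anyone reproducing this layout, the folders can be created by hand in a minute, but a small script makes the structure repeatable. The following is a minimal sketch, not something from my own setup: the function name create_reading_folders is hypothetical, and the folder names are simply copied from the list above.

```python
from pathlib import Path

# Folder names copied from the list above; the vault location is up to you.
LIBRARY_FOLDERS = ["Authors", "Articles", "Book", "Essays", "Stories"]


def create_reading_folders(vault: Path) -> list[Path]:
    """Create the Commonplace Book and Library folders if they are missing."""
    folders = [vault / "Commonplace Book"]
    folders += [vault / "Library" / name for name in LIBRARY_FOLDERS]
    for folder in folders:
        # exist_ok makes the script safe to run more than once
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling create_reading_folders(Path("path/to/your/vault")) returns the six folder paths, whether or not they already existed.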

Step 2: Deciding what goes into a note

Once I had my structure, I had to decide what goes into a note of each type. What is it I want to know about authors, books, stories, etc.? This was fairly easy for me as I’ve been thinking about it for a long time (years, actually). I had in mind an idea that I could write an API that uses these files as a database to query them and produce results. With that in mind, I decided to start by keeping things simple, knowing that I could add detail as needed going forward.

For authors, I wanted just some basic information. Here is a typical author note, in this case, for Alan Lightman, whose new book I read earlier this week:

A sample author note for Alan Lightman.

The backlinks section is generated automatically by a script that I have that runs nightly. I know that I could just click on the “Linked mentions” in Obsidian to see all of the backlinks, but I wanted the related books on the note as a reference in case I access the file outside of Obsidian.
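
To give a sense of what such a backlinks script involves, here is a minimal sketch rather than the actual nightly script. It assumes notes are .md files and that links use Obsidian’s double-bracket wiki-link syntax; the function name find_backlinks is hypothetical, and writing the result back into the author note is left out.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]], [[Target#heading]] and ![[Target]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def find_backlinks(vault: Path, target: str) -> list[str]:
    """Return the names of notes that contain a wiki link to `target`."""
    sources = []
    for note in sorted(vault.rglob("*.md")):
        if note.stem == target:
            continue  # a note does not count as a backlink to itself
        text = note.read_text(encoding="utf-8")
        linked = {name.strip() for name in WIKILINK.findall(text)}
        if target in linked:
            sources.append(note.stem)
    return sources
```

A scheduled job could call find_backlinks(vault, "Lightman, Alan") for each author note and rewrite that note’s backlinks section from the result.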

For books (or essays, stories, articles), I also kept things simple. A typical book (or essay, or article, or story) looks like this:

A sample title note for In Praise of Wasting Time

Note that in both authors and books there are links back and forth between the files. The book file refers to the author. The author file has link references back to the books. Moreover, you’ll note that in the book, there is an “Annotations” section with a list of links. These are auto-generated links to my notes and highlights for the book. I’ll have more to say on these shortly, but the important thing is that each note and highlight is a separate file (in the Zettelkasten vein) and is included with the book as a “transclusion” link, meaning that when I view the note in preview mode, it “includes” the linked files as part of the note, like this:

Title note in preview mode with transcluded annotations visible.

Step 3: Populating the database

Once I had the structure I wanted, I needed to populate my database. I was fortunate in this regard on 2 counts: (1) I happened to have recently created a SQLite database of my books, and (2) I can write code relatively easily. I wrote a script that crawled my book database and, from it, created the notes for books and authors in Obsidian. This turned out to be a surprisingly simple exercise. (The Python script was 130 lines.)
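
The general shape of such a script is short enough to sketch. What follows is an illustration only, not the 130-line script: the books(id, title, author) table is a hypothetical schema, the note layout is a guess at a minimal book note, and the “Title (id)” file name merely imitates the link names that appear in the API output later in this post.

```python
import sqlite3
from pathlib import Path


def export_books(conn: sqlite3.Connection, library: Path) -> int:
    """Write one note per book and one per author; return the book count."""
    books_dir = library / "Book"
    authors_dir = library / "Authors"
    books_dir.mkdir(parents=True, exist_ok=True)
    authors_dir.mkdir(parents=True, exist_ok=True)

    count = 0
    # Hypothetical schema: books(id, title, author)
    rows = conn.execute("SELECT id, title, author FROM books ORDER BY id")
    for book_id, title, author in rows:
        note_name = f"{title} ({book_id})"
        (books_dir / f"{note_name}.md").write_text(
            f"# {title}\n\nAuthor: [[{author}]]\n\n## Annotations\n",
            encoding="utf-8",
        )
        # Create the author note only once, however many books they wrote.
        author_note = authors_dir / f"{author}.md"
        if not author_note.exists():
            author_note.write_text(f"# {author}\n", encoding="utf-8")
        count += 1
    return count
```

A real version would also need to strip characters that are not allowed in file names (a colon in a subtitle, for instance) before building note_name.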

My digital commonplace book

I first learned of commonplace books while reading a biography of Thomas Jefferson (in this case, Willard Sterne Randall’s Thomas Jefferson: A Life). Jefferson (and others in his time) would copy passages from their reading into a book. This helped with memorization, but it also provided a resource where they could add notes and observations. I’ve always liked this concept, and I decided that Obsidian would finally allow me to put it into action in the way I’d envisioned.

It is trivial to create a note and add it to the note containing the book to which it is related. But what if the note ultimately relates to more than one thing? Reading about Zettelkasten provided me with insights into how I might handle this. The naming convention in Zettelkasten (and the way it is implemented in Obsidian) bothered me. Neither made much sense. How do you search for things with essentially coded filenames?

I was in the shower when I finally had a breakthrough insight on this. I’m not searching for a filename, I’m searching for file content. If each annotation and highlight is its own file, I can link it to as many notes as makes sense. Furthermore, I can add tags to each note. The name of the file doesn’t matter. What matters is how it links to other notes, and that all files are searchable.
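
Because the notes are plain text, searching by content needs nothing more than a loop over the files. Here is a minimal sketch, assuming the notes are UTF-8 .md files; the function name search_notes is hypothetical, and Obsidian’s built-in search does the same job interactively.

```python
from pathlib import Path


def search_notes(vault: Path, phrase: str) -> list[str]:
    """Return names of notes whose text contains `phrase`, ignoring case."""
    needle = phrase.lower()
    hits = []
    for note in sorted(vault.rglob("*.md")):
        if needle in note.read_text(encoding="utf-8").lower():
            hits.append(note.stem)
    return hits
```

The same function finds tags, since a tag is just text in the file: search_notes(vault, "#writing") lists every note carrying that tag, whatever the files happen to be called.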

I still didn’t like the file-naming scheme for Zettelkasten in Obsidian, which essentially uses a datetime stamp down to the current second. So a file might be named: 20210215084456. Given that one is not likely to create two of these notes within the same second, it guarantees uniqueness. But from a database perspective, identifiers like these are not supposed to embed any information. They should be strictly identifiers. Moreover, with the date embedded in the note title, I would be duplicating information that already exists in the file properties.

I decided instead to use a GUID, or what is sometimes called a UUID. This is another form of unique identifier, one that doesn’t embed information and just produces a unique code. (For those tech-savvy folks reading this, I used Python’s uuid4, which doesn’t use the MAC address as part of the identifier.)
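
Generating such an identifier takes one call to Python’s standard uuid module. The wrapper below is only a sketch: the name new_note_name is hypothetical, and the upper-casing is a cosmetic choice rather than a requirement.

```python
import uuid


def new_note_name() -> str:
    """Return a random identifier to use as an annotation file name."""
    # uuid4 is built from random bits, so it carries no date or host data
    # (uuid1, by contrast, mixes in a timestamp and the MAC address).
    return str(uuid.uuid4()).upper()
```

Each call returns a fresh 36-character string of hexadecimal digits and hyphens, which can be used directly as the file name of a new annotation note.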

When I have a new note or highlight for a book, it goes into my Commonplace Book folder in Obsidian. These notes also have a specific structure. A typical one looks like this:

A typical note, Zettelkasten style.

Each annotation begins with a Source that links back to the source for that annotation. It may or may not have tags associated with it. That is followed by the body of the annotation, which may be a highlighted passage. Finally, there are my own notes related to the specific passage. In the above example, my notes also link to another book, making this particular annotation related to more than one note. That is, a link has been created between Creativity, Inc by Ed Catmull and Amy Wallace, and On Writing by Stephen King.
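
That structure is regular enough to generate from code. The sketch below builds the text of one annotation note from its parts; the exact labels (“Source:”, “Tags:”) and the use of a blockquote for the passage are assumptions made for illustration, and render_annotation is a hypothetical name.

```python
def render_annotation(source: str, passage: str, notes: str = "",
                      tags: tuple[str, ...] = ()) -> str:
    """Build the text of one annotation note."""
    lines = [f"Source: [[{source}]]"]  # link back to the book or article
    if tags:
        lines.append("Tags: " + " ".join(f"#{tag}" for tag in tags))
    lines += ["", f"> {passage}"]      # the highlighted passage
    if notes:
        lines += ["", notes]           # my own thoughts, which may hold more links
    return "\n".join(lines) + "\n"
```

Any additional wiki links placed in the notes argument are what tie the annotation to a second book or to other notes.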

Automating my annotations

Over the weekend, I got a start on automating these annotations. I wrote a Python script that reads CSV files exported from Kindle and creates a unique note for each annotation in the file, relating it back to the source book in my Library. My process is roughly this (I say roughly because this is still new):

  1. When I finish reading a book, I export the annotations from my Kindle, which sends me an email. That email has a CSV attachment which I save in a folder.
  2. A script runs and processes any CSV files I have in the folder, creating the notes and links.
  3. The script outputs a list of the resulting annotations for each file. I copy this and paste it into the “Annotations” section of the source book or article. That makes it easy to view the annotations inline when previewing the note. An example of the output from the script looks like this:
Output from my annotation import script.
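
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the import script itself: it assumes the CSV starts with a header row containing columns named “Annotation” and “Location” (a real Kindle export may use different names or carry extra lines before the header), and import_annotations is a hypothetical name.

```python
import csv
import uuid
from pathlib import Path


def import_annotations(csv_path: Path, source: str, commonplace: Path) -> list[str]:
    """Create one note per CSV row; return transclusion links for the book note."""
    commonplace.mkdir(parents=True, exist_ok=True)
    links = []
    with csv_path.open(newline="", encoding="utf-8") as handle:
        # Assumed columns: "Annotation" (the text) and "Location".
        for row in csv.DictReader(handle):
            text = (row.get("Annotation") or "").strip()
            if not text:
                continue  # skip rows with no highlight or note text
            name = str(uuid.uuid4()).upper()
            location = row.get("Location") or ""
            body = f"Source: [[{source}]]\nLocation: {location}\n\n> {text}\n"
            (commonplace / f"{name}.md").write_text(body, encoding="utf-8")
            links.append(f"![[{name}]]")  # "!" makes the link a transclusion
    return links
```

Printing the returned list, one link per line, gives something that can be pasted straight into the “Annotations” section of the source note.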

Toward an API for my books and annotations

I am able to do the above automation because I have a standardized structure to my books and author notes. That standardization allowed me to write an API for my book library. From this API I can, for instance, check to see if a title exists in my library already. I can grab information about a book or author and then use it in some way. The API typically returns data in JSON format. For instance, if I call the function biblio.search_by_title("Beyond"), I get a JSON formatted return containing the following:

[
   {
      "title":"_Beyond Band of Brothers: The War Memoirs of Major Dick Winters_",
      "link":"[[Beyond Band of Brothers (334)]]",
      "type":"book",
      "authors":[
         {
            "author":"Winters, Richard",
            "authorFirstLast":"Richard Winters",
            "authorLink":"[[Winters, Richard]]",
            "gender":"None"
         }
      ],
      "source":"",
      "date":""
   },
   {
      "title":"_Beyond Apollo_",
      "link":"[[Beyond Apollo (58)]]",
      "type":"book",
      "authors":[
         {
            "author":"Malzberg, Barry N",
            "authorFirstLast":"Barry N Malzberg",
            "authorLink":"[[Malzberg, Barry N]]",
            "gender":"m",
            "alternateNames":[
               {
                  "name":"Barry, Mike",
                  "nameLink":"[[Barry, Mike]]"
               }
            ]
         }
      ],
      "source":"",
      "date":""
   },
   {
      "title":"_Beyond the Blue Event Horizon_",
      "link":"[[Beyond the Blue Event Horizon (259)]]",
      "type":"book",
      "authors":[
         {
            "author":"Pohl, Frederik",
            "authorFirstLast":"Frederik Pohl",
            "authorLink":"[[Pohl, Frederik]]",
            "gender":"None"
         }
      ],
      "source":"",
      "date":""
   }
]
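
To show the idea behind a call like this, here is a minimal sketch of a title search. It is not the biblio module: it works on a list of already-loaded records instead of reading note files, it assumes a case-insensitive substring match (the real matching rule is not described here), and the sample records are trimmed copies of two entries from the output above.

```python
import json


def search_by_title(books: list[dict], fragment: str) -> str:
    """Return a JSON array of the records whose title contains `fragment`."""
    needle = fragment.lower()
    matches = [book for book in books if needle in book["title"].lower()]
    return json.dumps(matches, indent=3)


# Records shaped like the output above, trimmed to a few fields.
sample = [
    {"title": "_Beyond Apollo_", "link": "[[Beyond Apollo (58)]]", "type": "book"},
    {"title": "_Beyond the Blue Event Horizon_",
     "link": "[[Beyond the Blue Event Horizon (259)]]", "type": "book"},
]
```

With these two records, search_by_title(sample, "Beyond") returns both of them and search_by_title(sample, "apollo") returns only the first. Returning JSON text rather than Python objects keeps the result usable from other tools and scripts.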

The results so far

I’ve linked all of this together using my master reading list note. This note contains a list of everything I have read since 1996 and serves as a kind of index to my reading:

A sample from my reading list index note.

A big part of the way Obsidian works is that it can show you the relationships between your notes. While I am still working on importing all of the notes I have in my Kindle, I can already see a network of relationships when I view the graph of my Obsidian vault:

A graph of the relationships between all of my notes.

Most of my notes are book and reading-related at this point. That big dot in the center is the master reading list illustrated above. If I highlight it, this is what I see:

Sample of a highlighted node on the graph.

From there, you can see other nodes and relationships that have started to form. For instance, if I hover over one of the Alan Lightman books I finished yesterday, In Praise of Wasting Time, you can see a little network of links coming off that book:

Some of those links point to annotation files. Another points back to the note for Alan Lightman. And a few of the annotation links point to seemingly unrelated notes.

Here is another example. One of the big nodes is for John W. Campbell, editor of Astounding Science Fiction from the late 1930s through his death in the early 1970s. I read many of those old issues when I was taking my Vacation in the Golden Age of Science Fiction. So Campbell shows up a lot on my master reading list:

Highlighting an author node on the graph.

You can see that Campbell is linked to all of the issues of Astounding that I have read. I have started to bring my notes in for those issues. If we look at the July 1939 issue, for instance, you can see this is related to all of the stories and articles and authors in that issue:

Currently, the notes for each story are part of the story note, but I plan on breaking those out into their own Zettelkasten-style notes as I’ve done for my other notes and annotations.

Conclusions

Keep in mind that this is all being done with plain text files, something that I like because the format is compatible virtually anywhere. This could be done as easily on a Windows machine as a Mac. It could be done easily on a Linux machine. The openness and longevity of plain text (which has been around for fifty years now) is a big part of what I like about this system.

The linking that Obsidian provides from within its application makes all of this useful. But once established, those links are just as useful outside Obsidian with a little coding, as I’ve done with my API for books and authors. And this API is extensible. This week, I plan to add the capability for the API to return any annotations when returning a “book” object. So in addition to what appears in the JSON illustrated above, the result will soon contain a node for annotations related to that book.

Mostly, I am satisfied that I now have a simple way of keeping my reading notes in a useful form. They are easily searchable and easily linked. I can continue to capture highlights and brief notes as I read. The import function gives me a natural step at which to expand on my annotations as I review them after they’ve been pulled into Obsidian.

It did take me some time to get the infrastructure in place, but now that it is there, I am able to focus on reading, notes, and let the system organize them for me.

8 comments

  1. I really like so much of what you’re doing here. One question I have is whether you find the GUID reference model a little limiting? I find it useful to reference a quote using a wiki link, and that can be tricky if the link identifier is a randomised GUID.

    For example, I captured a quote by Katie Mack from their book last night, and referenced it using a regular reference to the quote that opens a new note with the actual quote, like this: https://d.pr/i/NaRgSr/3XuZQEc9Uc

    1. Paul, I get what you are saying and I’ve struggled a bit with this. A couple things have helped me. First is, as you indicated in your follow-up comment, that it is the content and context that is important, not the filename. My problem was (a) I’d have to think of a filename for everything I highlighted, which in some books, could be a lot and be time-consuming. Because I automatically link these back to the source, it creates a binding that makes it easier to find; (b) what happens when I have more than one quote about being at the center of one’s own universe? Zettelkasten method has a way for handling this, but again, it requires (i) thinking about naming and (ii) remembering that I had more than one quote.

      Second, these are text files and Obsidian (to say nothing of filesystem commands) makes these easily searchable. So I should be able to find what I am looking for with a search.

      Third, in your example, I can do something similar when clarity is required. The difference would be that I’d add a text reference to my link, like this:

      Here’s a [[4F8E0531-33DB-4327-BDE8-D2414713C42D|reference to a quote from Hamilton]]

      which, when viewed in preview mode, will hide the link GUID, and all you’ll see is “Here’s a reference to a quote from Hamilton” with the “reference to a quote from Hamilton” part linked to the quote file.

      Finally, I’ve often been adding a topic heading in the quote file itself for additional context. I don’t do this for every highlight, only those where I think it will help in finding it. But at least in this case, there isn’t a problem (i.e. filename conflict) if I give two separate files that same heading within the text.

      I still need enough of these annotations in my system before I can be sure that I can quickly search and find what I am looking for (I’ve only imported a few hundred at this point), but it feels like it will work.

  2. Just as a follow-up, I see what you’re saying about the importance of the content, and not so much the filename.

    I was in the shower when I finally had a breakthrough insight on this. I’m not searching for a filename, I’m searching for file content. If each annotation and highlight is its own file, I can link it to as many notes as makes sense. Furthermore, I can add tags to each note. The name of the file doesn’t matter. What matters is how it links to other notes, and that all files are searchable.

    Do you find it relatively easy to find notes just based on content searches?

  3. Very impressively thought out! I’ve been playing with something similar but you’ve solved a lot of problems. Any chance of a git to jump start and modify for my workflow?

  4. Got it. Thanks. I understand what you mean about cleaning up code. My standards on personal code vs what I do at work are very very different. Looking forward to seeing some of the snippets some day.

  5. Hey Jamie,

    Thanks so much for outlining this whole process! I just started using Obsidian – I was able to create my own Reading library using your process (minus the automation / Python script stuff) and I’m super excited about it!

    I have a formatting question if you don’t mind….

    I noticed the UUID file names of your annotation notes are hidden when you include them as transclusion links in the book note. How are you able to do this? It looks a lot cleaner without the file name showing and I love it! Is this something I could style directly in Obsidian rather than as part of the Python script you described? I’m not using any code with my library – I’m entering book and annotation notes in Obsidian manually. So I’m not using UUID identifiers, but I still don’t want the annotation file names to appear when linked in my book notes (which it seems like they do by default?)

    Also, I’m curious about how you use and decide on tags – it seems like you have been tagging by topic for both the annotations and the book notes. Any specific reason for this? Have you found this useful? I think I’d like to do something similar and may create an Index with only a limited number of big-picture topics to start but haven’t decided yet.

    Thanks in advance!

    1. Sarina,

      In the book note, the UUID is hidden because (a) the notes are transcluded and (b) because in preview mode, transclusion shows the full note instead of just the link. I’m guessing it was preview mode that you were thinking of. The raw note always has the UUID. If you look at the note in preview mode, you should see the transcluded note and not the title.

      As far as the tagging goes, I haven’t worked out a good methodology yet. I’m still playing around to figure out what works best for me. When using Evernote, I tend to use tags to fill in the gaps for questions like who, what, when, where. “Who” and “What” are the two questions I find tags most useful for. Who is this in reference to? What is this about?
