How do you do personal knowledge management?

This post is a bit of a brain dump / working-out-loud thing. The posts here on jcarroll.xyz are less formal than what I put on my main blog jcarroll.com.au which is more for technical articles on coding. There won’t be any code in this post. This was built up from a series of notes captured over a couple of weeks or so, which makes for a somewhat longer, bordering on rambling post, so … sorry, I guess?

I’ll also point out from the get-go that I’m not particularly looking for tool suggestions in this post - I’ve committed to using Obsidian as my note-capturing tool for at least the next couple of years to give it a fair chance.

I just finished reading Building a Second Brain by Tiago Forte and even early into that book I was convinced that I wanted to start better capturing my notes when I’m reading something. That’s not to say it was the only resource I’ve read on the topic recently - I haven’t quite finished How to Read a Book (the first edition of which was published in 1940, and the second edition - the one I’m reading - in the 1970s) which covers the concept of progressive summarisation in great detail, and was the actual first prompt for me to start taking notes, but I only did it while reading that book.

The idea of progressive summarisation is to distil the essential message of a given paragraph/chapter/article (at some scale) so that if you do ever need to refer to it again you can read that rather than re-reading literally everything in the book. As someone who does a fair amount of knowledge work, that can be important. As a scientist I was trained to use a notebook frequently, but in any process I tried to implement myself I created too much friction to use it regularly in my non-academic life.

I began by taking a lot of notes in paper notebooks - I enjoy the experience of writing and did find that on re-reading what I had written I found a lot of things that I had otherwise forgotten about, even just a week later. Paper is nice, my fountain pens are lovely to write with, and a notebook makes for a targeted place to store those notes, but it wasn’t searchable or necessarily legible all the time. “What good are notes you never read” comes up a few times around the place. I needed a digital solution both for storage and for searching/organising the notes.

As it turns out, AI LLMs are pretty good at extracting text from a photo of my handwriting - not perfect, but way better than what I remember of using a Palm Pilot with a very specific (and temperamental) handwriting recognition system. So, I could “scan” all my handwritten notes and… file them? I’d still need to text search through them every time I wanted to look up something.

Enter Obsidian - a free (but not open-source) digital note organisation tool. It’s probably familiar to users of Evernote, Notion, etc… but extremely configurable; stores everything in plaintext (mostly markdown) locally; and works on pretty much every platform. Even better, they’ve very recently dropped the price of their sync offering to US$4/month which seems very reasonable given how well it works. I briefly flirted with the idea of the other free sync options (put it all in git, gDrive, etc…) but getting these to work properly across multiple devices including a phone didn’t seem like a saving over the very reasonable price of sync. I’d used Obsidian before for notes, but just for capturing something daily, which I rarely ever referred back to.

Building a Second Brain is agnostic to the tool being used, so I figured I’d do what a lot of other people on the internet appear to have done and implement that system in Obsidian. I’m learning a lot as I go and trying not to over-complicate things, but I’m interested in hearing about how other people do this.

One piece I’m still trying to balance is the physical act of notetaking on paper and transferring to a digital tool. My understanding is that there’s a nontrivial difference in retention between writing a note and just highlighting something. My notes on Building a Second Brain were physical, while I bought the eBook for the follow-up The PARA Method and I’ll do highlighting on my Kindle Scribe which I can export to text. I can also physically write on the Scribe and export that converted text (and it still does a pretty good job of the conversion of my handwriting). I suppose we’ll see if these notes feel better captured.

To set up some context about who I am and why I want to take notes at all: I’m not an academic any more but I still do research, both for work and fun. I’m a PhD theoretical physicist who transitioned to software / data engineering in cancer immunology and now autoimmune diseases doing statistics, software development, data curation/parsing/transformation/storage, visualisation, infrastructure, API wrappers and more, primarily using R. I work on a handful of open-source projects including {ggeasy}. I’m learning as many programming languages as I can and enjoy polyglot challenges. I’m interested in personal knowledge management and knowledge graphs (and the intersection of these).

« self-promotion warning »

For what it’s worth, if that’s the profile of someone you’re looking for in your organisation, I occasionally have some short-term capacity to take on projects. I can’t recall on which podcast I heard it discussed (because I wasn’t taking notes… grrr) but there was a point about the fact that orgs don’t hire physicists for their physics knowledge; they hire them because getting a PhD in physics is a proxy for “I can do this work”. I connect up data sources via APIs and create tools to help scientists interrogate their data faster and more robustly. If you’re interested, feel free to get in touch with me.

So, with that context in mind… “how do you use personal knowledge management?” Not which tools do you use - I’m less interested in how you ingest notes and more interested in how you extract knowledge from them. We don’t insert knowledge into these tools; we use them to offload the heavy lift of storing information so that we can link things up and retrieve them on demand.

The notes I took for Building a Second Brain were a bit meta - but there were tidbits in there I probably want to refer back to at some point. I work on a set of projects for different clients and need to keep track of certain details, ideas, snippets, and notes in general for those, so a notetaking tool of some sort is essential.

I started expanding my notetaking to the rest of my life and created notes for a project I’m currently in the middle of - building some more storage for my workshop/shed. Then I realised that there were a lot of other ‘areas’ (not ‘projects’) for which I could be capturing information. I started capturing all the IT infrastructure in my house and realised how scattered all the information actually is. I remembered this post about ‘documenting your house’ and managed to dig it out of Pocket - my current ‘read later’ tool. I realised I could store the product manuals for all the devices right alongside my config notes, model numbers, and purchase date (year at least). It’s starting to feel organised, and that’s great. These bits aren’t linked in any way to my research, but that’s fine - I can have disconnected ‘landmasses’ of notes.

I started capturing the blog posts I found useful/interesting - Obsidian connects up to a web clipping tool that can capture the entire content of a post and insert it as markdown into a new note. I attended a Meetup and noted some things I wanted to look into more later. I now have a note for each Meetup group linking to notes for each of the people I frequently meet. I can write a link to anything, even if the note it points to doesn’t exist yet; if I ever create that note, the link will connect up.

I figured I could probably also capture all the books on my bookshelf, but I really didn’t want to sit there and type out all the titles manually. Thankfully, these newfangled AIs are quite good at extracting them from (rotated) images. I had a lot of success with claude.ai (once I got the prompt working). I have a note containing a bullet list of all the books on my shelf, sorted into rough categories. I did the same with my vinyl albums. Fantastic! So organised! Now if I want to refer to those in my notes, it’s easy. Plus, I can see what I do and don’t already have.

At this point I tooted on Mastodon about how much I was enjoying Obsidian and was pointed to a convenient plugin to pull in metadata about books/music/etc… from a database, which a) is extremely cool! and b) made me stop and think about what I was actually doing with this data. Yes, I could pull in all that additional information, but would I ever use it?

One of the points in Building a Second Brain was about ‘data hoarders’ - people who note everything in case they’ll ever need it, but with no plan to get any of it out again. I’m not accusing Lou here of anything like that - a good suggestion for a plugin is most welcome - but it got me thinking about how I want to landscape my knowledge garden; do I want ornamentals (nice to look at but not productive) or fruiting plants? I’ve since seen other people with curated vaults full of movies they’ve seen/reviewed and all that metadata pulled in automatically, so it has a place, but not necessarily for what I want to get out of my own vault.

I’ve dealt with a similar delicate balance before - I’m frequently tasked with annotating data (most recently of the genomic variety) such as complementing the metadata for samples from a clinical trial with metadata regarding the source subject. Where I’ve seen this done less reliably, all of that has been shoehorned into the data of the sample itself, but in more sophisticated systems it’s a lookup from sample to subject to create a join. Not every piece of metadata is always relevant, though - it depends on what you want to do with it. Some downstream users of the data I curate might be interested in the medical history or concomitant medications of the source subject when analysing a sample, but others may only be interested in the count method used for sequencing.

I could link all sorts of metadata to the entries I’ve created - I could add the artist to my albums and add a note for them. I could add the record label and the city in which they’re headquartered, and the individual band members and the other bands they’ve been in and their albums… but where does it end, and why do I want that?

Someone recently created a (knowledge) graph of “all of Wikipedia” which is both “trivial” to do and impressive - trivial because it’s just capturing the links between pages, and impressive because it’s nearly 200 million links between 63 million (English) articles. This is interesting because it shows which pages link to each other, but not why they link to each other - one of the examples in the section on paths between articles shows that to get from ‘Pokemon’ to ‘Ancient Egypt’ it’s only one stop along the way - ‘pet’ appears in the ‘Pokemon’ article and has a link to ‘Ancient Egypt’… but that graph traversal isn’t particularly useful or meaningful.

Similarly, I’m trying to figure out which links are useful to capture, and part of that is figuring out what a ‘note’ looks like. I think I’m starting to get the idea that a note should capture an atomic ‘idea’ so that links between those are useful. Finding a link to the entire Building a Second Brain book note isn’t helpful, but finding one to a particular quote from there might be. With that in mind, do I need a note capturing the artist, year, runtime, etc… for every album I own? Am I ever going to reference that? More importantly, perhaps, can’t I just search the web for that information if I need it?

In theory, I may want to search my knowledge garden for an artist or an author, but more likely I’ll want to create links between ideas in their works, and I should capture those.

More broadly, I’m starting to capture things that I’ve found troublesome to find via a web search - getting harder and harder these days with all the LLM content, SEO-optimised trash, and even human-generated content whose only purpose is engagement farming. One of the points that stuck with me from Building a Second Brain was that a resource such as one’s notes should “do the heavy lifting ahead of time so that the actual project is easier”. I’m still figuring out what a “project” is in my world (sometimes literally a client project, but other times a less-well-defined thing) but having notes to get the ball rolling sounds like a great idea.

Another salient point that I believe will guide me in the right direction was the idea that “the notes should form a working environment, not a storage environment” - notes should be “actionable” in the sense that you can do something with them. I think this comes back to the “ornamental vs productive fruit” garden framing.

When creating a new note, a question I’m trying to keep front-of-mind is “can I get out what I want from this via a query?” - that query could be a simple text search, a search for a specific tag, a backlink, or some more sophisticated dataview, but I do want it to be possible to get to the information, and would like to structure my notes to make them more amenable to those queries.

So far, when reading a paper/blog post, I’m copying in the entire text and bolding/highlighting as a means of progressive summarisation, but I believe I need to (later, not immediately, after reflection) turn any salient points from that source into their own notes. I think the name for the former is “fleeting notes” (which don’t need to be kept) and/or “literature notes” (direct highlights from the source) but these need to be distilled into their own notes at some point. I’m not really looking for the “right” solution here, more trying to understand what will be helpful in achieving success.

I’m definitely starting to appreciate the value of notes - all of the ideas for this post were captured in a note as short bullet points, getting as many down as possible, ignoring spelling. When I have a good idea it tends to flow all-too-freely and I worry that I’ll forget some interesting point I wanted to make. I feel that most (if not all) of those got captured this time, so if this post is still boring then bad luck - that was 100% of what I’ve got to offer. All that was left to do was the wordcrafting around the bullet points. I still enjoy that part, but it was reassuring to have the bullet points to guide my thoughts. It was also extremely helpful to have a bunch of recent resources at my disposal to link to - there are hopefully more useful links scattered throughout this post than usual. If you find them overwhelming, though - I made this JavaScript snippet to help you focus.

I found some comfort in the idea that my notes are for my eyes only - they don’t need to be any more polished than I want them to be because they’re not to be seen by anyone else. This also meant that I felt more comfortable leaving these ‘draft’ notes in a ‘draft’ state - I added to the notes for this post several times over a week (part of why this post is so long). The original reason for using this hosting service (micro.blog) was to be able to write faster - to get the ideas I had from my head to the world wide web as fast as possible, without going through the pull - write - build - push - check - merge cycle that my main blog requires (Rmarkdown to markdown to html to Netlify via git).

I no longer need to worry (not that I stress over things) that I’ll forget a useful point - it’s stored in my second brain. I’ve started doing this for all sorts of things I previously would simply “hope to remember later”. Thanks to the sync functionality of Obsidian, my second brain is always in my pocket and on all my desktops. I wrote this post in Obsidian because I know it will be synced to all my devices.

I’ve had more than a couple of instances since I started focusing on notetaking where I’ve dropped everything to add a note or braindump. Sometimes it’s been directly into Obsidian, other times on paper or my Scribe. One venue I haven’t quite figured out yet is the shower - a large fraction of my best ideas or debugging solutions come to me in the shower or on a walk. This isn’t uncommon, I believe - in 10 Things Software Developers Should Learn about Learning they discuss “spreading activation” and the likelihood of triggering linkages when not actively thinking about a topic. Maybe I need a waterproof tablet device? If you ask me, workplaces enforcing a ‘return to office’ policy should encourage “debugging showers”; it’s certainly useful for me working from home. Actually, it looks like I’m not the first to have this idea; there is a waterproof notepad that I might have to try out.

I likely need to get more comfortable taking audio notes both at home and on-the-go. I’ve figured out how to get my smartwatch to record these, but Obsidian has a native recording facility that I am yet to try out. I haven’t explored if it can transcribe those notes, either.

There are a few other plugins I’m keen to see developed - top of my list is a workable LLM to enable me to “ask” questions of my notes. For a while now I’ve wondered if it’s possible for an “AI” to find links between ideas I haven’t identified. I’m not quite sure what I anticipate that to look like, but perhaps “you have notes on both IgAN and obesity - a note on Akkermansia muciniphila would connect these two.”

So far, all of the local LLM plugins I’ve tried which are able to serve up all the markdown files are either ridiculously slow or fail at the simplest requests. I’m hoping this improves soon.

I’m curious if it’s useful to do spaced repetition on some of the notes I create - certainly revisiting them at a later point in time - with added life experience, new takes/perspectives, new opinions - would lead to a refined/expanded summary, but how to invoke that? I’ve managed to build a dataview search for ‘notes created in the last 7 days’ which I’ll use for a weekly review, but beyond that I may never see any given note again. I added the ‘open a random note’ tool and occasionally see what it surfaces.

All of the above was a roundabout way of asking the people who have been using these tools for longer: how do you extract knowledge from your “second brain”? As a follow-up, what did you put in place that best enabled that? What didn’t work?

If Obsidian is a “Second Brain” then perhaps other people can be a “third brain”, but you have to ask.

Comments/suggestions/advice are most welcome either here directly, on Mastodon (I’m @jonocarroll@fosstodon.org), or via email (hello@jcarroll.com.au).