Collecting Book Notes

March 28, 2023

How and why I collect notes from Kindle books.

I’m not a sophisticated reader with tote bags and cats, ripping through stacks of titles from a favorite chair. I can prove it by telling you that I didn’t really start reading for myself until I got a Kindle.

Paper

I read textbooks and novels growing up, and books on design and programming and art and philosophy in college, mostly because I had to. I enjoyed the reading, particularly when I got to Chip Sheffield’s classes on modern art, but I didn’t wander off and read books in my spare time. In these educational contexts, there was always something to respond to or write about—something that forced me to linger and describe what I learned or explored.

But aside from the smell, I didn’t develop a love for those paper things that so many avid readers clearly have.

Some were heavy, some were tattered, some came with notes and scratches and tears and stains from other people. They had character, sure, but they could be awkward to manage with one hand or read sideways. I didn’t always want to broadcast whatever I was reading to people around me who could steal a look at the cover. (Sometimes I did.)

The first Kindle I saw was the debut model, a white wedge with a keyboard that emerged from a co-worker’s handbag. (I’m pretty sure she’s one of those platinum tier readers.) Her enthusiasm for it led me to buy a later model that was flattened and rounded out, with a more subtle keyboard on its face.

This changed things.

Pretend Paper

I could suddenly carry as many books as I wanted, queue them up right where I’d sit with them, and read comfortably in any strange contortion under bright sun or a moonless night. No distraction, and battery life good enough to be an afterthought.

A later model, the one I currently have, would add waterproofing and a cover that’s grown on me, and trade the clunky physical keyboard for a clunky touchscreen keyboard.

This isn’t praise specifically for the Kindle, whose branding I’ve covered with black electrical tape so the only words in front of me are from a book’s author. It took an ungodly amount of time for them to introduce a lovely typeface because they’ve put so much effort into promotions and upsells that thankfully I don’t have to encounter most of the time and that never, ever interrupt reading.

[Image: Tightly-cropped closeup photo of the author’s Kindle, with crudely-cut black electrical tape where the word “kindle” would be on its face]

I read blog posts and online articles on an iPad, which has a lovely display and lets me seamlessly involve audio or video. But it’s awful outside with glare and a propensity to overheat and shut down in the summer. (Just like me, to be fair.) It can and does interrupt me with notifications, and battery life is something that requires mindfulness and care.

I’m okay with all this.

The only problem with this is that I spend less time at libraries, which is a shame because libraries are some of the most wonderful places on earth.

Invisible Ink

It’s way too easy for me to read a book and forget whatever made it great. If I don’t take notes or write about it, I’m left with only a vague sense of the book that, years later, is barely enough to qualify a thumb up or down.

Somewhere back in my murky personal history of reading, I found Readmill and it was wonderful. Readmill let me track my reading, manage my book lists, and squirrel away notes and highlights. It even gave me stats about my reading habits and trends. It made it easy to find other books, and other people reading similar kinds of books. It felt stylish and meticulous and … bookish? … and delightful at every turn.

We live now in a world without Readmill, because it shut down in 2014. Goodreads occupies a similar place in the universe and it’s an Amazon company, so I’ve declined to play along in the same way I avoid Facebook and LinkedIn and Twitter and those other corporate, ad-riddled social things because I refuse to be any fun.[1]

But those notes never stopped being important. I did without them for a while and started to get annoyed at what slipped away. I collected all my Readmill history and turned it into Markdown files at one point to keep the journey alive on a personal blog that continually changes platforms as I poke at it.

Siren Call

I don’t remember when I first realized it, but my big discovery was that the Kindle keeps every note, highlight, and bookmark in a single plain text file.

That’s a starter pistol for any nerd with a programming language that can manipulate strings.

Other people noticed this and wrote parsers well before me, and there’s even a SaaS offering for it, but the challenge of extracting these notes was too enticing.

I wrote a PHP script for scraping the “clippings” off the Kindle and chopping them up into bite-sized chunks of Markdown I could use to start a blog post. After many interesting failures and some surprising shifts in the Kindle’s formatting, I eventually evolved the thing into a little package I decided to call Dekindler.

I don’t actually know how a grownup engineer writes a parser, and under the hood my work is a series of careful hacks. But after playing with other parsers I noticed that people’s example input differed enough that a parser built around one file might not work on the next. I’m mostly looking at my own file too, but I wrote tests to try and accommodate whatever strange examples I could find, assuming they were all genuine. (Try out the demo with your own notes and let me know if you’ve got something that doesn’t parse well!)

This is reverse-engineering something that doesn’t have a public spec[2], so I guess this is what you do.
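
For a sense of what “chopping up” the clippings involves, here’s a minimal sketch in Python. It is not Dekindler’s code (that’s PHP), and everything about the input is an assumption based on commonly shared examples rather than any spec: a `==========` divider between entries, a `Title (Author)` heading, and a metadata line starting with `- Your Highlight`. The `Clipping` class and `parse_clippings` function are names made up for this illustration.

```python
import re
from dataclasses import dataclass

SEPARATOR = "=========="  # assumed divider line between entries


@dataclass
class Clipping:
    title: str
    author: str
    kind: str      # e.g. "Highlight", "Note", "Bookmark"
    page: str
    location: str
    added: str
    text: str


def _find(pattern: str, line: str) -> str:
    """Return the first captured group, or an empty string if nothing matches."""
    match = re.search(pattern, line, re.IGNORECASE)
    return match.group(1) if match else ""


def parse_clippings(raw: str) -> list[Clipping]:
    """Split the raw file into entries and pull out the pieces we care about."""
    clippings = []
    for chunk in raw.split(SEPARATOR):
        # Some files carry a byte-order mark in front of the heading; drop it.
        lines = [line.strip().lstrip("\ufeff") for line in chunk.strip().splitlines()]
        if len(lines) < 2:
            continue  # trailing blank chunk or something unrecognizable
        heading, meta = lines[0], lines[1]
        # Assumed heading shape: "Title (Author)"; fall back to the whole line.
        match = re.match(r"^(.*?)\s*\(([^()]*)\)$", heading)
        title, author = match.groups() if match else (heading, "")
        clippings.append(Clipping(
            title=title,
            author=author,
            kind=_find(r"Your (\w+)", meta),
            page=_find(r"page (\d+)", meta),
            location=_find(r"location ([\d-]+)", meta),
            added=_find(r"Added on (.+)$", meta),
            text="\n".join(lines[2:]).strip(),
        ))
    return clippings


# Sample input in the assumed layout, using two highlights from this post.
sample = (
    "One Long River of Song (Brian Doyle)\n"
    "- Your Highlight on page 6 | Location 91-92 | Added on Tuesday, March 7, 2023 11:00:00 PM\n"
    "\n"
    "To the overlooked and misunderstood, to compassion and grace that conquer all division.\n"
    "==========\n"
    "One Long River of Song (Brian Doyle)\n"
    "- Your Highlight on page 9 | Location 126-126 | Added on Tuesday, March 7, 2023 11:05:00 PM\n"
    "\n"
    "the bends and layers and implications and insinuations and shimmers of memory.\n"
    "==========\n"
)

for clip in parse_clippings(sample):
    print(clip.kind, clip.page, clip.location, clip.text[:30])
```

Searching for each field separately, instead of matching the whole metadata line with one pattern, is a deliberate choice here: a missing page number or a reworded line then costs one empty field rather than the whole entry.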

Dekindling Notes

I use it like this:

  1. Read a book and make highlights and notes. (Fun part.)
  2. Connect the Kindle to my Mac with a USB cable.
  3. Run the following to save the latest title to my collection of Markdown files:
cd ~/Projects/dekindler
./dekindler extract /path/to/books/ --webSafeFilenames false

This attempts to write all book details to the specified folder, with human-friendly filenames like One Long River of Song.md:

[Image: Terminal screenshot showing books skipped and new Markdown files written]

By default this skips any identical filenames so we don’t overwrite existing book notes. New files are added following this format:

---
title: One Long River of Song
author: Brian Doyle
---

# One Long River of Song

> To the overlooked and misunderstood, to compassion and grace that conquer all division. To imagination and creativity. May they flow fearlessly and endlessly.

– page 6, location 91-92, 3/7/23 at 11:00pm

> “the bends and layers and implications and insinuations and shimmers of memory.”

– page 9, location 126-126, 3/7/23 at 11:05pm

> Have you ever paid attention to Tolstoy’s language? Enormous sentences, one clause piled on top of another. Do not think this is accidental, that it is a flaw. It is art, and it is achieved through hard work. —Anton Chekhov

– page 9, location 138-140, 3/7/23 at 11:07pm

<!-- (...) -->

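The writing step described above (one Markdown file per book, skipping any filename that already exists) can be sketched roughly like this in Python. Again, this is an illustration and not Dekindler’s PHP: `render_book` and `write_book` are hypothetical names, and clips are assumed to arrive as simple `(text, attribution)` pairs.

```python
from pathlib import Path


def render_book(title: str, author: str, clips: list[tuple[str, str]]) -> str:
    """Build front matter, a heading, and one blockquote per clip."""
    lines = ["---", f"title: {title}", f"author: {author}", "---", "", f"# {title}", ""]
    for text, attribution in clips:
        lines += [f"> {text}", "", f"– {attribution}", ""]
    return "\n".join(lines)


def write_book(folder: Path, title: str, author: str, clips: list[tuple[str, str]]) -> bool:
    """Write '<title>.md' unless it already exists. Returns True if a file was written."""
    target = folder / f"{title}.md"  # human-friendly filename, spaces and all
    if target.exists():
        return False  # leave existing notes alone
    target.write_text(render_book(title, author, clips), encoding="utf-8")
    return True
```

A real tool would also have to deal with titles containing characters that aren’t valid in filenames, which this sketch ignores.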
I start a new blog post by copying and pasting this file’s contents and adding details specifically for my little blurbs:

---
title: One Long River of Song
author: Brian Doyle
summary: Collection of Brian Doyle’s essays that celebrate his life and work.
rating: 5
state: Finished
pubDate: 2023-03-26
url: https://app.thestorygraph.com/books/7e413656-a291-4041-8a2c-83288e239ea5
---

# One Long River of Song

> To the overlooked and misunderstood, to compassion and grace that conquer all division. To imagination and creativity. May they flow fearlessly and endlessly.

– page 6, location 91-92, 3/7/23 at 11:00pm

<!-- (...) -->

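I do this step by hand, but if you wanted to script it, adding fields to the front matter is mostly string handling. A small Python sketch, assuming the file starts with a `---` line and that values are simple enough not to need YAML quoting (`add_front_matter` is a made-up helper, not part of Dekindler):

```python
def add_front_matter(markdown: str, extra: dict) -> str:
    """Insert 'key: value' lines just before the closing '---' of the front matter."""
    lines = markdown.split("\n")
    if not lines or lines[0] != "---":
        raise ValueError("expected front matter at the top of the file")
    closing = lines.index("---", 1)  # raises ValueError if there is no closing line
    new_lines = [f"{key}: {value}" for key, value in extra.items()]
    return "\n".join(lines[:closing] + new_lines + lines[closing:])


note = (
    "---\n"
    "title: One Long River of Song\n"
    "author: Brian Doyle\n"
    "---\n"
    "\n"
    "# One Long River of Song\n"
)
print(add_front_matter(note, {"rating": 5, "state": "Finished", "pubDate": "2023-03-26"}))
```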
Then I can sit with the clips I took and summarize my thoughts, maybe using some of those quotes. I like reviewing them all even if none make it to the resulting blurb. It can be a chance to take other related notes, seed journal entries, or add mentioned books to my reading list.

It’s been a chance to play with the Pest testing framework and the Symfony Console Component, and fun to tinker with. It can export a single JSON file instead of individual Markdown files, and I’ve been playing with a way to browse collected books and export information for one at a time.
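
To give a rough idea of what a single-file JSON export could look like, here’s a Python sketch that groups clips by book. The shape of the output (a list of books, each with a `clips` array) is an assumption made for this example and may not match what Dekindler actually produces; `to_json` is a hypothetical name.

```python
import json


def to_json(clips: list[dict]) -> str:
    """Group clips by (title, author) and serialize everything as one JSON document."""
    books = {}
    for clip in clips:
        key = (clip["title"], clip["author"])
        book = books.setdefault(
            key, {"title": clip["title"], "author": clip["author"], "clips": []}
        )
        book["clips"].append(
            {"text": clip["text"], "page": clip["page"], "location": clip["location"]}
        )
    # ensure_ascii=False keeps curly quotes and accents readable in the output.
    return json.dumps(list(books.values()), indent=2, ensure_ascii=False)


example = [
    {"title": "One Long River of Song", "author": "Brian Doyle",
     "text": "To the overlooked and misunderstood", "page": "6", "location": "91-92"},
    {"title": "One Long River of Song", "author": "Brian Doyle",
     "text": "the bends and layers", "page": "9", "location": "126-126"},
]
print(to_json(example))
```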

Have a look at the project if you want. I’d love to know what you think!

Footnotes

  1. I recently found The StoryGraph though, so there’s hope! It’s an app from an independent team and I regret that it took me so long to catch on. 

  2. But how embarrassing if it does and I’ve missed it.