This is the second in a new series describing ark, a command-line-based personal archive system I have developed and am using as my primary archiving tool. If you are not interested in this type of tech post, feel free to skip it. For those who are interested, I plan to provide a new entry in the series each Tuesday for the next 13 weeks.
“Pivot! Pivot! PIVOT!” — Ross Geller
I had planned to write about ark’s store and how it manages the items in the archive today. But it occurred to me that in writing about ark last week, I didn’t really show any of the system. Instead, I told a lot about the system. And as they say in the writing world: show, don’t tell. So this week I’ve opted to show what ark looks like so you get a tour before seeing the plumbing. Part 1 made the case for the archivist’s lens. Part 2 shows what it looks like. I’ve cast this as seven “scenes” through a day: not necessarily a literal day, but seven realistic moments stitched into one. The post looks long only because of the archive output for the examples I’ve provided.
Scene 1 – Morning: Catching Up
ark holds a wide range of items from throughout my life. I need an easy way to see what’s there for a given day, and this is where the ark day command comes in handy. Here is what it looks like for the day before I began writing this post:
ark day yesterday
# Saturday, May 30, 2026 · yesterday
📍 Arlington, VA · 🌤 Partly cloudy 73°/63°F
💪 7,405 steps 6h 37m sleep 72 bpm HRV 38 ms SpO2 95.4% VO2max 35.3
## Calendar (2 events)
[1] dare to dream *Family*
[2] zach soccer *Family*
## Diary (1 entry)
[3] Evening Notes A beautiful day today. From the time I woke up and headed out for my walk, through dinner on the deck with Kelly and the…
## Reading (3)
[4] *The Complete Essays* by Michel de Montaigne (started today)
[5] *In Memory Yet Green* by Isaac Asimov (finished today ✓) ✎4
[6] *Proto* by Laura Spinney (started 2 days ago)
## Reminders (1 open)
[7] Check into my flight *My Tasks* ⚠ overdue 1 day
## Email (37 messages — none from known contacts, 30 filtered)
… 37 other messages (use --all-emails to see all)
## People
[8] Kelly R. (in 3 documents today)
[9] Eric S. (in 2 documents today)
[10] Lisa K. (in 1 document today)
[11] Zachary R. (in 1 document today)
## Messages (7 messages)
[12] Kelly R. — (+2 messages)
[13] Eric S. — Yeah but mine's not on your... (+1 message, group)
[14] Zachary R. — https://www.youtube.com/watch?v=0h2jZtuRuic
[15] Lisa K. — Maybe he’s role playing (group)
A few things to note here. First, ark is pulling information from a variety of sources in the archive. The header contains the date (and how long it has been from that date to the time the command was run). It has my location, which ark captures either explicitly through artifacts in the archive, or through an inferred default location over time. The health information (steps, sleep, etc.) comes from Apple Health.
Seven sections follow that try to provide a high-level picture of the day, with calendar events and diary entries leading the way. The numbers ([1]) are sigils that can be used with other ark commands in place of longer document IDs. So I can type ark show @3 to see the full diary entry.
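One way to picture how a sigil might work is as a small lookup table rebuilt each time a command prints numbered results. The Python below is only a sketch and not ark's code: the `[N] id ...` line layout is borrowed from the ark search output in Scene 3, and the names `build_sigil_table` and `resolve` are hypothetical.

```python
import re

# Assumed line layout: "[N] <doc_id> <rest of line>", as in the ark search output.
SIGIL_LINE = re.compile(r"^\[(\d+)\]\s+(\d+)\b")


def build_sigil_table(output_lines):
    """Map each sigil number to the document ID printed beside it."""
    table = {}
    for line in output_lines:
        match = SIGIL_LINE.match(line.strip())
        if match:
            table[int(match.group(1))] = int(match.group(2))
    return table


def resolve(ref, table):
    """Turn a reference like '@2' into a document ID; plain IDs pass through."""
    if ref.startswith("@"):
        return table[int(ref[1:])]
    return int(ref)


lines = [
    "[1] 557281 imessage 2025-12-19 iMessage to High School Gang",
    "[2] 151377 pdf 2025-08-09 Holiday Inn Express Neptune Receipt",
]
table = build_sigil_table(lines)
print(resolve("@2", table))      # 151377
print(resolve("557281", table))  # 557281
```

The point of the design is that the short number only has meaning relative to the most recent listing, while the full ID stays stable.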
Reading is important to me, so I can see what I was reading on that day, including what I started and what I finished. Any reminders from Apple Reminders show up for that day. Email appears too, but note that this view is filtered to show only mail from known contacts; despite my getting 37 messages, none were from known contacts that I’d want appearing here.
There is a list of people I interacted with on that day, or who appear in documents that were ingested on that day, followed by a list of text messages (or message groups) that took place on that day.
The command provides an easy way to get a view of any day in the archive. I could also have ark summarize the day for me via ark day 2026-05-30 --summarize. Like most ark commands, there are plenty of options for filtering and controlling the output. Here is what ark day --help looks like:
usage: ark day [-h] [--on-this-day] [--deep] [--all-emails] [--all-calendars] [--verbose] [--full] [--format {table,json}] [--bundle] [--summarize] [--regenerate] [DATE]

Compact daily digest — every item gets a [N] sigil for follow-on commands.

positional arguments:
  DATE                   Date to view (YYYY-MM-DD, 'today', 'yesterday'). Default: today

options:
  -h, --help             show this help message and exit
  --on-this-day          Include documents from the same date in other years
  --deep                 Include low-priority items (music plays, browser visits)
  --all-emails           Show all emails, ignoring contact and blocklist filters
  --all-calendars        Show events from all calendars, ignoring the day.calendar.include filter
  --verbose              Full narrative output (old default behavior)
  --full                 Show full document text instead of excerpts (--verbose mode)
  --format {table,json}  Output format: table (default) or json
  --bundle               Output as ark bundle format (pipe to ark task)
  --summarize            Display or generate a narrative summary of the day via LLM
  --regenerate           Force regeneration of the day summary (implies --summarize)

Subcommands:
  ark day rank [DATE] <-2..+2> [--note NOTE]   # set the Jim-Collins-style day-rank
  ark day rank [DATE] --clear                  # remove a rank
  ark day rank [DATE]                          # show the rank for a date
  ark day ranks [--since 30d|--year YYYY]      # chart + summary of recorded ranks

See 'ark day rank --help' or 'ark day ranks --help' for details.
Scene 2 – Triage: What Wants Attention
With a critical mass in the archive, and new items always coming in, it was a logical next step to make use of all of that information. After reviewing the day, I might make use of ark’s “stewardship” layer. ark reviews items in the archive, as well as items coming into the archive, and identifies things that require my attention or action. From the CLI I can get a digest of this as follows:
ark prepare digest
# Stewardship digest — 2026-05-31
_act today: 0 · aging: 2 · outbox: 0 · stalled: 0 · review: 1_
**Yesterday (2026-05-30) unranked.** Rate it: `ark day rank yesterday <-2..+2>` (or skip — auto-clears once ranked).
## Aging into Q1
- **[orange] Pay Arlington County utilities bill due 06/16/26** — Arlington County Utilities Services Office age *10d* · Q2 · *reminder* (`ark prepare show 722866`)
- **[orange] Complete patient experience survey for my doctor's office** — (medical provider) age *8d* · Q4 · *reminder* (`ark prepare show 724994`)
## Review (low-confidence classifications)
- **Prepare for upcoming trip to Boston per American Airlines reminder** — American Airlines conf *0.70* · *reminder* (`ark prepare show 727451`)
or in TUI:

Here, I can see at a glance things that require my attention and take action on them¹. ark uses the Eisenhower matrix to sort actions into four quadrants: (1) Do First, (2) Schedule, (3) Delegate, and (4) Delete. That is what Q1, Q2, etc., represent.
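For readers who have not met the Eisenhower matrix, it reduces to two yes/no questions: is the thing urgent, and is it important? The Python sketch below shows the standard mapping onto the four quadrants. How ark itself decides urgency and importance is not covered here, so the function name and its two boolean inputs are assumptions made purely for illustration.

```python
# Quadrant labels as described in this post.
QUADRANT_LABELS = {
    "Q1": "Do First",
    "Q2": "Schedule",
    "Q3": "Delegate",
    "Q4": "Delete",
}


def eisenhower_quadrant(urgent: bool, important: bool) -> str:
    """Standard Eisenhower mapping from (urgent, important) to a quadrant."""
    if urgent and important:
        return "Q1"
    if important:
        return "Q2"  # important but not urgent
    if urgent:
        return "Q3"  # urgent but not important
    return "Q4"      # neither urgent nor important


q = eisenhower_quadrant(urgent=False, important=True)
print(q, QUADRANT_LABELS[q])  # Q2 Schedule
```

Read this way, the "Aging into Q1" heading in the digest describes important items that are becoming urgent as their due dates approach.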
ark knows how to dedupe these items as they come into the archive so that if I get 3 reminder emails about the same thing, the thing only shows up once in the digest (even though all 3 emails are stored in the archive).
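A simple way to get this kind of behavior is to collapse incoming items onto a normalized key while keeping a pointer to every source document. The Python below is a sketch under that assumption only: ark's real matching logic is not described in this post, and `normalize`, `dedupe`, and the sample subjects and IDs are hypothetical.

```python
import re


def normalize(subject):
    """Reduce a subject line to a comparison key (assumed heuristic)."""
    key = subject.lower()
    # Drop leading prefixes such as "Re:", "Fwd:" or "Reminder:".
    key = re.sub(r"^\s*((re|fwd?|reminder)\s*:\s*)+", "", key)
    # Collapse punctuation and whitespace runs to single spaces.
    key = re.sub(r"[^a-z0-9]+", " ", key)
    return key.strip()


def dedupe(items):
    """Group items by normalized subject; every source ID is kept."""
    groups = {}
    for item in items:
        key = normalize(item["subject"])
        entry = groups.setdefault(key, {"action": item["subject"], "sources": []})
        entry["sources"].append(item["id"])
    return list(groups.values())


emails = [
    {"id": 101, "subject": "Pay utilities bill"},
    {"id": 102, "subject": "Reminder: Pay utilities bill"},
    {"id": 103, "subject": "RE: Pay Utilities Bill!"},
]
digest = dedupe(emails)
print(len(digest))           # 1
print(digest[0]["sources"])  # [101, 102, 103]
```

All three messages stay in the archive; only the digest entry is collapsed.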
Working through the day, ark prepare digest is a nice follow-up to looking at the day or the previous day to get the context of what happened and what needs to happen.
Scene 3 – Looking Back: Finding What’s In the Archive
Let’s face it, a big reason for having an archive is so that you can search it. ark has sophisticated search capabilities that allow for a rich search experience. First, ark can do full-text searches of anything in the archive, whether text, PDFs, Office documents, email, text messages, annotations, even photos. That’s because when something enters the archive it goes through a process that “enriches” it, making it searchable.
Let’s start with a regular full-text search. Here, I am going to search for items in the archive with the words “tax 2025”. The other parameters in the command tell it to limit the results to 10 items, to leave out snippets of the items, and to show how long it took the search to run:
ark search "tax 2025" --limit 10 --no-snippet --timing
Searching: 'tax 2025'
[1] 557281 imessage 2025-12-19 iMessage to High School Gang — 2025-12-19: He can now claim his tax-free…
[2] 151377 pdf 2025-08-09 Holiday Inn Express Neptune Receipt — 2025-08-07
[3] 152738 pdf 2025-01-07 Holiday Inn Express Florence Hotel Receipt — 2025-01-06
[4] 413583 pdf 2026-03-01 2025 St. Ann Faith Formation Donation Tax Statement
[5] 149508 pdf 2026-01-02 Holiday Inn Express and Suites Boynton Beach West — 2025-12-31
[6] 151497 pdf 2025-08-02 Holiday Inn Express and Suites Dickson City-Scranton Receipt — 2025-08-01
[7] 165270 email 2026-03-01 2025 Tax documents have been uploaded!
[8] 165269 email 2026-03-01 Re: 2025 Tax documents have been uploaded!
[9] 413659 pdf 2026-03-25 2025 Tax Summary
[10] 146864 email 2026-03-19 Prepare for tax time - access your 2025 prescription spending report
# 10 results
# query: 295ms
Not a bad mix of items, but probably not quite what I was looking for, either. The search ran in about three-tenths of a second, but the results were meh. I was really looking for tax-related documents, not documents with the words “tax” and “2025” in them.
This is where a semantic search comes in. Semantic searches allow searching for meaning.
ark search "documents I'll need at year's end" --semantic --from 2026-01-01 --timing --limit 10
Semantic search: "documents I'll need at year's end"
[1] 148828 email 2026-01-23 Your 2025 tax documents are ready
[2] 413602 pdf 2026-03-01 2025 Tax Organizer - Completed
[3] 165270 email 2026-03-01 2025 Tax documents have been uploaded!
[4] 708408 pdf 2026-05-12 Brokerage Tax Package 10 — 2024 Tax Year
[5] 145936 email 2026-01-08 Important information about the year-end tax statement for your ho…
[6] 165269 email 2026-03-01 Re: 2025 Tax documents have been uploaded!
[7] 165261 email 2026-03-26 Re: 2025 Tax documents have been uploaded!
[8] 614083 calendar_event 2026-02-11 discuss CDMS remaining use cases
[9] 413598 pdf 2026-02-04 Form 1099-G — 2024 Tax Year
[10] 413659 pdf 2026-03-25 2025 Tax Summary
# 10 results
# query: 1052ms
In this case, I searched for “documents I’ll need at year’s end”, added a filter starting from the beginning of the year, and added the --semantic parameter to tell ark to perform a semantic search. The results are much closer to what I was looking for: compared with the simple full-text search, they are tax documents or tax-adjacent documents. The trade-off is (a) more time during ingestion to give each item embeddings² that can be used by semantic search, and (b) a slightly longer time to execute the query (because it uses AI): just over 1 second compared to about 0.3 seconds. But even against an archive with 700,000 items, it is a negligible change in speed and the value of the quality of the search far outweighs it.
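To give a feel for what "searching for meaning" involves, here is a toy Python sketch of ranking by cosine similarity, a common way to compare embeddings. It is not how ark is implemented as far as this post says: the three-number vectors are invented by hand for illustration (real embeddings come from a model and have hundreds of dimensions or more), and `cosine_similarity` and `semantic_search` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, docs, limit=10):
    """Return document titles ordered from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in docs.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:limit]]


# Invented toy vectors; the titles are taken from the search results above.
docs = {
    "2025 Tax Summary": [0.9, 0.1, 0.0],
    "Holiday Inn Express Neptune Receipt": [0.2, 0.9, 0.1],
    "discuss CDMS remaining use cases": [0.1, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for "documents I'll need at year's end"

print(semantic_search(query, docs, limit=2))
```

Notice that the query shares no words with "2025 Tax Summary"; the match comes entirely from the vectors pointing in a similar direction.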
Scene 4 – A Conversation with My Archive via Claude Desktop (MCP)
With more than 7,300 posts here written over 20+ years, I can’t always remember if I’ve written about something before. For instance, as I write this, I am about to head up to Boston for a conference. In reviewing the profiles of the various attendees, I couldn’t help being bemused by the wild garden of jargon on display. Immediately I thought: great topic for the blog. But I’ve written a lot over the years and it was entirely possible I’d written about this before. Since all of my blog posts are captured in ark, I decided to check there first.
One of the great things about the design of the archive is that many of the commands in the system are exposed as tools to LLMs via an MCP server. That means I don’t always have to sit at the command line. I can go into Claude Desktop, for instance, and ask it directly: “I want to write a blog post about how business jargon annoys me. What have I written about jargon in the past?”

If you can’t read the screenshot, here is what Claude said in answer to my question:
Searched memory
The conversation search didn’t turn up anything specifically about jargon. Let me try the ark search directly, since that’s where your blog posts live.
Used ark integration, loaded tools
There’s a very promising hit: “Meaningless Statements, Vague Gibberish, and Useless Qualifications” (2021) and “What not to do in a resume” (2006). Also that “AI jargon” email from 2025. Let me pull the relevant ones.
Used ark integration
You’ve got two solid posts in the archive that directly touch on jargon. Here’s what I found:
“Meaningless Statements, Vague Gibberish, and Useless Qualifications” (March 10, 2021) — Your most directly relevant piece. It was prompted by a spam email pitching IT services that was so thick with buzzwords it meant nothing. You used the Foundation passage where Holk’s symbolic logic analysis of a politician’s speech finds, after eliminating all the meaningless statements and vague gibberish, that nothing is left. You particularly skewered “talk synergies,” looked it up in Merriam-Webster, and imagined an email parser that auto-replies with Strunk & White’s Rule 17: Omit needless words. The post got three comments.
“What not to do in a resume” (November 30, 2006) — This is a broader piece about resume writing, but item #4 is explicitly about jargon: “Don’t overuse jargon.” You wrote that you grow annoyed when bullet points read like a foreign language, and gave the example “Implemented best-of-breed, open-standards ITIL-based WEP-encryption application” as something that sounds ridiculous even if true.
So you’ve touched on this twice — once obliquely (resumes) and once head-on (the spam email). The 2021 post has the richest material, with the Asimov hook being particularly strong. For a new blog post, you might want to build on that Foundation framing more broadly, or shift the lens from spam email specifically to workplace communication generally. The 2021 post was reactive; a new one could be more systematic — here’s what jargon actually costs us, here are the worst offenders, here’s what plain language looks like instead.
So right away, I know that I’ve already written about jargon on the blog at least twice before, the last time in 2021 — so maybe it is time for something new.
I wanted to point out the line where Claude said “Used ark integration, loaded tools.” When Claude couldn’t find anything in our conversations about jargon, it immediately knew to use the ark tool provided via the MCP integration to check the archive, and it used the correct search terms to find previous blog posts about jargon. All I had to do was ask my question.
This is easier to use than the search commands on the CLI, but they also serve different purposes. The CLI tools can be used in conjunction with other tools and commands through pipes, redirects, and scripts. Using Claude lets me interact with the archive as part of other work I am doing. It also allows me to follow up on things. For instance, I might wonder how my writing style changed in the 15 years that separate the two posts.

It is cut off in the screenshot, but here is the short version at the end of Claude’s response:
Short version: 2006 Jamie is a practical manager sharing hard-won tips. 2021 Jamie is a writer using an annoyance as a lens to say something about language, meaning, and the value of clarity — and reaching for Asimov to make the point stick.
Keep in mind, I could do this with any of the 700,000 documents in the archive. That said, there are some limits with the MCP, all intentionally imposed.
- All documents in the archive get a sensitivity level. Any documents with a “Restricted” sensitivity are never returned to the MCP. These documents never leave the local machine. If I want LLM summaries of these documents, I can get them from the CLI, which will fall back to a local LLM that runs on my laptop.
- Documents marked “Sensitive” are swept for sensitive information and that information is redacted before returning it to the MCP. For instance, if an SSN, account number, phone number, etc., appears in a document, those all get replaced with “[REDACTED]” before the document is returned to the MCP.
- All documents are swept for a known list of sensitive information (SSNs, etc.) and are redacted before leaving the local machine, regardless of sensitivity.
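The rules in this list can be pictured as a small gate in front of the MCP: restricted documents are refused outright, and everything else is swept for known patterns before it leaves the machine. The Python below is a sketch of that idea only. The two regular expressions (US-style SSN and phone formats), the function names, and the document layout are assumptions for illustration and are not ark's actual pattern list.

```python
import re

# Assumed example patterns; a real list would be longer and more careful.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # SSN shaped like 123-45-6789
    re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),  # phone shaped like 703-555-0100
]


def redact(text):
    """Replace every sensitive match; return the new text and the match count."""
    total = 0
    for pattern in SENSITIVE_PATTERNS:
        text, count = pattern.subn("[REDACTED]", text)
        total += count
    return text, total


def prepare_for_mcp(doc):
    """Return text that is safe to send, or None if the document must stay local."""
    if doc["sensitivity"] == "restricted":
        return None
    redacted_text, _ = redact(doc["text"])
    return redacted_text


sample = {"sensitivity": "normal", "text": "SSN 123-45-6789, call 703-555-0100"}
print(prepare_for_mcp(sample))  # SSN [REDACTED], call [REDACTED]
```

Applying the sweep to every document, whatever its sensitivity level, means a mislabeled document still gets the basic protection.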
Still, I can converse with about 90% of the documents that are in the archive, and that can be fun. For instance, I asked Claude what music I was listening to on past May 31sts³, and it came back with this:

The best part about the MCP integration, for me, is that I use Claude as a kind of personal assistant and with access to my archive, Claude is much more effective in doing things for me because it has a rich context to draw from.
Scene 5 – Adding My Voice: The Annotation System
It is one thing to have artifacts in an archive. But real value is added to the archive when you can layer in context to those artifacts. This is where the annotation system comes in. When items are pulled into the archive, they become read-only and cannot be changed⁴. The annotation system provides a layer atop the item itself to provide context to that item. When showing that item in ark, any annotations that are associated with the item are displayed as part of the output. Annotations are available in searches, and there are multiple ways of making annotations:
- Directly through the CLI via the `ark annotate` command. This command uses the default editor set in your shell⁵.
- Redirected into the `ark annotate` command. For instance, if I made a bunch of notes in a file called `notes.txt`, I could do this at the command line: `cat notes.txt | ark annotate document 1234` to add the annotation to document 1234.
- Via the MCP. Yes, I can ask Claude to annotate for me.
I have a book club meeting this evening. Our book was Wild Dark Shore by Charlotte McConaghy. ark captures my books and reading so I can search for the book in ark as follows:
ark search "wild dark shore" --type book
Searching: 'wild dark shore'
[1] 725665 book Wild Dark Shore ✎2
    ↳ Wild Dark Shore Charlotte McConaghy A family on a remote island. A mysterious woman washed ashore. A rising storm …
# 1 result
That [1] on the line with the matching record is a “sigil”. It is a shortcut reference to the item that I can use in future commands (until another search is run) instead of having to provide the full ID (725665 in this case) each time I want to refer to it. When a book is added, ark searches sources on the Internet to pull in metadata about the book. When I look at the record via ark show, here is what I see:
ark show @1
id: 725665
title: Wild Dark Shore
doc_type: book
record_class: document
date_authored:
date_precision: undated
date_ingested: 2026-05-24T14:29:23.876608+00:00
metadata_updated: 2026-05-24T14:31:12.699497+00:00
series:
sub_series:
sensitivity: normal
priority: normal
starred: no
tags: bookclub
slug:
physical_location:
source:
source_path: ark:/books/work
store_path:
sha256: 7b3a2e46382a6c857413db215adb76e6192fe47d9f0c2658dd36b7b5406d2366
doc_status: active
record_origin: unknown
origin:
last_accessed: 2026-05-31 14:24:32
access_count: 8
author: Charlotte McConaghy
published: 2025
genre: Mystery
isbn: 9789025477356

Wild Dark Shore Charlotte McConaghy A family on a remote island. A mysterious woman washed ashore. A rising storm on the horizon.

Dominic Salt and his three children are caretakers of Shearwater, a tiny island not far from Antarctica. Home to the world’s largest seed bank, Shearwater was once full of researchers, but with sea levels rising, the Salts are now its final inhabitants. Until, during the worst storm the island has ever seen, a woman mysteriously washes ashore.

Isolation has taken its toll on the Salts, but as they nurse the woman, Rowan, back to strength, it begins to feel like she might just be what they need. Rowan, long accustomed to protecting herself, starts imagining a future where she could belong to someone again.

But Rowan isn’t telling the whole truth about why she set out for Shearwater. And when she discovers sabotaged radios and a freshly dug grave, she realizes Dominic is keeping his own secrets. As the storms on Shearwater gather force, they all must decide if they can trust each other enough to protect the precious seeds in their care before it’s too late―and if they can finally put the tragedies of the past behind them to create something new, together.

A novel of breathtaking twists, dizzying beauty, and ferocious love, Wild Dark Shore is about the impossible choices we make to protect the people we love, even as the world around us disappears.

Mystery Thriller Literary Fiction

--- annotations ---
[23] 2026-05-31 11:21 Initial thoughts after reading:
- characters: Rowan, Dom, Raff Orly, Fen
- a kind of Robinson Crusoe or Swiss Family Robinson about the story, only darker
- did a very good job for the first two-thirds at playing out the mystery
- last third of the book took too many zigs and zags and was on the verge of becoming parody
- loved the setting of the book
- wondered why there would only be one place in the world where the seeds would be kept for rebuilding civilization; one would think there would be multiple places; also why one an island as wracked and ravaged as this one.
- odd to have one chapter from Alex POV -- it throws off the rhythm of the story; as if author could not come up with any other way of conveying that info
[24] 2026-05-31 11:30 **Book Club Discussion Notes — May 31, 2026**
*Initial reactions:* Strong first two-thirds that sustains mystery well; final third overloaded with reversals, verging on parody. The island setting is the book's most consistently successful element, grounded in McConaghy's research on the real Macquarie Island....
After all of that metadata for the book record, note the section called --- annotations ---. This is where I have gone and added annotations to this particular book record. I’ve added two, as you can see, but I can add as many as I want. These are not part of the book record, but rather are linked to the book record. The ark show command is pulling all of this together when it shows the book to me.
Adding the annotation is as simple as running the command. Or, asking Claude to do it for you.

Here, not only was Claude easily able to find what I was looking for, but once found, it summarized it as requested, and then added the annotation that I asked for. For me, this turned annotation from a chore into something easy. I can sit with my book and as I read, tell Claude to add various annotations to the book for the thoughts or passages I want to comment on.
Scene 6 – ark quality: The Archive That Knows Where it is Weak
Archivists work to ensure the quality of the archives they manage. That not only means preserving the archive but enriching it with as much information as they can about the contents of the archive. Each item in ark has a certain set of metadata. It has the date of the item, the date it came into the archive, the source it came from, its location in the real world (if it is not digital in the first place). Each item has a title, a series and sub-series, a format, a sensitivity level. Items can have OCR data, embeddings, generated descriptions, and more. With all of that possible data surrounding an item in the archive, some way is needed to assess the quality of items in the archive.
Enter the ark quality command.
This command provides a way of looking at the archive as a whole and seeing the overall quality of its items. It also allows for looking at a specific item and seeing the quality of that item. Quality is a score from 0 to 100 that measures how complete the picture of the item (or archive) is. An item with no metadata at all rates a 0. An item with every possible element filled in rates 100. Here is a quality summary of the entire archive:
ark quality summary
=== ark quality — Archive Completeness ===
Total documents: 681,576
Median score: 76/100

Score distribution:
  0-25 11,116 (1.6%)
  26-50 ███ 85,298 (12.5%)
  51-75 ███████ 178,002 (26.1%)
  76-100 █████████████████ 407,160 (59.7%)

Median score by type:
  attachment ████ 24/100
  subscription ███████ 35/100
  reading_started ████████ 40/100
  calendar_event █████████ 47/100
  cli_command █████████ 47/100
  reminder █████████ 47/100
  note ██████████ 53/100
  video ██████████ 53/100
  reading_finished ████████████ 60/100
  code_file █████████████ 65/100
  conversation █████████████ 65/100
  imessage █████████████ 65/100
  outbox_draft █████████████ 65/100
  text █████████████ 65/100
  image ██████████████ 70/100
  browser_visit ██████████████ 73/100
  timeline_event ██████████████ 73/100
  watch_event ██████████████ 73/100
  blog_comment ███████████████ 76/100
  blog_page ███████████████ 76/100
  book ███████████████ 76/100
  day_summary ███████████████ 76/100
  email ███████████████ 76/100
  facebook-post ███████████████ 76/100
  message ███████████████ 76/100
  office ███████████████ 76/100
  pdf ███████████████ 76/100
  purchase ███████████████ 76/100
  review ███████████████ 76/100
  tweet ███████████████ 76/100
  git_commit █████████████████ 87/100
  music_play █████████████████ 87/100
  weather_snapshot █████████████████ 87/100
  action_item █████████████████ 88/100
  blog_post █████████████████ 88/100
  health_day █████████████████ 88/100
  playlist █████████████████ 88/100
  diary_entry ████████████████████ 100/100
Here you can see that about 60% of the items in the archive fall into the top band of quality (scores of 76-100). The highest-quality class of items is diary entries. The lowest-quality class is attachments. This makes sense because attachments are pulled in from email messages and, outside of getting descriptions from vision models and embeddings on those descriptions, they have almost no other metadata.
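A completeness score of this kind can be pictured as the share of expected metadata fields that are actually filled in. The Python sketch below assumes every field counts equally, which is almost certainly simpler than what ark does (the scores in the summary suggest some weighting). The field list and the name `quality_score` are hypothetical.

```python
# Hypothetical field list, loosely based on the metadata named in this scene.
FIELDS = [
    "title", "date_authored", "date_ingested", "source",
    "series", "sub_series", "sensitivity", "embedding",
]


def quality_score(item):
    """Percentage of expected fields that hold a non-empty value (equal weights)."""
    filled = sum(1 for field in FIELDS if item.get(field) not in (None, "", []))
    return round(100 * filled / len(FIELDS))


bare_attachment = {"title": "Nuv.doc", "date_ingested": "2026-05-24"}
print(quality_score(bare_attachment))  # 25
```

An item with nothing filled in scores 0 and one with every field filled scores 100, matching the two endpoints described above.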
Anything that comes into the archive has a title. Often it is simply the name of the file that was ingested, and we all know that filenames can be ambiguous or provide little useful information about what the file contains. ark can assess the quality of titles as well, for instance:
ark quality titles --limit 20
=== ark quality titles — 20 document(s) with bad titles ===
ID     Reason           Type        Date  Title
─────  ───────────────  ──────────  ────  ──────────────────────────────
3792   filename w/ ext  attachment  —     Suggestions_for_OS_8_image.doc
4048   filename w/ ext  attachment  —     Nuv.doc
4245   filename w/ ext  attachment  —     database.doc
4326   filename w/ ext  attachment  —     OS_8_seminar_outline.doc
5032   filename w/ ext  attachment  —     Proposal.doc
5342   filename w/ ext  attachment  —     g970527.doc
5343   filename w/ ext  attachment  —     g970515.doc
9101   filename w/ ext  attachment  —     updaterr.doc
11223  filename w/ ext  attachment  —     REV980617ACOSTEDBOM1.XLS
11608  filename w/ ext  attachment  —     adopters2.xls
11937  filename w/ ext  attachment  —     Development.xls
12092  filename w/ ext  attachment  —     kpatel.doc
12859  filename w/ ext  attachment  —     BDY.RTF
14769  filename w/ ext  attachment  —     98issues.doc
15137  filename w/ ext  attachment  —     michael.doc
15326  filename w/ ext  attachment  —     Invoice1.xls
16099  filename w/ ext  attachment  —     Dun12.doc
16200  filename w/ ext  attachment  —     Lanrover.doc
16533  filename w/ ext  attachment  —     %adopters.xls
16534  filename w/ ext  attachment  —     adopters.xls
20 document(s) shown. Use 'ark enrich titles' to fix with LLM-proposed replacements.
Note that in addition to listing items in the archive with poor quality titles, the command also tells you how you can raise the quality of those titles using ark enrich titles. Automated enrichment is a key element of ark. It uses LLMs to read the document and give a descriptive title to the document based on the context. Each document class has its own template of title format. So for instance a letter might have the format “Sender-to-recipient (date)” and the LLM will produce titles in that format so that they are consistent across that class of items, raising the quality and consistency of the archive.
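The "filename w/ ext" reason in the listing is the kind of check that needs no AI at all. As a rough illustration, the Python sketch below flags titles that look like a bare filename with an extension. The regular expression and the name `title_problem` are assumptions of mine; ark's real rules are not shown in this post and may include other reasons.

```python
import re

# Assumed heuristic: one token with no spaces, ending in a 2-4 character extension.
FILENAME_WITH_EXT = re.compile(r"^\S+\.[A-Za-z0-9]{2,4}$")


def title_problem(title):
    """Return a reason string if the title looks like a raw filename, else None."""
    if FILENAME_WITH_EXT.match(title.strip()):
        return "filename w/ ext"
    return None


for title in ["Proposal.doc", "%adopters.xls", "2025 Tax Summary"]:
    print(title, "->", title_problem(title))
```

The cheap check finds the candidates; the expensive LLM step is then only needed to propose better titles for the documents that were flagged.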
ark also uses a controlled vocabulary for its metadata like series and sub-series. When something is ingested it may get a series that falls outside that controlled vocabulary. ark quality vocab can help to identify such items:
ark quality vocab
=== ark quality vocab — Controlled Vocabulary Check ===
Vocabulary: 15 series configured
Violations: 4 unknown series value(s), 10 unknown sub-series value(s)

Unknown series (4):
  Series         Docs
  ─────────────  ─────
  health         9392
  shopping       5813
  diary          624
  environmental  43

Unknown sub-series (10):
  Series          Sub-series              Docs
  ──────────────  ──────────────────────  ─────
  media           watch                   26149
  shopping        digital                 2642   (unknown series)
  shopping        audible                 2049   (unknown series)
  shopping        retail                  1122   (unknown series)
  media           reviews                 83
  correspondence  amazon                  22
  legal           identity                11
  legal           power-of-attorney       1
  legal           will                    1
  professional    systems administration  1

Sample documents with unknown series (up to 10):
  ID      Series         Sub-series  Title
  ──────  ─────────────  ──────────  ──────────────────────────────────────
  728106  environmental  —           Weather: Arlington, VA on May 30, 2026
  728102  health         —           Fitbit: May 30, 2026
  727759  environmental  —           Weather: Arlington, VA on May 29, 2026
  727758  health         —           Health: May 30, 2026
  727751  health         —           Fitbit: May 29, 2026
  727457  environmental  —           Weather: Arlington, VA on May 28, 2026
  727456  health         —           Health: May 29, 2026
  727444  health         —           Fitbit: May 28, 2026
  727087  environmental  —           Weather: Arlington, VA on May 27, 2026
  727086  health         —           Health: May 28, 2026

Fix: ark set document <id> --series <valid-series>
Or:  ark normalize-series (if renaming a whole class)
Once identified, ark provides commands for fixing the items, either one document at a time or across an entire series.
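At its core, a controlled-vocabulary check is a set-membership test followed by a count. The Python sketch below shows that shape. The four-entry vocabulary is only a subset inferred from the output above (the real configuration has 15 series), and the function name and sample documents are hypothetical.

```python
from collections import Counter

# Partial, inferred vocabulary: series that the report does not flag as unknown.
VOCABULARY = {"media", "correspondence", "legal", "professional"}


def unknown_series(docs, vocabulary):
    """Count documents per series value that falls outside the vocabulary."""
    counts = Counter(doc["series"] for doc in docs if doc["series"] not in vocabulary)
    return counts.most_common()  # largest offenders first


docs = [
    {"id": 1, "series": "health"},
    {"id": 2, "series": "media"},
    {"id": 3, "series": "health"},
    {"id": 4, "series": "shopping"},
]
print(unknown_series(docs, VOCABULARY))  # [('health', 2), ('shopping', 1)]
```

Sorting by count puts the biggest clean-up wins at the top, which is also how the report above is ordered.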
All of this helps to raise the quality of the archive, making searching more accurate.
Scene 7 – Closing the Day
We’ll conclude the tour with one way of looking at the work I got done yesterday. This is a good example because it shows how ark was really designed with core Unix principles, each command doing something the output of which could be used as input for other commands.
Let’s begin by looking at what the archive ingested from yesterday. The easy way to do this is as follows (I’m limiting this to just 10 entries for space):
ark recent --days 1 --limit 10
[1] 728102 health_day 2026-05-31 2026-05-30 Fitbit: May 30, 2026
[2] 728101 diary_entry 2026-05-31 2026-05-30 Evening Notes
[3] 728100 git_commit 2026-05-31 2026-05-30 feat(cli): auto-follow supersedes chain to head on reads…
[4] 728099 git_commit 2026-05-31 2026-05-30 feat(search): identify multi-version docs via --has-vers…
[5] 728098 git_commit 2026-05-31 2026-05-30 feat(vim): :BlogShot — SnagIt → Vim → WordPress screensh…
[6] 728097 git_commit 2026-05-31 2026-05-30 fix(reading): ark reading finish <id|@N> stored raw CLI …
[7] 728096 git_commit 2026-05-31 2026-05-30 fix(blog): standalone images publish as wp:image blocks;…
[8] 728095 outbox_draft 2026-05-31 2026-05-31 Attempts ✎1
[9] 728094 playlist 2026-05-31 2026-05-31 Recently Added (v5)
[10] 728089 music_play 2026-05-31 2026-05-28 On Your Shore — Enya
# 10 results
But if we wanted to know the total number of things ingested from yesterday we can do this:
ark recent --days 1 --limit 200 | wc -l
# 81 results
82
Next, we can use the ark bundle command to generate a “bundle” of one or more documents from ark. This one bundle would contain the contents of all of the documents we put into it. So for instance, if we wanted to generate a bundle with the full contents of everything that came into ark yesterday we could do this:
ark recent --days 1 --limit 200 | ark bundle
which would produce output containing the contents of all 81 items ingested yesterday — a total of 799 lines so I won’t show them all here.
Finally, we can pipe that bundle to an ark task. A task is a prompt for an LLM to do something with the data. We’ll use my “shipped-yesterday” task, which summarizes everything I “shipped” (code commits, posts written, emails sent, etc.). Putting it all together, it looks like this:
ark recent --days 1 --limit 200 | ark bundle | ark task shipped-yesterday
# 81 results
ark: warning: Redacted 1 sensitive pattern(s) before sending.
Running task: 'shipped-yesterday' (standard)
## May 30, 2026 — Daily Output Report
### Shipped
Five commits to the **ark** repo:
- feat(cli): auto-follow supersedes chain to head on reads (#94 Phase 1); bump to 1.68.0 (728100)
- feat(search): identify multi-version docs via --has-versions + (vN of M) annotation (#95); bump to 1.69.0 (728099)
- feat(vim): :BlogShot — SnagIt → Vim → WordPress screenshot workflow; bump to 1.69.1 (728098)
- fix(reading): ark reading finish stored raw CLI arg as event title; bump to 1.69.2 (728097)
- fix(blog): standalone images publish as wp:image blocks; caption via title (#96); bump to 1.69.3 (728096)
### Written
**Diary** — "Evening Notes" (May 30) covers a pleasant day at home, mentions writing an internal company blog post on AI use, notes a traffic spike (2,300+ views) on his public blog, and reflects on starting Montaigne's *Essays* and a desire to re-read *The Stand*.
**Self-email** — "Tomorrow's Plan — Sunday, May 31, 2026" sent from j**.**@icloud.com to himself; a detailed day-plan covering the DCA→BOS flight, the dare-to-dream volunteer event, open action items, and upcoming occasions.
### In Flight
**"Attempts"** (outbox draft, WordPress destination) — a personal essay connecting the etymology of "essay" to the author's own writing history, from junior high through college, prompted by a morning walk and an archive search. The draft is complete in the outbox but not yet published.
The first thing to note is the warning that ark gives on that second line after the command:
ark: warning: Redacted 1 sensitive pattern(s) before sending
This is an example of what I was talking about earlier. Sensitive information never leaves the local machine; it is either blocked, or, as in this case, redacted before sending it to an LLM.
The rest summarizes the work I did yesterday, based on how I specified the way I wanted it summarized in the “shipped-yesterday” task.
Seven scenes, but really seven angles on one habit. Most days I use four or five of these without thinking about it: the archive surfaces the day, I notice what wants attention, I look for something I half-remember, I add a note to something I just read, I check what shipped. None of those scenes are unique to ark taken in isolation — search engines search, task managers triage, note apps annotate. What’s unusual is that they sit on top of the same archive, share the same discipline, and the parts compound. That’s the payoff Part 1 promised the archivist’s lens would deliver.
The next post takes the floor out from under all of this and shows what holds it up: ark’s data model. A single SQLite database, a clean graph connecting documents to people to one another, and a schema disciplined enough that it should still answer the same questions thirty years from now. Plumbing always sounds duller than the demo, but it’s where this stops being a pile of clever commands and starts being an archive built to last.
1. Eventually, `ark` will be able to take some of these actions for me!
2. An embedding is a way of converting text (or other data) into a list of numbers, a vector, that captures its meaning. The key property is that things with similar meanings end up with similar numbers, so you can do math on meaning. When you ask an LLM a question, your question is turned into an embedding and that math is applied to see how “similar” it is to other embeddings.
3. Yes, `ark` has a history of everything I’ve listened to (via Apple Music) and watched (via YouTube, Amazon, Netflix and a variety of sources). More on this in the next post in the series.
4. They can be versioned and superseded, but that is a topic for a future post.
5. MacVim in my case.