Working Notes: a commonplace notebook for recording & exploring ideas.

2024-05-19

An extra unexpected week in San Francisco. I accidentally walked past the tail end of Bay-to-Breakers on the way to getting coffee (right before writing this week's letter) and was thoroughly confused and amused. Seeing several costumed people running past made up for missing out on the Dance Parade in NYC on Saturday.

TermDex

I've been using this a lot and see a lot of potential, but haven't quite found the time/energy to actually implement the features I'm really looking forward to. Having extremely simple query functionality implemented with a mix of bash scripts and fzf has taken me surprisingly far. As much as I appreciate POSIX, I guess I never really internalized how minimal an API it exposes for connecting different programs -- and I'm still surprised by just how much gets accomplished through that minimal API.

I'm also configuring nvim to be a good markdown editor, and building a surprisingly pleasant/effective editing experience where I can navigate through text/ideas/notes quickly. This finally feels like a flexible enough alternative that lets me keep flat files, easily swap between tools, and still get all the benefits of Luhmann's methods & Notion & Index Cards and all the other tools I've used to try and keep my head in order.

With LLMs being able to easily consume text, fancier CLIs where things just work should be much more common than they seem to be today; that's something I expect to start noodling around with soon.

Hierarchies, nested notes

Something I haven't figured out the ergonomics for is easily keeping a hierarchy of notes: the best cheap alternative I have is a custom sheet where I mix in indentation by converting the first several columns into thin indents. I can then show a nested hierarchy by simply starting a row's text in a different column, relying on how the UI overflows cell text, and I can still easily move rows around.

I'd like to actually build this out as a UI for easy modification and managing relationships between notes.
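As a starting point for that, the indented-sheet layout can be read back into a tree. This is only a minimal Python sketch under stated assumptions: rows arrive as lists of cell strings (for example from a CSV export), a note's depth is the number of leading empty cells, and the names `Note`, `parse_rows` and `render` are hypothetical, not part of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """One note in the hierarchy (hypothetical structure for illustration)."""
    text: str
    children: list = field(default_factory=list)


def parse_rows(rows):
    """Build a tree from sheet rows, where depth = number of leading empty cells."""
    root = Note("root")
    # Stack holds (depth, note) for the path from the root to the last note seen.
    stack = [(-1, root)]
    for row in rows:
        depth = next((i for i, cell in enumerate(row) if cell.strip()), None)
        if depth is None:
            continue  # skip completely blank rows
        note = Note(row[depth].strip())
        # Pop back up until the top of the stack is shallower than this row.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1].children.append(note)
        stack.append((depth, note))
    return root


def render(note, depth=0):
    """Return the tree below `note` as indented lines (two spaces per level)."""
    lines = []
    for child in note.children:
        lines.append("  " * depth + child.text)
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    sheet = [
        ["Projects", "", ""],
        ["", "TermDex", ""],
        ["", "", "fzf queries"],
        ["Reading", "", ""],
    ]
    print("\n".join(render(parse_rows(sheet))))
```

Moving a row in the sheet then just changes which parent it attaches to on the next parse, which keeps the spreadsheet as the source of truth.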

Information Theory / Chainsaw / TF-IDF

I was feeling a little bit lost while playing with TF-IDF: I realized I was getting extremely broken results because of bugs in my implementation, but was still getting something that seemed valuable, so I wanted to start levelling up in math a little bit. Based on an answer from ChatGPT-4o, I've started reading Elements of Information Theory and am generally enjoying the book.

Even after just revisiting the minimal definitions of entropy (sum of -p · log p) and of relative entropy, the KL divergence between two distributions (sum of p · log(p / q)), which is the quantity mutual information is built on, I think I can take another stab at finding outlier logs by looking for the logfiles whose distribution differs most from the distribution over the norm. I don't need real outliers, I just need the ones with the most distance.
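A minimal Python sketch of the idea above: build a token distribution per logfile, compare it against the pooled distribution over all logs, and rank by distance. The distance used here is relative entropy (KL divergence), sum of p · log(p / q), which is an assumption about the measure intended. The add-alpha smoothing, the function names and the sample logs are all hypothetical.

```python
import math
from collections import Counter


def distribution(tokens, vocab, alpha=1.0):
    """Smoothed probability for every token in vocab (add-alpha smoothing is an assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {t: (counts[t] + alpha) / total for t in vocab}


def entropy(p):
    """H(p) = -sum p * log2(p), in bits."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def kl_divergence(p, q):
    """D(p || q) = sum p * log2(p / q); requires q > 0 wherever p > 0."""
    return sum(pv * math.log2(pv / q[t]) for t, pv in p.items() if pv > 0)


def rank_by_divergence(logs):
    """logs maps name -> token list; returns (name, score) pairs, most divergent first."""
    pooled = [tok for toks in logs.values() for tok in toks]
    vocab = sorted(set(pooled))
    norm = distribution(pooled, vocab)  # stands in for "the distribution over the norm"
    scores = {
        name: kl_divergence(distribution(toks, vocab), norm)
        for name, toks in logs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    sample = {
        "host-a.log": "start ok ok ok stop".split(),
        "host-b.log": "start ok ok ok stop".split(),
        "host-c.log": "start error traceback error stop".split(),
    }
    for name, score in rank_by_divergence(sample):
        print(f"{name}\t{score:.3f}")
```

Smoothing keeps the divergence finite when a logfile contains tokens that are rare in the pooled distribution. Since only the ranking matters here, the absolute scores do not need to be calibrated.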

Tokenization is of course critical: I've worked around it for now by simply normalizing a lot of strings: smashing numbers down to a single 0, smashing punctuation into a single _, etc. This obviously removes a lot of nuance from the logs, but it does let me find logs with different stack traces much faster. I think I'll take the information-theoretic approach to find logs that I should look at, and then use cosine similarity & clustering (something else I spent time learning) to figure out batches of logs and hosts to work with.
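The normalization and similarity steps above might look roughly like this in Python. It is a sketch, not the actual implementation: the regular expressions, the whitespace tokenization and the function names are assumptions chosen to match the description (digit runs become a single 0, punctuation runs become a single _).

```python
import math
import re
from collections import Counter

NUMBER = re.compile(r"\d+")      # any run of digits
PUNCT = re.compile(r"[^\w\s]+")  # any run of characters that are not word chars or whitespace


def normalize(line):
    """Collapse digit runs to '0' and punctuation runs to '_', then split on whitespace."""
    line = NUMBER.sub("0", line)
    line = PUNCT.sub("_", line)
    return line.split()


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an empty log as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    first = Counter(normalize("2024-05-19 12:01:33 ERROR worker[4123]: timeout after 30s"))
    second = Counter(normalize("2024-05-20 01:17:02 ERROR worker[88]: timeout after 5s"))
    third = Counter(normalize("2024-05-20 01:17:05 INFO cache warmed in 12ms"))
    print(cosine_similarity(first, second))
    print(cosine_similarity(first, third))
```

After normalization the two timeout lines produce identical tokens, so they would end up in the same batch, while the third line scores lower.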

Go

Writing Go continues to be pleasant, though I definitely miss the rich Python ecosystem (and have now started wondering if I could just reuse Python libraries from Go) -- there are so many good libraries apparently made for data mining.

Speculation

As a reminder: none of the notes here represent the views of my employer, nor do they include confidential information (not that I expect anyone to ever read this site either). With that out of the way, I've tried to think through what happens if I extrapolate what transformers can accomplish into the future, without needing them to make a tremendous jump in capability (sitting in a cafe in SF seems like the right time to explore these ideas):

- Can my phone simply be a text/voice interface that reconfigures itself according to what I ask of it? Do we need servers to have any APIs any more -- or if we give a Transformer a structured tool for reading data, and ways to be efficient, can it simply have the behaviors we expect?
  - To make this a bit more concrete: instead of thinking through text/email/phone or other mechanisms, can I contact someone through an AI?
- Do the AIs become large behemoths served remotely, with access to everything, or do we end up in a world with a lot of small AIs at an individual level coordinating with each other? What protocols do they use when they talk to each other?
- How much of media, art, and knowledge can be customized to the consumer instead of the producer? In the near future, I'm just thinking of smarter games and roguelikes that are much richer because all the NPCs remember context and can react to your behaviour much more explicitly. I have to imagine the creators of Dwarf Fortress are already thinking about the consequences of this.

Most of these seem feasible with current technology, given a lot of engineering to make things cheaper to deploy, faster, and better integrated -- I'd expect a lot of these kinds of applications to pop up within a decade if not sooner (a decade seems extremely conservative if I'm honest).

— Kunal