Working Notes: a commonplace notebook for recording & exploring ideas.

Birthday Hacking

Retrospective

This was fun, and I'd like to do these much more regularly. That said, I need to balance concrete programming projects much more carefully against the more abstract work of improving my fundamentals: after a few days of simply studying, my motivation dropped drastically. I need to regularly build working, useful programs to feel excited about what I'm doing.

Taking a step away from work was also extremely powerful: it made me realize that I've been letting myself get distracted by a lot of noise from day-to-day changes over the past several months. There are several interesting directions I'd like to push in, and deeper projects to build. In general I feel much more excited about work, modulo the organizational overhead and general friction.

2025-08-03

(Afternoon) After a somewhat concentrated sprint on GPU programming in the first two days, I diverted myself to look into HRM and read more about it. The paper introduced a lot of interesting ideas, and I really liked how the authors talk about the different things they considered and the techniques they applied to make use of them.

I was able to run the code they shared with a few minor tweaks, but I still have a lot of open questions about how things work, and about actually logging all the interesting internal values they log throughout the code.

After getting a bit bored of only working on toy, abstract projects, I was also happy to hack a little bit on the notebook site generation; there's a lot more I'd like to try here though.

2025-07-31

(Afternoon) I've made it partway to Chapter 5 at the moment: I'm reasonably confident I'm retaining whatever I've read so far, and have code snippets to prove it. This is a very satisfying break and I'll need to do more of these.

The one part I'm a little worried about is speed: ideally I'd have liked to cover a lot more ground, but I'm still basically finding my feet. As long as I can implement flash attention in CUDA by Sunday I'll be somewhat satisfied, though I hope I can pull off much more than that.

Having my personal laptop with a small NVIDIA GPU -- with the latest CUDA installed -- accessible over Tailscale from my much more portable laptop has been amazing. I suspect I'd benefit from having a larger workstation sitting at my desk for normal work, but that's for another time.

The one idea that keeps coming back to me is that effective CUDA programming seems to mostly come down to effective constraint solving. I'll work through this book, but I keep thinking that all of this should be automated -- which is tentatively the insight behind PyTorch and several other frameworks too. I just want them to be significantly more aggressive in how much of the model they can optimize.
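A minimal sketch of that constraint-solving framing, not taken from the book or from any framework: it brute-forces tile sizes for a hypothetical tiled matrix multiply under two resource limits and picks the tile with the most arithmetic per value loaded. The limits (1024 threads per block, 48 KiB of shared memory), the candidate sizes, the one-thread-per-output-element layout, and all function names are assumptions made for illustration.

```python
from itertools import product

# Illustrative limits; real values depend on the GPU and kernel configuration.
MAX_THREADS_PER_BLOCK = 1024
SHARED_MEM_BYTES = 48 * 1024
BYTES_PER_FLOAT = 4


def feasible_tiles(candidates=(8, 16, 32, 64)):
    """Yield (tile_m, tile_n, tile_k) triples that satisfy both limits.

    Assumes one thread per output element and that one A tile
    (tile_m x tile_k) and one B tile (tile_k x tile_n) live in shared memory.
    """
    for tm, tn, tk in product(candidates, repeat=3):
        threads = tm * tn
        smem = (tm * tk + tk * tn) * BYTES_PER_FLOAT
        if threads <= MAX_THREADS_PER_BLOCK and smem <= SHARED_MEM_BYTES:
            yield tm, tn, tk


def arithmetic_intensity(tm, tn, tk):
    """FLOPs performed per float loaded into shared memory for one tile step."""
    flops = 2 * tm * tn * tk          # one multiply and one add per inner product term
    floats_loaded = tm * tk + tk * tn
    return flops / floats_loaded


def best_tile():
    """Pick the feasible tile with the highest arithmetic intensity."""
    return max(feasible_tiles(), key=lambda t: arithmetic_intensity(*t))


if __name__ == "__main__":
    tm, tn, tk = best_tile()
    print(f"tile_m={tm} tile_n={tn} tile_k={tk}")
```

Even this toy version shows the shape of the problem: the objective pushes toward large square tiles while the thread limit caps them, and a real search would add registers, occupancy, and memory coalescing as further constraints.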

I also find my resolve to implement my personal programming language reinforced, so I can skip all the boilerplate and have quick abstractions ready to go.

2025-07-30

(Morning) As a way to celebrate my birthday, I'm taking a few days off to learn and program, with an eye to spending most of the time playing with CUDA, building some LLMs for training, and generally exploring things I've been curious about. I'd like to work through as many of the GPU Mode lectures as possible, complementing the book Programming Massively Parallel Processors.

Keeping this log as a way to reflect on how much progress I make, what worked well, and what didn't: I suspect I'll be doing many more of these retreats to satisfy my curiosity and indulge my craft.

Ideally, by the end of these 5 days of holidays I'll be significantly more comfortable with flash attention and the consequences of different architectures on hardware, and able to debug CUDA performance issues much more fluidly. Tentatively, I'll also come away with a massive backlog of more projects I'd like to build, a lot of notes, and some more essays and posts filled out.

— Kunal