Working Notes: a commonplace notebook for recording & exploring ideas.
Home. Site Map. Subscribe. More at expLog.
Realizing I completely forget about this week's edition, mostly becasue I didn't really set aside time for exploration this week. I'm making up for it today though. I also ended up posting on Threads (Threading?) about some of the interesting things I learned this week.
I've been using a small bpftrace program to trace builds and processes recently (based off an HN article that talked about this) -- but using bpf instead of ptrace to hook into system fork, exec and exit calls. And then post processing the text output into perfetto traces.
In retrospect, I should probably rewrite it in bcc, or figure out the right way to trigger systing (and generally publish much simpler output from the kernel); till that point though I ran into issues printing out arguments received by the commands because join(args->argv)
would lead to mysterious newlines and interleaved output. I also realized that multiple printfs in a bpftrace program will interleave, making it almost impossible to parse the data.
The final solution I came up with (in collaboration with ChatGPT of course) was to use buf
and %r
to print hex dumps of the arguments instead; and explicitly splitting the strings with a \0
with printf. In Python I can then easily post process it and read the value back.
.pth
files are these magical, somewhat dangerous files that you can drop into /site-packages
in your Python environment of choice (venv, conda, local install, etc.) and will be considered on every single import.
The rules for .pth
files are simple:
sys.path
if they existimport
followed by whitespaceThese lines are exectude on every single Python startup, and obviously end up being extremely powerful (and easy to abuse). This is also how editable
installs work -- they make a .pth
entry pointing to the actual source code -- or symlinks to the actual source code mimicing the expected structure and drop it into the nev.
I've used these in the past to set up conditional patching: when importing foo
, then apply my custom functions to monkey patch certain parts of code -- and only do it if foo
is actually imported. This time around I wanted to make a wheel that would install a .pth
file and spent quite a bit of time with both Claude and ChatGPT trying to figure this out; both of which took me down rabbit holes that vaguely mapped to answers in Stack Overflow and significantly more industrial strength solutions on Reddit.
After trying to fight setup tools I ended up reading the wheel spec and realizing that this could be a lot easier if I just made my own wheel without relying on heavy setuptools
machinery -- I just wanted a 2 line file copied into the wheel to point to software I would then distribute independently of the virtual environment. And it worked, in less than 50 lines of code! The wheel format is very minimal and easy to work with, which helped a lot.
[Aside: Python hunter looks pretty fascinating an I need to spend some time playing with it]
I've been thinking more deeply about how I'd actually like to be spending my time these days: experimenting with LLMs, building ways to apply them in my daily life, particularly being able to customize them and figure out the dynamics of both training and inference -- including the underlying technology, building up my fundamentals significantly to build much more intuition around the architectural choices, the behavior of non-linearities and other systems. Generally writing about all of this and explaining it to my satisfaction, with working prototypes. And generally building up my programming language (sic) to do all the exploration and code generation.
To that end, I'll be putting together a new section on this site with an index of things I'd like to learn, and then both links to resources and explanations and prototypes in my own words. I expect this to be maddening, overwhelming and generally deeply satisfying.
Of all the ways I've chosen to write, newsletters somehow seem the easiest way to just get started, right after one-liners on Threads (or Twitter, before). Writing just a small snippet on Python isn't particularly satisfying, but writing it as part of a weekly update seems to stick; particularly because it feels much more ephemeral and bounded.
So I'll continue writing these as I explore more formal structures to think and learn.
— Kunal