Working Notes: a commonplace notebook for recording & exploring ideas.

2024-04-28

This was a long week with a lot of overlapping oncalls; I'll be glad to take a break sometime next week.

At the same time, I was able to learn some new things.

Stanford Lecture

The lecture on MoE this week was fascinating and well delivered, and it cleared up a bunch of misconceptions I had about what MoE means and how these models function. Some of the talks' slides have been uploaded at the CS25 website, but there are still several to go.

Things I remember

Experts are not trained on specific topics
As an experiment on Mixtral, someone zeroed out each expert one at a time. Expert 3 had the most effect, and that hasn't been explained yet.
As a code convention, he suffixed all variables with their dimensions, a practice I'd like to adopt as well.
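
A minimal sketch of what the dimension-suffix convention might look like, in plain Python so it runs without any libraries. The suffix letters (B, D, E) and the helper names `matmul` and `shape` are my own assumptions for illustration; the notes above don't record which letters the lecturer actually used.

```python
# Suffix legend (an assumed convention, not necessarily the lecturer's):
#   B = batch size, D = input features, E = output features


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def shape(m):
    """Return (rows, columns) of a non-empty list-of-rows matrix."""
    return (len(m), len(m[0]))


x_BD = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]   # B=2, D=3
w_DE = [[1.0, 0.0],
        [0.0, 1.0],
        [1.0, 1.0]]        # D=3, E=2

# The inner suffixes match (D with D), so the result is named B x E.
y_BE = matmul(x_BD, w_DE)
```

The appeal is that a shape mismatch tends to show up in the names before it shows up at runtime: multiplying `x_BD` by something called `w_ED` already reads wrong.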

Transformers

Talked to an old friend after a very long time: he's clearly been doing much more advanced work than I have, and he pointed me to several interesting ideas to explore:

HogWild!
Applying manifolds to think about models
V-JEPA

I have a lot of math and infrastructure to learn. I'm thinking of playing with a simple transformer and seeing if I can get it to encode/decode some patterns like look & say, and if I can use that to build some intuition about QKV. It should be an interesting exercise.
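
As a starting point for that exercise, here is a rough sketch of generating look & say terms and pairing each term with its successor, which is one possible way to frame the data for a small sequence model. The function names `look_and_say` and `training_pairs` are hypothetical, and the (input, target) framing is an assumption rather than a settled design.

```python
from itertools import groupby


def look_and_say(term: str) -> str:
    """Describe each run of identical digits as '<run length><digit>'."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))


def training_pairs(seed: str = "1", count: int = 5):
    """Build (input, target) pairs where target is the next look & say term."""
    pairs = []
    term = seed
    for _ in range(count):
        following = look_and_say(term)
        pairs.append((term, following))
        term = following
    return pairs


for source, target in training_pairs("1", 5):
    print(f"{source} -> {target}")
```

Each output digit depends on a run of identical neighbours in the input, so the pattern may be a reasonable probe for what attention learns to look at, though that is a guess until it is actually tried.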

Go

I finally wrote a small program in Go, and so far have been finding the language surprisingly ergonomic and friendly, particularly with goroutines. I'm planning to build several log parsing tools with Go (and possibly TCell).

I'll need to find a good modern book on Go before I shoot myself in the foot with assumptions about the behavior of the language though.

ZooKeeper

I also spent a lot of time learning ZooKeeper semantics: the original paper was excellent and finally made things click. I could solve the problems I wanted by simply relying on watches, which Kazoo makes even easier.
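
For reference, a small sketch of the watch pattern using Kazoo's `DataWatch` decorator. The host address, the znode path `/demo/config`, and the helper names `describe_change` and `watch_config` are placeholders I made up for illustration, not the setup used for the problems above. The Kazoo import is kept inside `watch_config` so the formatting helper can be exercised without a ZooKeeper server or Kazoo installed.

```python
def describe_change(data, stat):
    """Turn a DataWatch callback's arguments into a log-friendly string."""
    if stat is None:
        # Kazoo passes None for both arguments when the znode does not exist.
        return "znode missing"
    text = data.decode("utf-8") if data else ""
    return f"version {stat.version}: {text!r}"


def watch_config(hosts="127.0.0.1:2181", path="/demo/config"):
    """Connect, register a data watch on `path`, and return the client."""
    # Imported here so describe_change() works without kazoo installed.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=hosts)
    zk.start()
    zk.ensure_path(path)

    @zk.DataWatch(path)
    def on_change(data, stat):
        # Called once on registration and again whenever the znode changes.
        print(describe_change(data, stat))
        # Returning False would stop the watch; returning None keeps it active.

    return zk
```

The caller is expected to hold on to the returned client and call `zk.stop()` when finished. Kazoo re-registers the underlying one-shot ZooKeeper watch after each event, which is most of what makes this easier than managing watches by hand.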

Dijkstra's notes

I partially read some notes that floated by on Hacker News; this entry is a reminder to go back and read the rest.

— Kunal