Working Notes: a commonplace notebook for recording & exploring ideas.

2024-04-21

LLaMa3

I've been helping out with infrastructure and tools for training LLaMa3 at Meta. I'm very happy to have been able to help: having something of LLaMa's quality easily available for hacking is one of the things that will shape how LLMs are generally applied and used, and contributing to that is very satisfying. I'm even in the model card, along with some very well known people.

At the same time, I was able to use ellama, ollama and LLaMa3 8B to run my very own local LLM -- which has been fairly helpful. Ellama's prompts for reviewing highlighted code, implementing functions, etc. are exactly what I'd dreamt of a long time ago and hadn't expected to come true so soon. The UX is still a bit rough and generating tokens on my laptop CPU is slow, but I expect that to improve constantly and inexorably, the way things have been going.

I'm now thinking about finetuning / distilling a LLaMa model down to something that can translate CLI commands on my behalf; e.g. "extract this tarfile". I think it should be very doable -- and maybe a good excuse to learn torchtune -- but I need more time and energy.
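Even before any finetuning, few-shot prompting the local model gets part of the way there. A minimal sketch, assuming the ollama Python client and an already-pulled llama3:8b model; the prompt and the to_command helper are hypothetical, purely for illustration:

    # Sketch: ask a local LLaMa3 8B (served by ollama) to turn a
    # natural-language request into a shell command. Assumes
    # `pip install ollama` and `ollama pull llama3:8b` have been run.
    import ollama

    PROMPT = (
        "Translate the request into a single POSIX shell command. "
        "Reply with only the command, no explanation.\n\n"
        "Request: {request}\n"
        "Command:"
    )

    def to_command(request: str) -> str:
        response = ollama.chat(
            model="llama3:8b",
            messages=[{"role": "user", "content": PROMPT.format(request=request)}],
        )
        return response["message"]["content"].strip()

    print(to_command("extract this tarfile"))
    # hopefully something like: tar -xf archive.tar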

Python & Emacs

As part of consolidating my .emacs I've been cleaning up my Python setup as well. I rebuilt and moved to the latest commit on Emacs's master branch -- the fact that I can smoothly run Emacs from master always amazes me -- and set up Jedi and Ruff (via LSP), while relying on some existing snippets for devdocs.io and Ellama integration.

All of this means I get some very cool autocompletion and almost instant error and syntax checking, with minimal setup or dependence on the repository I'm editing.

I still have some trouble with Auto-Complete mode and Company mode both turning on and both trying to complete what I'm typing; I'll dig in some more and then start publishing my configurations.

Penzai

JAX released some very interesting tools, including a visualization tool that is almost exactly what I was hoping to see for PyTorch. It also makes models much easier to explain -- though I'd probably go with a little more whitespace in the UI if I were designing it -- and seems pretty powerful.
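For reference, the setup is pleasantly small. A minimal sketch of turning on the treescope pretty-printer in a notebook, going off the Penzai README at release -- the function names are my best reading of it and may have drifted since:

    # Penzai/treescope in an IPython notebook: register treescope as the
    # default renderer and enable inline array visualizations.
    import jax.numpy as jnp
    from penzai import pz

    pz.ts.register_as_default()
    pz.ts.register_autovisualize_arrays()

    x = jnp.arange(64, dtype=jnp.float32).reshape(8, 8)
    x  # now renders as an interactive, collapsible view with array facets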

I need to find the time to hack on this and actually build an interactive UI or CLI around it. And a top- or below-style interface to TensorBoard.

Wax, and languages

Continuing the theme of looking for lisp-like homoiconic languages that compile down to C, I ran into some reddit posts and links -- particularly this list of lisp-like languages. There are several interesting ideas in there, but some day I'd like to implement my own, potentially working backwards from the C grammar to make sure everything can be expressed easily and cleanly, and then layering syntactic sugar on top of that.
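To make that concrete, here's a toy sketch (in Python, only to illustrate the shape of the idea) of the s-expression-over-C flavor I have in mind -- a hypothetical translator from s-expressions to C expressions, not any of the linked languages:

    # Toy, hypothetical sketch: lower a tiny s-expression tree into a C
    # expression string; operators become infix, everything else a call.
    def to_c(expr):
        if not isinstance(expr, list):
            return str(expr)  # atoms (numbers, identifiers, literals) pass through
        head, *args = expr
        if head in {"+", "-", "*", "/"}:
            return "(" + f" {head} ".join(to_c(a) for a in args) + ")"
        return f"{head}({', '.join(to_c(a) for a in args)})"

    print(to_c(["printf", '"%d\\n"', ["+", 1, ["*", 2, 3]]]))
    # -> printf("%d\n", (1 + (2 * 3)))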

As mechanisms for procrastination go, inventing the language to program in before actually getting around to programming seems unfortunately too far up my alley. I'll save this particular project for some slow months.

Stanford Lectures on Transformers

More rough notes from the lectures.

Nathan Lambert, Allen Institute for AI

History

Kunal