<rss version="2.0">
  <channel>
    <title>omlx on Honeypot.net</title>
    <link>https://honeypot.net/categories/omlx/</link>
    <description></description>
    
    <language>en</language>
    
    <lastBuildDate>Sat, 02 May 2026 08:53:32 -0700</lastBuildDate>
    
    <item>
      <title></title>
      <link>https://honeypot.net/2026/05/02/ive-been-running-ollama-on.html</link>
      <pubDate>Sat, 02 May 2026 08:53:32 -0700</pubDate>
      
      <guid>http://kirk.micro.blog/2026/05/02/ive-been-running-ollama-on.html</guid>
      <description>&lt;p&gt;I&amp;rsquo;ve been running Ollama on my Mac Studio for local AI experiments. I followed advice to try oMLX instead, and it&amp;rsquo;s ludicrously faster, maybe 5-10x for both time to first token and time to complete the response. I haven&amp;rsquo;t benchmarked it, but it subjectively feels like when I replaced a hard drive with an SSD.&lt;/p&gt;</description>
    </item>
    
  </channel>
</rss>