srigi · 2026-05-19T20:20:17 1779222017

Not yet, but soon… Bun

srigi · 2026-05-05T20:50:21 1778014221

You could keep the multimodal projector (which handles understanding of audio, images & PDFs) in system RAM with `--no-mmproj-offload` in llama.cpp. Of course, it is then not GPU-accelerated, but you save its VRAM.

msp26 · 2026-05-06T08:13:57 1778055237

Interesting, I might try that, thanks!

srigi · 2026-04-23T16:41:44 1776962504

That is why we have discussions like these: https://x.com/i/status/2039099810943304073

tadfisher · 2026-04-23T17:26:48 1776965208

X is the worst place to hold community discussions.

srigi · 2026-03-16T15:54:49 1773676489

The official SW on macOS is somewhat usable if:

- you disable all communication with a firewall (so it doesn't autoupdate)

- `sudo pkill -9 LogiPluginService && sudo rm -rf /Applications/Utilities/LogiPluginService.app` (so it doesn't eat resources by running a useless service in the bg)

srigi · 2026-01-12T16:37:53 1768235873

"I'm sorry, my knowledge cutoff is 1875"

srigi · 2025-08-05T22:38:42 1754433522

Start with the YT series on neural nets and LLMs from 3Blue1Brown.

srigi · on May 28, 2025

The funniest thing is how synchronicity worked its magic:

Roo Code's experimental code indexing using a vector DB dropped 3 days ago. They're using Tree-sitter (the same as Aider) to parse sources into ASTs and do vector embedding on that output, instead of on plaintext.

https://news.ycombinator.com/item?id=44117455

srigi · on May 23, 2025

Yeah, I remember that option - basic tier of DTU DB with 250GB of storage - free for one year, then continue for $15/m.

When the client brought in a 3rd party expert who advised rewriting for MySQL, I quickly did the math and it came to about $60/m, without a free year.

We continued with DTU MSSQL with Prisma ORM and never regretted it.

srigi · on May 20, 2025

You will not be replaced by AI. You will be replaced by a person using AI!

srigi · on May 5, 2025

Can you add a recent build of llama.cpp (arm64) to the results pool? I'm really interested in comparing mlx to llama.cpp, but setting up mlx seems too difficult for me to do by myself.

SparkyMcUnicorn · on May 5, 2025

I ran them again several times to make sure the results were fair. My previous runs also had a different 30B model loaded in the background that I forgot about.

LM Studio is an easy way to use both mlx and llama.cpp

anemll [0]: ~9.3 tok/sec

mlx [1]: ~50 tok/sec

gguf (llama.cpp b5219) [2]: ~41 tok/sec

[0] https://huggingface.co/anemll/anemll-DeepSeekR1-8B-ctx1024_0...

[1] https://huggingface.co/mlx-community/DeepSeek-R1-Distill-Lla...

[2] (8bit) https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-...
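
For a rough sense of the gaps, the figures above can be turned into ratios against the llama.cpp run. This is only a quick sketch: the numbers are the approximate ones reported above, and the `relative_speed` helper and dictionary keys are illustrative names, not part of any of these tools.

```python
# Approximate throughput figures quoted above, in tokens per second.
tok_per_sec = {
    "anemll": 9.3,
    "mlx": 50.0,
    "gguf (llama.cpp b5219)": 41.0,
}


def relative_speed(results, baseline):
    """Return each backend's throughput as a multiple of the baseline's."""
    base = results[baseline]
    return {name: rate / base for name, rate in results.items()}


# Hypothetical helper usage: compare everything against the llama.cpp run.
ratios = relative_speed(tok_per_sec, "gguf (llama.cpp b5219)")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x llama.cpp")
```

Since the inputs are rounded single-machine measurements, the ratios are only indicative.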

srigi · on May 5, 2025

Thank you very much.