Update: my current AI setup

I updated my AI setup.

Gemma 4 works really well, so I now run it locally for most tasks.

It's not very fast, around 15 t/s (tokens per second), but privacy means a lot to me.

I use llamacpp-cli with a custom caveman system prompt and set the reasoning budget manually, mostly to 500 tokens.
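
To get a feel for what those numbers mean in practice, here is a rough back-of-the-envelope sketch. It assumes a steady decode speed and ignores prompt processing time; the function name and the 200-token answer length are made up for illustration, not measured.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to generate `tokens` at a steady decode speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Numbers from this post: about 15 t/s and a 500-token reasoning budget.
reasoning_budget = 500
speed = 15.0

# Hypothetical answer length, only here to show the total wait.
answer_tokens = 200

thinking = generation_seconds(reasoning_budget, speed)
total = generation_seconds(reasoning_budget + answer_tokens, speed)

print(f"thinking: up to {thinking:.0f} s, thinking + answer: about {total:.0f} s")
```

So a fully used 500-token budget costs roughly half a minute of thinking before the answer starts, which is why keeping the budget small matters at this speed.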

When I'm in a hurry, I occasionally use duck.ai instead.