

Are you looking to switch?
In general, though? Biebian: https://biebian.sourceforge.net/
Or Red Star OS: https://archiveos.org/redstar/
Depends if you are a systemd or openbox person.
Yeah, that’s problematic, heh.
To be fair, I do wish more privacy-friendly browsers took DDG mobile’s approach, namely torch all sites but make it really easy to (and prompt you to) whitelist frequently used ones.
You can permanently disable the chatbot in full DDG search. Click the little gear.
It does make me wonder what API they use. I thought it was huggingface (which would be less bad), but they don’t say it explicitly.
Yeah. But it also messes stuff up from the llama.cpp baseline, and hides or doesn’t support some features/optimizations, and definitely doesn’t support the more efficient iq_k quants of ik_llama.cpp and its specialized MoE offloading.
And that’s not even getting into the various controversies around ollama (like broken GGUFs or indications they’re going closed source in some form).
…It just depends on how much performance you want to squeeze out, and how much time you want to spend on the endeavor. Small LLMs are kinda marginal though, so IMO that squeezing matters if you really want to try; otherwise one is probably better off spending a few bucks on an API that doesn’t log requests.
In case I miss your reply, assuming a 3080 + 64 GB of RAM, you want the IQ4_KSS (or IQ3_KS, to leave more RAM for tabs and stuff) version of this:
https://huggingface.co/ubergarm/GLM-4.5-Air-GGUF
Part of it will run on your GPU, part will live in system RAM, but ik_llama.cpp splits and offloads the quantized weights in a particularly efficient way for these kinds of ‘MoE’ models. Follow the instructions on that page.
If you ‘only’ have 32GB of RAM or less, that’s trickier, and the next question is what kind of speeds you want. But it’s probably best to wait a few days and see how Qwen3 80B looks when it comes out. Or just go with the IQ4_K version of this: https://huggingface.co/ubergarm/Qwen3-30B-A3B-Thinking-2507-GGUF
And you don’t strictly need the hyper optimization of ik_llama.cpp for a small model like Qwen3 30B. Something easier like LM Studio or the llama.cpp Docker image would be fine.
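If it helps, here’s roughly how I’d grab just one quant’s files from either of those repos without clicking through the browser. The "*IQ4_KSS*" glob is an assumption about how the files are named, so check the repo’s file listing first:

```python
# Sketch: download only the IQ4_KSS files from ubergarm's GLM-4.5-Air repo.
# The allow_patterns glob is a guess at the file naming -- check the repo's
# file list and adjust it (e.g. "*IQ3_KS*" for the smaller quant).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="ubergarm/GLM-4.5-Air-GGUF",
    allow_patterns=["*IQ4_KSS*"],
    local_dir="models/GLM-4.5-Air-IQ4_KSS",  # wherever you keep models
)
```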
Alternatively, you could try to squeeze Gemma 27B into that 11GB VRAM, but it would be tight.
How much system RAM, and what kind? DDR5?
ik doesn’t have great documentation, so it’d be a lot easier for me to just point you places, heh.
At risk of getting more technical, ik_llama.cpp has a good built-in webui:
https://github.com/ikawrakow/ik_llama.cpp/
Getting more technical, it’s also way better than ollama. You can run way smarter models than ollama can on the same hardware.
For reference, I’m running GLM-4.5 (667 GB of raw weights) on a single RTX 3090/Ryzen gaming rig, at reading speed, with pretty low quantization distortion.
And if you want a ‘look this up on the internet for me’ assistant (which you need for them to be truly useful), you need another docker project as well.
…That’s just how LLM self hosting is now. It’s simply too hardware-intensive and ad hoc to be easy, smart, and cheap all at once. You can indeed host a small ‘default’ LLM without much tinkering, but it’s going to be pretty dumb, and pretty slow on ollama defaults.
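If you do go the ik_llama.cpp (or plain llama.cpp) server route, scripting against it is easy too. A minimal sketch, assuming the server is on the default port 8080 and exposes the usual OpenAI-style endpoint like mainline llama-server does (adjust the URL to however you launched it):

```python
# Sketch: query a local llama.cpp / ik_llama.cpp server from Python.
# Assumes an OpenAI-compatible /v1/chat/completions endpoint on port 8080,
# which is what mainline llama-server exposes; adjust if your setup differs.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "In one sentence, what is an MoE model?"}
        ],
        "max_tokens": 200,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```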
“Nothing is more powerful than an iPhone in all conditions”
I guess my use case is just totally different.
I have zero interest in phone gaming, as (even if the App Store wasn’t so MTX-ridden) I’d rather do that on PC and just read stuff when on the go. I need performance for two things: web apps (which Android inexplicably seems to win on) and, theoretically, local LLM assistants, and iPhones just don’t have enough RAM to make the latter worth it.
As for disk speed, my use case is transferring media on/off. But there’s no microSD! And even USB transfers are bizarre on Apple.
…So all their performance is totally useless to me, I guess.
Do you like your iPhone?
I just came to iOS from a Razer Phone 2 (S9 generation), but I used jailbroken iPhones forever, until the 6 or so.
… And I don’t like it. All the gestures, everything is unintuitive, and I can hardly change anything! My old jailbroken 5 was more feature-rich, yet somehow simple and snappy too. Every app nickel-and-dimes you, and I feel like I’m struggling to restrict their access to things.
I made the mistake of getting it for older family thinking it’d be easier, and they can hardly even function.
See the 1947 US Army video: “Don’t be a Sucker”:
https://archive.org/details/DontBeaS1947
It’s always been a hypocritical ideal. Even the US Military acknowledged our xenophobic tendencies, the constant struggle against them, and the slow work of doing better. That’s the point.
…But I think the radical shift of the “attention economy” is what makes it feel like the ideal is finally collapsing. The population is sucked into doomscrolling Fox News (for example) at a scale that makes this US Army video feel quaint.
If they published the same thing today, no one would even notice. There’s too much noise. And that is unprecedented.
Oh, and there are other graphics makers that could theoretically work on Linux, like Imagination’s PowerVR, and some Chinese startups. Qualcomm’s already trying to push into laptops with Adreno (which has roots in AMD/ATI, hence ‘Adreno’ being an anagram of ‘Radeon’).
The problem is that making a desktop-sized GPU has a massive capital cost (over $1,000,000,000, maybe even tens of billions these days) just to ‘tape out’ a single chip, much less a full product line, and AMD/Nvidia are just so far ahead in terms of architecture. It’s basically uneconomical to catch up without a massive geopolitical motivation like there is in China.
It’s even better than that:
They all come from Taiwan Semiconductor (TSMC).
There used to be more of a split between many fabs. Then it was TSMC/Global Foundries/Samsung Foundry. Then it was TSMC/Samsung Foundry. Now AFAIK all GPUs are TSMC, with Nvidia’s RTX 3000 series (excluding the A100) being the last Samsung chip. Even Intel fabs Arc there, as far as I know.
Hopefully Intel won’t kill Arc, as they are planning to move it back to their fabs.
Fortunately, Microsoft is too incompetent to pull this off on Windows.
They tried; see the Metro app push in Windows 8 and later. But it’s kind of incredible how much they bungled it; even now, Windows would be totally dysfunctional with Win32 apps locked down.
And if Windows doesn’t do it, hardware makers aren’t really interested in that sort of thing.
Stuff like SteamOS does worry me a tiny bit. It’s obviously fine now, but I can see a future where, say, Valve (or any hardware seller with some kind of successful storefront) starts to not like rising competition on their own stuff.
Yeah, see, that makes sense. A random app and an optional account number are not reliable notification systems. They can’t just assume everyone will opt into those.
Because however one feels about blockchain tech and its future, past companies within the crypto industry are notorious for selling the moon, being shady, and cashing out early. Zcash appears to be a good example, particularly because a small group exerts such a high level of control over it.
And if the parallel holds, and at least some of that applies to Jay Graber’s own personal experience and expectations of what a company’s trajectory should look like, it doesn’t bode well for Bluesky.
But nothing is standard.
As an example from this last week, I tried to install something with a poetry install procedure… didn’t work. In a nutshell, apparently a bunch of stuff in poetry is ancient and doesn’t even work with this git repo anymore. Or maybe it was my system? I can’t tell.
So I tried uv. Worked amazingly… until I tried to run the project. Apparently some dependency of matplotlib uses Python C libraries in a really bizarre, nonstandard way, so the slight discrepancy broke an import, which broke the library, which broke the whole project on startup.
So I bit the bullet, cleared a bunch of disk space, and installed conda instead, the repo’s other official recipe. Didn’t freakin’ work out of the box either. I finally got it to work with some manual package version swapping, though.
And there was, of course, zero hope of doing any of this with actual pip, apparently.
At this point I wasn’t even excited to test the project anymore, and went to bed.
The character swapping really isn’t accomplishing much.
Speaking from experience, if I’m finetuning an LLM LoRA or something, bigger models will ‘understand’ the character swaps anyway, just like they abstract different languages into semantic meaning. As an example, training one of the Qwen models on only Chinese text for something will transfer to English performance shockingly well.
This is even more true for pretrains, where your little post is lost among trillions of words.
If it’s a problem, I can just swap words out in the tokenizer. Or add ‘oþer’ or even individual characters to the banned strings list.
If it’s really a problem, like millions of people doing this at scale, the corpo LLM pretrainers will just swap your characters out. It’s trivial to do.
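To give a sense of how trivial: a normalization pass like this toy sketch (real pipelines would fold in a full Unicode confusables table, not my four-entry mapping) undoes the swaps before anything ever reaches the tokenizer:

```python
# Toy sketch: map swapped characters back before tokenization.
# The table below is just an illustration; a real cleaning pass would use a
# full homoglyph/confusables list.
SWAPS = str.maketrans({
    "þ": "th",  # thorn, so "oþer" becomes "other"
    "ð": "th",
    "е": "e",   # Cyrillic 'е' posing as Latin 'e'
    "о": "o",   # Cyrillic 'о' posing as Latin 'o'
})

def normalize(text: str) -> str:
    """Undo simple character-swap obfuscation."""
    return text.translate(SWAPS)

print(normalize("oþer pеoplе"))  # -> "other people"
```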
In other words, you’re making life more difficult for many humans, while having an impact on AI land that’s less than a rounding error…
I’ll give you an alternate strategy: randomly curse, or post outrageous things, heh. Be politically incorrect. Your post will either be filtered out, or it will make life significantly harder for the jerks trying to align LLMs to be Trumpist Tech Bros, and filtering/finetuning that away is much, much more difficult than undoing character swaps.
Mobile 5090 would be an underclocked, binned desktop 5080, AFAIK:
https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_50_series
In KCD2 (a fantastic CryEngine game, and a great benchmark IMO) at QHD, the APU is a hair less than half as fast: 39 FPS vs 84 FPS for the mobile 5090:
https://www.notebookcheck.net/Nvidia-GeForce-RTX-5090-Laptop-Benchmarks-and-Specs.934947.0.html
https://www.notebookcheck.net/AMD-Radeon-8060S-Benchmarks-and-Specs.942049.0.html
There are synthetic benchmarks between the two on those pages as well.
But these are both presumably running at high TDP (150W for the 5090). Also, the mobile 5090 is catastrophically overpriced and inevitably tied to a weaker CPU, whereas the APU is a monster of a CPU. So make of that what you will.
“through phone if you have a phone on your water account, through a system no one knew existed”
I interpreted this as one system. So it’s:
Water website, you’d have to happen to stumble upon
Obscure opt-in phone system
If that’s the case, the complaint is reasonable, as the water service is basically assuming Facebook (and word of mouth) are the only active notifications folks need.
But yeah, if OP opted out of SMS warnings or something, that’s more on them.
+1 for immutable in general.