More

syntaxing · 2026-05-07T18:12:44 1778177564

How has opencode go been for you? Worth changing over from Claude pro?

DefineOutside · 2026-05-07T18:44:27 1778179467

I've found that opencode and codex are the two subscriptions that still seem to subsize usage. Deepseek V4 has been the most powerful model in opencode IMO, I trust it with problems where I can validate the solution such as debugging an issue - but I only trust the proprietary GPT-5.5 and Claude Opus 4.7 models for writing code that matters.

amunozo · 2026-05-07T19:09:50 1778180990

Given the price, extremely satisfied, especially thanks to DeepSeek V4 Flash that makes it last forever. I use it on top of my 20$ Codex which is great but tokens last nothing.

syntaxing · 2026-05-03T19:29:10 1777836550

I’m guessing it’s just haloW without the licensing requirements.

refulgentis · 2026-05-03T20:00:54 1777838454

Gonna reply here, but this isn't about you or this post:

HN has a lot of us that have ~0 idea what you'd use this for, even when we steelman, all we can do is vaguely handwave about easier to setup wireless internet on a vast compound we own.

Would be really cool if someone could hop in and just give a couple one off examples, i guess? Only other one handwave I can think of is IOT x assembly line stuff for businesses, but I'm real curious why individuals are so into it -- or maybe they're not, and that's why the codebase quality is so poor? Idk.

nunobrito · 2026-05-03T20:48:42 1777841322

You'll read a lot of illusions and wishful intentions.

In the end: LoRa is only good for very short text messages at somewhat long distance (up to 10km without special setup) and without bad conditions (obstacles on line of sight, rain/fog). There is an ongoing fight between each of the two frequencies to be used as default and this publication adds another frequency into the battle.

There is WiFi HaLow, a relatively new WiFi protocol which seems to solve the low bandwidth issues with LoRa on relatively confortable distance (likely up to 8km, same as with LoRa in regards to Line of Sight), albeit slightly less affected by weather conditions. The advantage here is permitting to send images and binary data in general, but think about something being sent at the speed levels from 2005 (which in any case is good speed for most usable things).

Then there are other relevant mesh protocols yet to mention here like ESPnow which is my personal favorite. Whereas the other two options above are exotic and with transceivers around the 50 EUR and above. With ESPnow you just need any cheap ESP32 embedded device with an optional antenna to increase range for about 3 EUR (antenna included). With that you get similar returns to WiFi HaLow with less range (about 3 kilometers max on my experiments) but cheap like heck.

To setup internet on a vast compound, WiFi HaLow might be a good investment. If you are with a constrained budget, then ESP32 is your friend. To remember, long distance is limited so if you are considering more than 8 devices exchanging heavy data, you should just go for proper WiFi long range transmitters.

refulgentis · 2026-05-03T22:17:34 1777846654

Cheers, there's nothing more valuable than an opinionated overview from someone who groks the domain

chocrates · 2026-05-03T20:12:48 1777839168

Assuming you mean mesh in general: Meshtastic like projects

- emergency communication

- low power data transfer for sensors

- low data rate data transfer for mobile groups. Air softers use it to transmit information to each other while playing.

HaLow:

- "high" data rate over shorter range, though much higher range than 2.4 wifi - data sharing between mobile groups like above, but high enough bandwidth for low quality video

- large area wifi deployments

jakeydus · 2026-05-03T21:33:25 1777844005

I build environmental and structural sensor networks for work and this has my wheels spinning, but honestly I can’t think of many uses for the additional bandwidth. You could packet additional metadata maybe? GPS or network info? I’ll get one and play with it but off the top of the dome I think sub-Ghz is sufficient for most everything I do.

syntaxing · 2026-05-03T19:26:50 1777836410

It sucks how everything feels like a toy. I think meshtastic is the closest thing to a “product”. They made a bunch of bad architectural decisions that are haunting them now like how nodes broadcast its info.

api · 2026-05-03T19:57:32 1777838252

It doesn't surprise me. This is a deep networking problem and very few CS people know anything about networking or how to design clean, fast, low-overhead network protocols and systems.

If IP were designed today the packets would have 500+ bytes of plain text JSON as headers and the spec would support hundreds of extensions.

chocrates · 2026-05-03T20:08:09 1777838889

Is there a better designed mesh project like those two getting built that you know of? Reticulum?

pocksuppet · 2026-05-04T00:44:43 1777855483

It's a fundamentally really hard problem that looks easy on the surface. There is no solution that works well beyond the small scale. Many people have tried. It's the same kind of thing that draws people to try to write IPv8.

syntaxing · 2026-05-03T22:58:56 1777849136

Yeah, openmanet with reticulum seems the most “professional” right now

chocrates · 2026-05-03T23:18:04 1777850284

Heh nice, I have 4 openmanet nodes on HaLow right now

kay_o · 2026-05-03T22:42:45 1777848165

Have you seen that IPvwhatever proposal from a handful of weeks back that has OAuth/OIDC in packet spec

cr125rider · 2026-05-04T03:43:08 1777866188

7 OSI layers were too many. What if we ONE BOG ONE!

Gigachad · 2026-05-03T22:56:55 1777849015

Because they are toys. For real work it makes so much more sense to use the internet. With the new satellite tech you can reach the internet everywhere.

Mesh radio is a fun way to chat with radio nerds in your area. Not a serious infrastructure.

KingMachiavelli · 2026-05-04T04:23:42 1777868622

So what’s the real solution for when Starlink is too expensive and too high power? I really want to solution for remote mountaineering communication that’s not just GMRS. And what about remote weather sensors? I really don’t need a full internet connection just to send a tiny payload every 5 minutes.

Meshtastic should be the obvious answer for this but in my limited experience the app(s) and code are buggy on even the most typical hardware. Wish it wasn’t the case but it is.

rcoder · 2026-05-04T15:22:49 1777908169

How remote is "remote"?

If you're talking about a few miles/KMs between nodes, plain old LoRaWAN might be more than sufficient, esp. for the sensor use case. The nice thing about using LoRaWAN is that's it's literally providing an IPv6 overlay so you can run e.g. MQTT or a text-based messaging protocol designed for regular TCP/IP use. UDP is preferable to avoid frequent session resets and keepalive traffic chewing up your available bandwidth.

Meshtastic and MeshCore can theoretically provide "infinite" range so long as there are peers between the nodes you want to connect. Theoretically, mobile peers can also serve as store-and-forward nodes so that reachability doesn't need to be constant, just frequent enough to handle the messaging you want to do.

I would absolutely not rely on either for a safety-critical application, though. If you want emergency comms in case something happens while you're out on the mountain, use a satellite communicator. There are a ton of these marketed for outdoor/portable use, and they have much more robust "SOS" capabilities (up to and including direct dispatch of search-and-rescue).

KingMachiavelli · 2026-05-04T22:02:13 1777932133

LoRaWAN seems interesting but the documentation and availability of is either "Crypto hobby project from Seedstudio" or "Strange telecom companies selling $900 base stations that still expect an internet connection (for licensing?)". Maybe I'm missing something but the LoRaWAN doesn't see to sell itself very well when half the vendors are behind "Contact for quote" pages.

Of course, for real emergencies I have a Garmin SOS device. It would just be "nice" to have something for local 2-5 km communication that doesn't need a clear view of sky, works partially underground, etc. GMRS is "fine" but from a physics perspective a digital signal with Chirp encoding should go further and be more reliable.

Seems like JS8Call or Packet radio might more in line with what I want. It's just surprising that something like Meshtastic hasn't replaced them.

zbentley · 2026-05-05T03:11:30 1777950690

You said it yourself:

> Of course, for real emergencies I have a Garmin SOS device.

that's why the mesh radio/LoRaWAN-type ecosystems suck. I don't mean to be rude or snarky; just to point out a very contextually-relevant example against your argument.

For the average consumer who needs this functionality seriously, there's a proprietary (and often costly) solution. Subtract those mission-critical-remote-comms devices and you're left with hobbyist needs, so you get hobbyist-quality ecosystems.

rocqua · 2026-05-04T16:39:31 1777912771

Is there any implementation of the store and forward for mobile nodes?

From what I recall, meshcore de duplication only tracks like the last 256 messages so that could quickly fail to de duplicate.

wtallis · 2026-05-04T18:07:33 1777918053

Meshtastic supports store and forward for ESP32 nodes that have a few MB of RAM, but not for the nRF52 devices that can't practically buffer much. I've only used the latter class of devices, so I don't have any experience with how well Meshtastic's store and forward works in practice.

Gigachad · 2026-05-04T04:30:14 1777869014

Depends what exactly it is you want. But phones these days can communicate with satellites for emergency messaging.

I think people need to think more about what the actual scenario they have in mind is because it seems most people think of mesh radio as some backup for the government shutting the internet down. When in reality it’s almost useless for that since it’s so easy to jam or flood mesh radio.

subscribed · 2026-05-04T16:52:43 1777913563

For emergency communication? Iridium, zoleo, JS8Call, packet radio.

Not LORA.

__MatrixMan__ · 2026-05-03T23:10:17 1777849817

We may see a day when the internet is not available, or when interacting with it represents an unacceptable risk. It's a good idea to know how to set up your own.

Gigachad · 2026-05-03T23:29:28 1777850968

In that day whatever is jamming starlink will just jam mesh radio too. It'll likely be even easier.

andwur · 2026-05-04T03:53:33 1777866813

It's a different jamming scenario however. Starlink is comparatively centralised, and reliant on both terrestrial (ground stations) and satellite communication. While the terminals themselves are sparse and widely distributed, the backbone infrastructure is far less so. It's possible to target the satellites, ground stations and critical service dependencies (e.g. GPS) rather than needing to target the hundred of thousands/millions of terminals directly.

The mesh networks are dealing with, by definition, a sparse and widely distributed set of devices which are independently configured and controlled, and in their current widely available form are only dealing with terrestrial communication. Without that point of centralisation you would need to focus on targetted regional jamming, as from a practical standpoint you cannot perform wideband RF jamming over an entire country - signal jammers don't scale that well, and geographic features come into play. As an example you might effectively block mesh networks from operating reliably in a given city, but if people were to move outside of that area then the mesh would operate again. Geography is both a strength and a weakness here: a mountain range will impede direct communication with someone on the other side, but it will also have the same effect on jammers which will vastly increase the cost to deploy them in a ubiquitous fashion.

Gigachad · 2026-05-04T04:48:51 1777870131

I suspect jamming LoRa could be a lot easier than most radio though. LoRa signals are incredibly weak and long range. A jammer which jams at a massively higher power level could cover a massive area. You can also just flood the network with messages that nodes will happily relay further for you.

rcoder · 2026-05-04T15:29:32 1777908572

That's a DoS attack, not "jamming". RF jamming usually relies on flooding frequencies with garbage which doesn't get interpreted as valid protocol traffic but does "crowd out" legitimate use.

The protocol-aware class of attack you describe does require some knowledge of the radio parameters being used, since LoRa runs on very narrow bands and uses both time and frequency-hopping to avoid congestion on any one virtual channel. They even apply (very basic) encryption to messages to prevent unknown senders from flooding the channel.

Unfortunately, both systems come preconfigured out of the box to use a default configuration which most users never override. So like cheap FRS/GMRS walkie talkies, all it takes is a few jerks who don't care about common use to overwhelm everyone with bogus messages. If you fire up a new device running the default Meshtastic firmware in any kind of dense urban environment, odds are it will more or less immediately get inundated with spam: "ping", "test", "hello from <neighborhood>", etc.

And since MT + MC both flood the shared channels to push messages across intermediary nodes, they pretty much self-DDoS by doing...nothing.

api · 2026-05-04T01:32:07 1777858327

That’s really the killer for survivalist mesh ideas. It’s trivially easy to jam, and if it’s open it’s also easy to DDOS.

Jamming is done in military scenarios too, but in that case it’s limited by the fact that a jammer is a big transmitter painting itself with a big sign that says “fire missile here.” Civilian mesh doesn’t have that fallback.

nostrademons · 2026-05-04T04:12:31 1777867951

Neglect is a bigger killer than active denial. If the Internet goes down it will likely be because a few execs decided to replace competent network admins with AI, or because all the competent network admins decided to quiet-quit because they aren't being paid jack compared to the folks hawking AI vaporware.

samplifier · 2026-05-04T03:55:19 1777866919

Battlestar Galactica opened my eyes to this problem more than electronic warfare in games of the day did. It's freaky (read: terrifying) that we're getting to a point that people are starting to take "embedded information (and decision)" systems serious enough to deploy them into meat space.

__MatrixMan__ · 2026-05-06T23:03:48 1778108628

Probably not short range connections. The application layer will have to change but we can still have an internet that operates when we pass each other on the street or share an elevator--the primary bandwidth carrier being devices being physically moved through space, and cross-device chatter being opportunistic.

Also, it might not be jamming. It might be that whoever is operating the satellites at the time denies access unless you enable inspection, and then sells that info to somebody who would hurt you--or whatever other can't-trust-the-middleman dystopia you care to imagine.

arashThr · 2026-05-05T10:36:38 1777977398

True. But look at the situation in Iran. As much as internet seems like an essential part of daily life, there is the possibility for the governments to shut it down.

nubinetwork · 2026-05-04T01:17:43 1777857463

> not a serious infrastructure

I've been tinkering with the tech to make city-wide flrc meshes joined together over the internet, my estimates are that it should be at least able to support thousands of users per region.

Gigachad · 2026-05-04T04:33:55 1777869235

This has been tried with mqtt bridges in Meshtastic. But it’s ultimately kind of pointless because if you are planning some kind of internet alternative, you don’t want to build something that falls over the moment the internet goes down.

nubinetwork · 2026-05-04T04:39:34 1777869574

I know, I'm not too worried that I can't reach Billy in Ottawa, but you should still be able to text your mother six blocks away. /shrug

Gigachad · 2026-05-04T05:21:01 1777872061

That works with just basic mesh radio. The internet bridges thing is tempting but ultimately a bit useless and doesn't push people to extend the mesh natively.

nubinetwork · 2026-05-04T09:43:12 1777887792

Don't get me wrong, I like the mesh/* ideas around everyone being able to prop up a router/repeater, but I've seen what that can do in an urban environment... unfortunately for some, I don't plan on letting every tom dick and harry to set up their own towers.

mschuster91 · 2026-05-03T21:09:46 1777842586

Usability wise Meshcore is better due to static routing and enabling (far) longer paths.

syntaxing · 2026-05-03T19:21:56 1777836116

I know it’s all open source and I’m not paying for anything so I cant be choosy. But after playing with a bunch of Lora peer to peer chat systems. All I wish is a chat service that uses haloW. Since it uses wifi backend, regular wifi should work as well.

syntaxing · 2026-04-30T20:22:05 1777580525

Dad and millennial here and this change has been very noticeable in my circle of friends including myself and I’m all for it. Men have been doing their share of housework too. But I will say, it’s not all dads but enough that I think this will have a positive effect on the next generation.

justonceokay · 2026-04-30T20:51:45 1777582305

Im gay and because of that was disowned. My partner has a brother “K” and K has three children. Watching K show up in basic ways for his kids, like remembering what songs they like and teaching them sports is the fastest way to make me ugly cry.

Thanks to anyone reading this if you’re trying to be a good dad. You’re making the world a better place in ways you don’t even see

pchristensen · 2026-05-01T15:38:37 1777649917

This is the rare time I wish HN had emoji reactions instead of just upvotes.

mekdoonggi · 2026-05-01T14:35:48 1777646148

I can honestly say that I don't have any time for a dad who isn't all-in for their kids. I understand if the responsibilities aren't 50/50, but if you're making mom handle everything I think you're a loser.

All my millennial dad friends clean, change diapers, cook, whatever. And make no mistake all the moms are incredibly hard-working and involved with the kids.

If I happened to meet socially a dad who wasn't doing those things I would literally make fun of them. "You're a grown man who can't change a diaper or clean a bathroom?"

grvdrm · 2026-05-01T16:12:28 1777651948

I’m with you mostly. Some different specifics but the point in mind is this: it’s a common thread of rapport and conversation. I sometimes feel like an alien on earth when I spend time with friends or other groups where there seems to be a atrong “ughh my family and home life” vibe.

mekdoonggi · 2026-05-01T17:15:37 1777655737

I said hello to another dad at soccer for three-year-olds, and he responded with something like, "Ugh, I'd rather be ANYWHERE else".

It's 10am on a Saturday and you're running around playing games with your kid. I just stared at him and went on.

JuniperMesos · 2026-05-01T23:41:45 1777678905

I was strongly encouraged by my own parents, particularly my dad, to play sports (baseball, a bit of basketball) as a kid; even though I wasn't very good at them and wasn't very interested in them (and got made fun of by other kids for this). At some point I realized that me playing sports was something my dad was more invested in than I was. When I was 11 or so, I finally decided that I had had enough, and quit the neighborhood little league baseball team I was on in the middle of the season; I suspect the team was happy to have me gone, and I was happy that trying to play baseball was no longer my problem. Suffice to say, I have no happy memories of playing catch with with my dad at any time in my life.

My younger siblings were a bit more intrinsically interested in sports than I was, and my parents shifted their attention to their sports extracurriculars. I actually don't really remember what they did sports-wise because I did not care at all; and although I was the older sibling I was not so much older that anyone thought it was important to encourage me to take a pseudo-parental or caretaker interest in what my younger siblings were doing. I would go to the baseball field where one brother played his games because my parents were going, and then amuse myself by playing alone in the dirt beyond the bleachers, because that was more fun than paying attention to the game. By the time I was old enough to, say, drive them places in lieu of our mom, they had gotten to the age where sports were meaningfully competitive and were not actually good enough to keep playing.

So not only do I find this dad's attitude extremely sympathetic, I think that I would've found it sympathetic even when I myself was a child. This makes me some kind of outlier, I'm sure. Anyway, 3 years is young enough that there's no actual soccer happening, just running around with a ball, any kid can enjoy that. It's quite possible that, depending on the interests and dispositions of his kid, that dad won't be compelled to be on a soccer field at 10am much further in the future.

grvdrm · 2026-05-01T17:28:19 1777656499

Exactly.

My older daughter is on a competitive cheerleading team. Not something we (parents) suggested but instead she found through school friends. She loves it. Has boosted her confidence and athletic prowess.

There aren't many dads at the meets relative to moms. Not remotely surprising. I'm the first person to admit that I don't know how to do hair or make up.

I see quite a divergence among the men in commentary. Some are there and happy their kids are loving it - they're finding a way to make peace with the situation. Some are checked out, on phones, looking grumpy at best.

Some part of me gets it. Wild asymmetry in that sport. Performances are just a few minutes long, but there's a shit-ton of practice and weekend days/entire weekends dedicated to cheer.

It would be so so so easy to say "get me out of here" but I've found a way to enjoy and make peace and make a friend or two along the way.

Contrast with her other current sport: lacrosse. First season and it's kind of a shit-show. But I'm with her in the sun on a Friday night - and with the right weather - it is a great place to be. We (parents, dads, etc.) see our friends there too.

syntaxing · 2026-04-30T03:17:47 1777519067

I always wondered if Justin Kan’s Atrium closed door prematurely by just 2-3 years. It would have been cool to see a “technology” driven law firm and how it would have adjusted to LLMs.

alansaber · 2026-04-30T10:13:10 1777543990

There are loads of them now. Great for trivial work. Not so great to highly templatise more complex matters.

syntaxing · 2026-04-29T17:00:20 1777482020

This is a very interesting strategy that might pay off. This model is a very good option for enterprise self host. I would argue a lot of companies are VRAM constrained rather than compute constrained. You could fit 4-5 running instances on one H100 cluster where you can only fit 1-2 Kimi K2 or GLM5.

2001zhaozhao · 2026-04-29T17:47:50 1777484870

This is 128B dense though. the K/V cache on long context is going to be massive

Havoc · 2026-04-29T18:30:49 1777487449

Don’t think kv size correlates to dense/moe

zozbot234 · 2026-04-29T18:46:47 1777488407

KV size correlates with attention parameters which are a subset of active parameters. So a typical MoE model will have way lower KV size than a dense model of equal total parameter count.

syntaxing · 2026-04-29T19:22:03 1777490523

With turbo quant, you would reduce it by over 6X.

syntaxing · 2026-04-29T03:00:37 1777431637

Is RAG dead? I would be very surprised a local small SOTA embedded model like llama-embed-nemotron-8b doesnt outperform the Haiku layer for this application. Should be pretty cheap and easy to prove out. With 32K context size, you can literally one shot the whole ticket.

preommr · 2026-04-29T03:16:28 1777432588

Yea, but RAG takes effort. At the very least some kind of system to organize the documents and do the retrieval.

My theory is that the AI frenzy has reached new levels of insane, where it's literally just throw anything and everything at the model, and just burn tokens to let the AI figure everything out. Why bother paying the upfront cost for a RAG, when the models/agents are constantly evolving, so just slap in a markdown file telling it to check a folder, and call it a day.

Like in design world, people are doing minor tweaks like changing the spacing by typing in prompts instead of just changing a number in an input field. We are legitimately approaching just using llms instead of calculators, or memes like that endpoint that calls an llm to generate the code to do some business logic, rather than directly code the logic.

shad42 · 2026-04-29T03:32:01 1777433521

IMO RAG is mostly dead. The game changer with newer models like Opus is the reasoning. So instead of pushing all the context up front (RAG style), it's better to give strong primitives (eg. bash, SQL) and let the agent figure it out.

It's what Claude Code is doing now and the principles we applied for Mendral as well.

That said, you're right that some smaller models can outperform Haiku and we're thinking supporting oss models at some point. But it does not change the core design principles IMO.

Salgat · 2026-04-29T16:25:53 1777479953

It's more accurate to say that RAG is alive and well and is just incorporated into the agent's responsibility, it's just one more tool that it can call on instead of the user manually doing it.

syntaxing · 2026-04-22T16:45:26 1776876326

Been using Qwen 3.6 35B and Gemma 4 26B on my M4 MBP, and while it’s no Opus, it does 95% of what I need which is already crazy since everything runs fully local.

FuckButtons · 2026-04-22T17:47:58 1776880078

It’s good enough that I’ve been having codex automate itself out of a job by delegating more and more to it.

Very excited for the 122b version as the throughput is significantly better for that vs the dense 27b on my m4.

Someone1234 · 2026-04-22T19:07:47 1776884867

You've got me curious. Two questions if I may:

- What kind of tasks/work?

- How is either Qwen/Gemma wired up (e.g. which harness/how are they accessed)?

Or to phase another way; what does your workflow/software stack look like?

syntaxing · 2026-04-22T19:23:24 1776885804

1. Qwen is mostly coding related through Opencode. I have been thinking about using pi agent and see if that works better for general use case. The usefulness of *claw has been limited for me. Gemma is through the chat interface with lmstudio. I use it for pretty much everything general purpose. Help me correct my grammar, read documents (lmstudio has a built in RAG tool), and vision capabilities (mentioned below, journal pictures to markdown).

2. Lmstudio on my MacBook mainly. You can turn on an OpenAI API compatible endpoint in the settings. Lmstudio also has a headless server called lms. Personally, I find it way better than Ollama since lmstudio uses llama cpp as the backend. With an OpenAI API compatible endpoint, you can use any tool/agent that supports openAI. Lmstudio/lms is Linux compatible too so you can run it on a strix halo desktop and the like.

ycombinatornews · 2026-04-23T02:53:06 1776912786

Curious how do you run opencode and qwen locally? Few times I tried it responds back with some nonsense. Chat, say, through ollama works well.

syntaxing · 2026-04-23T12:11:12 1776946272

Which quants are you using? I had similar issue until I used Unsloth’s. I would recommend at least UD_6. Also, make sure your context length is above 65K.

https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF

Someone1234 · 2026-04-22T20:45:33 1776890733

Thanks I appreciate the info. I may try to spin up something like this and give it a whirl.

anon373839 · 2026-04-22T22:47:36 1776898056

I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool calling scenarios MUCH faster.

felikz · 2026-05-04T21:40:57 1777930857

and it horribly kernel panics when it is running for too long due to Apple does not give a sh over mlx, see list of issues: https://github.com/Harperbot/metal-guard#landed-here-searchi...

throwaw12 · 2026-04-22T16:50:57 1776876657

can you expand more on what you mean by 95%?

There are 2 aspects I am interested in:

1. accuracy - is it 95% accuracy of Opus in terms of output quality (4.5 or 4.6)?

2. capability-wise - 95% accuracy when calling your tools and perform agentic work compared to Opus - e.g. trip planning?

syntaxing · 2026-04-22T17:11:19 1776877879

1. What do you mean by accuracy? Like the facts and information? If so, I use a Wikipedia/kiwx MCP server. Or do you mean tool call accuracy?

2. 3.6 is noticeably better than 3.5 for agentic uses (I have yet to use the dense model). The downside is that there’s so little personality, you’ll find more entertainment talking to a wall. Anything for creative use like writing or talking, I use Gemma 4. I also use Gemma 4 as a “chat” bot only, no agents. One amazing thing about the Gemma models is the vision capabilities. I was able to pipe in some handwritten notes and it converted into markdown flawlessly. But my handwriting is much better than the typical engineer’s chicken scratch.

throwaw12 · 2026-04-22T17:15:03 1776878103

by accuracy I meant how close is the output to your expectations, for example if you ask 8B model to write C compiler in C, it outputs theory of how to write compiler and writes pseudocode in Python. Which is off by 2 measures: (1) I haven't asked for theory (2) I haven't asked to write it in Python.

Or if you want to put it differently, if your prompt is super clear about the actions you want it to do, is it following it exactly as you said or going off the rails occasionally

syntaxing · 2026-04-22T17:24:07 1776878647

Ironically, even though I write C/++ for a living, I don’t use it for personal projects so I can’t say how well it works for low level coding. Python works great but there’s a limit on context size (I just don’t have enough RAM, and I do not like quantizing my kv cache). Realistically, I can fit 128K max but I aim for 65K before compacting. With Unsloth’s Opencode templating, I haven’t had any major issues but I haven’t done anything intense with it as of late. But overall, I have not had to stop it from an endless loop which happened often on 3.5.

physicles · 2026-04-22T22:35:37 1776897337

I have a Supernote and was looking at different models for handwriting recognition, and I agree that gemma4-26B is the best I’ve tried so far (better than a qwen3-vl-8B and GLM-OCR). Besides turning off thinking, does your setup have any special sauce?

syntaxing · 2026-04-23T00:53:27 1776905607

Q8 or Q6_UD with no KV cache quantization. I swear it matters even more with small activated parameters MOE model despite the minimal KL divergence drop

richstokes · 2026-04-22T20:14:20 1776888860

Do you use it with ollama? Or something else?

syntaxing · 2026-04-22T22:04:00 1776895440

Llama cpp is vastly superior. There was this huge bug that prevented me from using a model in ollama and it took them four months for a “vendor sync” (what they call it) which was just updating ggml which is the underpinning library used by llama cpp (same org makes both). lmstudio/lms is essentially Ollama but with llama cpp as backend. I recommend trying lmstudio since it’s the lowest friction to start

syntaxing · 2026-04-22T16:42:01 1776876121

Yes and no. Are you using open router or local? Are the models are good as Opus? No. But 99% of the time, local models are terrible because of user errors. Especially true for MoE, even though the perplexity only drops minimal for Q4 and q4_0 for the KV cache, the models get noticeably worse.

acidtechno303 · 2026-04-22T16:59:54 1776877194

Sounds like you're accusing a professional of holding their tool incorrectly. Not impossible, but not likely either.

syntaxing · 2026-04-22T17:16:02 1776878162

Inferencing is straight up hard. I’m not accusing them of anything. There’s a crap ton of variables that can go into running a local model. No one runs them at native FP8/FP16 because we cannot afford to. Sometimes llama cpp implementation has a bug (happens all the time). Sometimes the template is wrong. Sometimes the user forgot to expand the context length to above the 4096 default. Sometimes they use quantization that nerfs the model. You get the point. The biggest downside of local LLMs is that it’s hard to get right. It’s such a big problem, Kimi just rolled out a new tool so vendors can be qualified. Even on openrouter, one vendor can be half the “performance” of the other.