The bottleneck here is usually the locally hosted model, not the assistant harness. You can take any off-the-shelf assistant and point the model URL at localhost, but if your local model doesn't have enough post-training and fine-tuning on agentic data, it won't work. The AI Assistant/OpenClaw is just calling APIs in a for loop hooked up to a cron job.
Exactly. OpenClaw is good, but expects the model to behave in a certain way, and I've found that the local options aren't smart enough to keep up.
That being said, my gut says it should be possible to go quite far with a harness that assumes the model might not be very good (and hence double-checks, retries, etc.).
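Something like this sketch, say, where the harness validates the output and feeds failures back in. The endpoint URL, model name, and the JSON-only convention are all made up; any OpenAI-compatible server works the same way:

```python
# Sketch of a harness that tolerates a weak local model: validate the
# output and retry with the error fed back into the conversation.
import json
from openai import OpenAI

# Hypothetical local endpoint; point this at whatever server you run.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

def ask_with_retries(prompt: str, max_tries: int = 3) -> dict:
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_tries):
        reply = client.chat.completions.create(
            model="local-model",
            messages=messages,
        ).choices[0].message.content
        try:
            return json.loads(reply)  # double-check: did it follow the format?
        except json.JSONDecodeError as err:
            # Feed the failure back and let the model try again.
            messages.append({"role": "assistant", "content": reply})
            messages.append({
                "role": "user",
                "content": f"That wasn't valid JSON ({err}). Reply with JSON only.",
            })
    raise RuntimeError("model never produced valid output")
```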
If you mean something that calls a model you host yourself, then it's just a matter of making the call to the model, which can be done in a million different ways.
If instead you mean running that model on the same device as claw, well... that ain't happening on an ESP32...
I think if you're capable of setting up and running a locally hosted model, then I'd guess the first option needs no explanation. But if you're in the second case, I'd warn you that your eyes are bigger than your stomach and you're going to get yourself into trouble.
It really depends on what resources you have. qwen-code-next will run them, but you'll need at least 64 GB of memory to run it at a reasonable quant and context size.
Most of these agents support OpenAI/Anthropic-compatible endpoints.
All the assistants work with locally hosted models: Home Assistant LLM works with small tuned models to do specific things, and the *Claw stuff works with larger models.
Is Cinder something that could help optimize real-time streaming? We have a UDP stream, and through a pile of GStreamer and NVIDIA DeepStream magic (which I believe the senior dev implemented in Python) we perform some ML inference on the stream in real time.
However, latency is a major issue here and to get to our MVP we didn't really prioritize optimization, as is tradition.
So now I'm wondering whether using Cinder to optimize real-time data streaming is a thing, or whether my asking this just shows I don't understand its use case.
Either way, thank you in advance for your insight.
(Also, we used Django; I'm now wondering whether I should have swapped it out for FastAPI, but that's a separate question.)
Cinder's feature set is highly optimized for IO-bound web services that run under a forked-worker model.
For example: you start a main process, warm it up with a few requests, let the JIT compile the hot paths, and then fork off worker processes to handle the main chunk of traffic.
As of now, it requires hand-tuning to get the best possible performance.
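In plain CPython/Unix terms, the warm-then-fork flow described above looks roughly like this. Nothing here is Cinder-specific API; under Cinder the point of the warm-up is that the JIT compiles the hot functions in the parent, and the forked workers share that compiled code copy-on-write:

```python
# Sketch of the warm-then-fork pattern (Unix only).
import os
import socket

NUM_WORKERS = 4

def business_logic(payload: bytes) -> bytes:
    # Stand-in for the real handler, i.e. the code the JIT should compile.
    return b"HTTP/1.1 200 OK\r\ncontent-length: 2\r\n\r\nok"

def main() -> None:
    # 1. Warm up: exercise the hot paths so they get compiled here,
    #    in the parent, before any worker exists.
    for _ in range(1000):
        business_logic(b"GET / HTTP/1.1\r\n\r\n")

    # 2. Fork workers; children inherit the warmed-up process image.
    server = socket.create_server(("127.0.0.1", 8080), reuse_port=True)
    for _ in range(NUM_WORKERS):
        if os.fork() == 0:          # child: serve traffic forever
            while True:
                conn, _ = server.accept()
                conn.sendall(business_logic(conn.recv(4096)))
                conn.close()
    for _ in range(NUM_WORKERS):    # parent: just supervise
        os.wait()

if __name__ == "__main__":
    main()
```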
In terms of use cases, Cinder does the best when faced with "business logic" code (lots of inheritance, attribute lookups, method calls, etc). It can speed up numerical computations too, but you're probably better off using a library if that's the majority of the workload.
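To give a feel for that "business logic" shape, a toy example (made-up classes; it's the attribute lookups and polymorphic method calls that a JIT can specialize):

```python
# Toy "business logic": attribute lookups plus a polymorphic method
# call, exactly the dynamic dispatch a JIT can cache and specialize.
class Discount:
    rate = 0.0
    def apply(self, price: float) -> float:
        return price * (1 - self.rate)

class MemberDiscount(Discount):
    rate = 0.1

class Order:
    def __init__(self, price: float, discount: Discount) -> None:
        self.price = price
        self.discount = discount

    def total(self) -> float:
        # In the interpreter, self.discount.apply is re-resolved on
        # every call; a JIT can cache the lookup and inline the call.
        return self.discount.apply(self.price)

print(Order(100.0, MemberDiscount()).total())  # 90.0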
Unlike RPython, Static Python in Cinder is not really a subset of Python: it can compile everything (although it will throw compile-time errors if it sees mismatched types). If it can't determine type information, it just assumes the type could be anything and falls back to slower CPython behavior.
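From memory of the Cinder docs (the exact opt-in spelling and error behavior may have drifted since), a Static Python module is just ordinary annotated Python:

```python
import __static__  # opts this module into Static Python compilation (Cinder only)

def scale(values: list, factor: int) -> list:
    # Known types let the compiler emit direct, specialized operations.
    return [v * factor for v in values]

def untyped(x):
    # No annotations: the compiler assumes x could be anything and falls
    # back to ordinary (slower) CPython semantics for this function.
    return x + 1

# scale("oops", 2)   # mismatched types like this are compile-time errors
```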
> So much stuff just from the readme would introduce breaking changes to the Python ecosystem.
Being compatible with the rest of the Python ecosystem is the main reason why Cinder is built on top of CPython. Although yes, some features are indeed very experimental.
> in a world where we have type annotations, JITs feel like a massive step back. Stuff like mypyc could get us way further into high performance stuff
Ah, but that introduces a separate compilation step, which may not be tolerable in every situation.
Why would developers have to interact with a mypyc step any more than the .pyc step? Why is “developers might have to interact with it” some kind of non-starter, as though having a compile phase is a worse evil than a hyper-slow language?
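To make the comparison concrete, the basic mypyc workflow on a made-up module looks like this; the compiled extension shadows the .py file, so nothing changes at the import site:

```python
# fib.py: ordinary type-annotated Python, no special syntax required.
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Ahead-of-time step (shell):  mypyc fib.py
# This produces a C extension (fib.<abi>.so) next to fib.py; extension
# modules take import precedence, so callers still just `import fib`.
```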
FWIW, I think we could probably buy ourselves a lot of latitude to optimize CPython by designating a much smaller C API surface (like HPy), and then optimizations largely won't have to worry about breaking compatibility with C extensions (which seems to be the biggest reason CPython is unoptimized).
But in general I’ve lost faith in the maintainers’ leadership to drive through this kind of change (or similarly, to fix package management), so I’ve moved on to greener pastures (Go for the most part, with some Rust here and there) and everything is just so easy nowadays compared to my ~15 years as a Python developer.
> Why is “developers might have to interact with it” some kind of non-starter, as though having a compile phase is a worse evil than a hyper-slow language?
For big monoliths (like ours at IG), server start-up can take more than 10 seconds, which is already painfully slow for an "edit -> refresh" workflow. Introducing a Cython-like compilation step would be a major drawback for every single developer.
For smaller projects, Cython works extremely well (and we do use it for places where we need to interface with C/C++).
> For big monoliths (like ours at IG), server start-up can take more than 10 seconds, which is already painfully slow for an "edit -> refresh" workflow. Introducing a Cython-like compilation step would be a major drawback for every single developer.
So we weren't talking about Cython specifically, but about something Cython-like, i.e., ordinary Python rather than Cython's special syntax. This is important because it means dev builds execute against CPython directly (your code begins executing immediately), while production builds use our hypothetical AOT compiler.
Yes, Static Python especially relies heavily on strict modules, since they let us perform module-local analysis, which in turn enables some cool optimizations.
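For reference, as I understand the Cinder docs (sketch, details may be off), a strict module is an opt-in marker plus a ban on import-time side effects, which is what makes module-local analysis sound:

```python
import __strict__  # opts this module into strict-module checking (Cinder only)

# Top-level code must be side-effect free and deterministic; that is
# what lets the compiler reason about the module in isolation.
LIMIT = 100

def within_limit(n: int) -> bool:
    return n <= LIMIT

# open("/tmp/limits", "w")   # import-time side effect: rejected
```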