Total gimmick. I guess we're "making progress", but this will never lead to any useful application other than "Yes, you're absolutely right" bots. What's needed for real applications is 10000× the input token context and 10× the output token speed, so we're off by a factor of ... 100,000×?
Correct. Also, as the context grows, conversations cannot continue at the initial speed either. Gimmick or not, this is very sci-fi compared to 10-20 years ago.
It's fast. It's cheap compared to employees. It's really the latter that people are upset about.
As for good. Well, how much software is really good? A lot of it is sewn-together APIs and Electron-like runtimes and 5,000 dependencies someone else wrote. Not exactly hand-crafted and artisanal.
I'm sure everyone's projects here are the exception, but engineering is always about meeting the design requirements. Either the result meets them or it doesn't.
Have you ever programmed with AI? It needs a lot of hand holding for even simple things sometimes. Forgets basic input, does all kinds of brain dead stuff it should know not to do.
Both the curl and the SQLite project have been overburdened by AI bug reports.
Unless the Google engineers take great care to review each potential bug for validity, the same fate might apply here. There has been a lot of news about open source projects being stuffed to the brim with low-effort, high-cost merge requests or issues.
You just don't see all the work that is caused unless you have to deal with the fallout...
This project has nothing to do with bug reports... it's an opt-in tool for reviewing proposed changes that kernel developers can decide to use (if they find it useful).
There are lots of citruses missing; the ones in the chart are only the ones I could find reliable values for (from the sources at the bottom). I'll add more if I can find other reliable sources. For what it's worth, I think the etrog is basically a pure citron variety.
Yeah, that's definitely an issue. If I get a chance, I'll curate images to add!
I assume the esrog is the primeval citron, but I've noticed that Jewish tradition (which rejects the use of hybrid citrons) allows some surprisingly different citrons in practice, popularly associated with Israel, Morocco, Yemen, Corfu, etc. These differ considerably in, e.g., rind thickness.
Which language are you thinking of? Ideally, how would you identify split points in this language?
I suppose we've only tested this with languages that do have delimiters: Hindi, English, Spanish, and French.
There are two ways to control the splitting point. First is through delimiters, and the second is by setting chunk size. If you're parsing a language where chunks can't be described by either of those params, then I suppose memchunk wouldn't work. I'd be curious to see what does work though!
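To make the two knobs concrete, here's a rough sketch in Python of how a delimiter plus a chunk size can interact. This is not memchunk's actual API or implementation; `chunk_text`, `chunk_size`, and `delimiters` are names made up for illustration, and sizes are counted in characters rather than bytes to keep it short.

```python
def chunk_text(text: str, chunk_size: int, delimiters: str = ".?!\n") -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper: each chunk ends just after the last delimiter
    found inside its window, or at the size limit if there is none.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    chunks = []
    pos = 0
    while pos < len(text):
        end = min(pos + chunk_size, len(text))
        if end < len(text) and delimiters:
            # Search backwards inside the window for the last delimiter.
            cut = max(text.rfind(d, pos, end) for d in delimiters)
            if cut >= pos:
                end = cut + 1  # keep the delimiter with its chunk
        chunks.append(text[pos:end])
        pos = end
    return chunks


if __name__ == "__main__":
    print(chunk_text("One. Two. Three.", 8))
    print(chunk_text("abcdefghij", 4))
```

With delimiters present, the cuts land on sentence boundaries. With none in the window, the sketch falls back to a hard cut at `chunk_size`, which is the case where the split point may land somewhere semantically awkward.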
There are certainly cases of Greek/Latin without any punctuation at all, typically in a historical context. Chinese & Japanese historically did not have any punctuation whatsoever.
Yes! I'm in the same boat! My guess is that there's something in the backend (image cache/storage size) that presents as a cost/logistical problem instead of as a technical one.