Hacker News | smlacy's comments

Total gimmick. I guess we're "making progress", but this will never lead to any useful application other than "Yes, you're absolutely right" bots. What's needed for real applications is 10,000× the input token context and 10× the output token speed, so we're off by a factor of ... 100,000×?

Correct. Also, as the context grows, the conversations cannot continue at the initial speed either. Gimmick or not, this is very sci-fi compared to 10-20 years ago.

This should be the top comment

And with only like a dozen tokens of context. What happens when this thing gets the ~100k tokens of context needed to actually make it useful?

I like the analogy, but which two is AI coding?

Fast & Cheap (but not Good?) - I wouldn't really say that AI coding is "cheap"

Cheap & Good (but not Fast) - Again, not really "cheap"

Fast & Good (but not Cheap) - This seems like maybe where we're at? Is this a bad place?


The proper idiom is "You can only pick two". It doesn't say that everything is two of them, or even one.

It's hitting all three, right _now_.

Eventually, it will be just Fast and Good. It won't be cheap, as companies start moving towards profitability.

Remember when Uber was super cheap? I do. They're fast and good though.


It's not cheap or good, it's just fast.

It's fast. It's cheap compared to employees. It's really the latter that people are upset about.

As for good. Well, how much software is really good? A lot of it is sewn together APIs and electron-like runtimes and 5,000 dependencies someone else wrote. Not exactly hand-crafted and artisanal.

I'm sure everyone's projects here are the exception, but engineering is always about meeting the design requirements. Either it does or it doesn't.


What's your concern?

Have you ever programmed with AI? It needs a lot of hand holding for even simple things sometimes. Forgets basic input, does all kinds of brain dead stuff it should know not to do.

>"good catch - thanks for pointing that out"


Can you clarify how, at all, that’s relevant to the article?

Both the curl and the SQLite projects have been overburdened by AI bug reports. Unless the Google engineers take great care to review each potential bug for validity, the same fate might apply here. There has been a lot of news about open source projects being stuffed to the brim with low-effort, high-cost merge requests and issues. You just don't see all the work that is caused unless you have to deal with the fallout...

This project has nothing to do with bug reports... it's an opt-in tool for reviewing proposed changes that kernel developers can decide to use (if they find it useful).

Well, if it doesn't find anything it's just a waste of time at best.

Prevention paradox.

I think it's a skill.

That background looks like AI for sure though?


Seems to be missing the Etrog? https://en.wikipedia.org/wiki/Citrus_taxonomy#Citrons

Also, the "click to show search results" is cool but fails for "Arizona Citron" in obvious ways.


There are lots of citruses missing; the ones in the chart are only the ones I could find reliable values for (from the sources at the bottom). I'll add more if I can find other reliable sources. For what it's worth, I think the etrog is basically a pure citron variety.

Yeah, that's definitely an issue. If I get a chance, I'll curate images to add!


Fascinating. Look forward to your update.


I assume the esrog is the primeval citron, but I've noticed that Jewish tradition (which rejects the use of hybrid citrons) allows some surprisingly different citrons in practice, popularly associated with Israel, Morocco, Yemen, Corfu, etc. These differ considerably in e.g. rind thickness.


Apparently it's also known as the Greek citron, but I don't see it under that name either.


I think the etrog is not a hybrid so it would overlap with the citron


I couldn't find the Seville orange, or what Iranians call Narang

Nevermind, they have the "Sevillan Sour Orange" and a few other sour oranges


Also appears to be missing Yuzu and Sudachi


This has been true on most ChromeOS devices for a while!


Not all languages have such well-defined and commonly used delimiters. Is this "English only"?


Which language are you thinking of? Ideally, how would you identify split points in this language?

I suppose we've only tested this with languages that do have delimiters - Hindi, English, Spanish, and French

There are two ways to control the splitting point. First is through delimiters, and the second is by setting chunk size. If you're parsing a language where chunks can't be described by either of those params, then I suppose memchunk wouldn't work. I'd be curious to see what does work though!


There are certainly cases of Greek/Latin without any punctuation at all, typically in a historical context. Chinese & Japanese historically did not have any punctuation whatsoever.


Do the delimiters have to be single bytes? e.g. Japanese full stop (IDEOGRAPHIC FULL STOP) is 3 bytes in UTF-8.


No, delimiters can be multiple bytes. They have to be passed as a pattern.

    // With a multi-byte pattern: "。" (IDEOGRAPHIC FULL STOP, U+3002, 3 bytes in UTF-8)
    let metaspace = "。".as_bytes();
    let chunks: Vec<&[u8]> = chunk(text).pattern(metaspace).prefix().collect();
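For illustration, here is a minimal standard-library sketch of the same idea: splitting a byte slice on a multi-byte delimiter while keeping the delimiter attached to the preceding chunk. The `split_keep_delim` helper is hypothetical and is not memchunk's API; it just shows why byte-wise matching works fine for a 3-byte UTF-8 delimiter.

```rust
// Hypothetical helper (not memchunk): split `text` on a multi-byte
// delimiter, keeping the delimiter bytes at the end of each chunk.
fn split_keep_delim<'a>(text: &'a [u8], delim: &[u8]) -> Vec<&'a [u8]> {
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut i = 0;
    while i + delim.len() <= text.len() {
        if &text[i..i + delim.len()] == delim {
            // Include the delimiter bytes in the chunk that precedes them.
            chunks.push(&text[start..i + delim.len()]);
            i += delim.len();
            start = i;
        } else {
            i += 1;
        }
    }
    if start < text.len() {
        // Trailing text with no final delimiter becomes its own chunk.
        chunks.push(&text[start..]);
    }
    chunks
}

fn main() {
    // "。" (U+3002) encodes to 3 bytes in UTF-8, so matching raw bytes
    // still lands on sentence boundaries.
    let text = "こんにちは。さようなら。".as_bytes();
    let chunks = split_keep_delim(text, "。".as_bytes());
    assert_eq!(chunks.len(), 2);
    assert_eq!(chunks[0], "こんにちは。".as_bytes());
    println!("{} chunks", chunks.len());
}
```

Keeping the delimiter as a suffix (rather than a prefix, as in the snippet above) is just one design choice; either way, no chunk boundary can fall inside a multi-byte UTF-8 sequence as long as the delimiter itself is a complete sequence.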


Yes! I'm in the same boat! My guess is that there's something in the backend (image cache/storage size) that presents as a cost/logistical problem instead of as a technical one.

