Hacker News | bigEnotation's comments

Reasoning models are just wrappers over the base model. It was pretty obvious it wasn't actually reasoning, but rather refining the results using some kind of reasoning-like heuristic. At least that's what I assumed when they were released and you couldn't modify the system prompt.


I don't understand why this comes as a surprise to a lot of people. Underlying all this are sequences of text tokens converted to semantic vectors, with some positional encoding, then run through matrix multiplications to compute the probability of the next token. That probability is a function of all the text corpora the model has previously consumed. You can run this process multiple times, chaining one output into the next one's input, but reasoning as humans know it is unlikely to emerge.
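The chaining described above can be sketched as a toy greedy-decoding loop. This is only an illustration of the control flow, not a real model: the logits are made-up numbers, and a real transformer would produce them from the token sequence.

```go
package main

import (
	"fmt"
	"math"
)

// softmax turns raw scores (logits) into a probability distribution
// over the vocabulary.
func softmax(logits []float64) []float64 {
	max := logits[0]
	for _, l := range logits {
		if l > max {
			max = l
		}
	}
	var sum float64
	probs := make([]float64, len(logits))
	for i, l := range logits {
		probs[i] = math.Exp(l - max)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

// argmax picks the highest-probability token (greedy decoding).
func argmax(probs []float64) int {
	best := 0
	for i, p := range probs {
		if p > probs[best] {
			best = i
		}
	}
	return best
}

func main() {
	tokens := []int{1, 7}              // the prompt so far
	logits := []float64{0.1, 2.3, 0.5} // fake score per vocabulary entry
	next := argmax(softmax(logits))
	tokens = append(tokens, next) // feed the output back in as input
	fmt.Println(tokens)           // [1 7 1]
}
```

The only "loop" in inference is this one: pick a token, append it, run the model again. Everything else is deterministic arithmetic over the token sequence.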


Not just wrappers. Some models are fine-tuned with reasoning traces.



I wonder what the break-even point would be to have just switched to the JVM.


Stripe tried switching lots of code to the JVM and it was a huge disaster. Unless you already have lots of small services with clean interfaces and their own data stores that can be incrementally switched, you spend far more time bending over backwards to keep two systems running (one with some limited parity with the other) in tandem.

The cost of running the system is almost never bottlenecked by the performance of the language itself, but rather the responsibilities of the system and cleanliness of the code. Plus, in the migration, you start incurring the cost of your Ruby code calling your JVM code (and vice versa; your systems are almost certainly not a DAG), which almost certainly has higher overhead than whatever speedup you'd get from running code on the JVM in the first place. And then you're sharing protobufs/thrift files/whatever between different languages and libraries of varying degrees of quality (good luck with those Ruby protobufs!).

Before you know it, writing a little tool to optimize your Ruby garbage collector sounds like a really great idea.


Stripe and Shopify are such polar opposites. Stripe's profit per web request is super high; Shopify has to monetize infrastructure efficiency.


The problem is that "just switching" typically means a full rewrite and has a ton of logistical challenges. Are you gonna hire an entire second team? What happens to feature work on the current system? If you keep going full-throttle on features, you'll never catch up. If you stop developing features for a prolonged time, you are putting your entire business at risk.

Writing new services in Java might help, but still doesn't solve the main issue if your Rails app is a monolith. Breaking up the monolith might be a bad decision and is also super expensive and might even eat lots of the potential performance gains.


Rails on JRuby was a tenable proposition last I looked. Been a while though.


I believe TruffleRuby <https://github.com/oracle/truffleruby> is the state of the art for Ruby on the JVM, and <https://news.ycombinator.com/item?id=33503622> says that Mastodon works on it, so that's one data point. I haven't worked up the emotional energy to try to get GitLab to run on it.


I was in a large org that did this for a while. It was extremely painful to be this far off the beaten path.

When you stay on the well-trodden CRuby path, you benefit from the hordes of others who cleared the landmines before you. With JRuby, not so.


JRuby is far behind CRuby, so there are a lot of gems you can't use. There also aren't many companies investing in JRuby itself. IMHO, CRuby is a safer bet.


Still around, but it's significantly slower than stock Ruby.


To save VM instance costs (not disks)? Probably never.


How could you be a human being and not be small D disgruntled?


I'm pretty gruntled myself.


You’re agreeing with me! I said 4/10! Moderate disgruntlement.


I know it always gets shut down, but I'm comfortable with it being done by the US/CIA[1], especially since Satoshi stopped sending emails once Gavin Andresen met with the CIA and "took over" the project.

I anticipate it'll be declassified at some point, and then really take off, but not before the US exhausts its ability to trace malicious transactions it might be focused on. Also, the US owning a bunch of the original bitcoins is in line with it owning a bunch of the lower IPv4 addresses, which haven't seen much, if any, activity.

[1] https://www.reddit.com/r/CryptoCurrency/comments/mr780k/sato...


Yeah, see GPT-3.5 vs GPT-4 pricing.


I thought that's what GPT-4 was: it uses a boosting algorithm over GPT-3.5?


I think you’re forgetting about the use case where the LLM returns something partially correct to a discerning expert, who is still able to use the response, but does not bother with a message like “btw I had to do X to make your suggestions usable”.


Does Go have something similar to Spring, or is there a framework baked into the language?


You can write service structs which depend on other service structs, similar to Spring, and there are libraries that can generate the wiring code for you.

I've always wired my stuff manually in Go, though, which is useful when I want to optionally load different subsystems based on some config, etc.
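A minimal sketch of that manual wiring, with made-up names (the `UserRepo`/`UserService` layering is illustrative, not from any framework; google/wire is one real library that can generate this kind of constructor plumbing for you):

```go
package main

import "fmt"

// UserRepo is a hypothetical data-access layer.
type UserRepo struct{ dsn string }

func NewUserRepo(dsn string) *UserRepo { return &UserRepo{dsn: dsn} }

// UserService depends on UserRepo, Spring-style, but the dependency
// is just a constructor argument.
type UserService struct{ repo *UserRepo }

func NewUserService(r *UserRepo) *UserService { return &UserService{repo: r} }

func (s *UserService) Greet(name string) string {
	return "hello, " + name
}

func main() {
	// Manual wiring: construct dependencies bottom-up, no container needed.
	// Swapping in a different subsystem based on config is just an if branch
	// around which constructor you call.
	repo := NewUserRepo("postgres://localhost/app")
	svc := NewUserService(repo)
	fmt.Println(svc.Greet("world")) // hello, world
}
```

Because everything is explicit constructor calls, the whole object graph is visible in `main` and trivially testable with fakes.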


I've not had any success making responses deterministic with these settings. I'm even beginning to suspect historic conversations via the API are used to influence future responses, so I'm not sure it'll truly be possible.


The most success I've had for classification purposes so far is using function calling plus a hacky solution: make a new object for each data point you want to classify, in the schema OpenAI wants, with an inner property that statically holds the value. The description of that object is just a generic "choose from these values only: {CATEGORIES}". Putting your value choices in all capital letters seems to lock in that the LLM should not deviate outside those choices.

For my purposes it seems to do quite well, but at the cost of token input to classify single elements in a screenplay, where I'm trying to tell apart various elements in a scene and a script. I'm sending the whole scene text along with the extracted elements (already extracted by regex, thanks to the existing structure, but not yet classified) and asking it to classify each element into a few categories. But then accuracy becomes another question.

For sentence or paragraph analysis that might look like the ugly, horrendous-looking `"{blockOfText}": {type: object, properties: {sentimentAnalysis: {type: string, description: "only choose from {CATEGORIES}"}}}`. Unfortunately not the best-looking approach, but it works.

