The fault tolerance is mostly focused on background radiation flipping bits. We've got half a century of data on the frequency of those upsets and the extent to which they're correlated under different space conditions, not to mention the ability to irradiate prototypes of the flight computer with representative amounts of shielding in ground-based facilities...
Can't wait for the LinkedIn posts about their day to start even earlier than the 4am workout and 5am meditation with strategic dreaming between 1am and 3am.
Sounds like something Magnus Carlsen might say. I hear he's doing quite well out of the game of chess, and pointedly not playing how a computer would play, even though Deep Blue is clearly capable of winning more than he is and from more difficult positions.
Also, the world isn't as trivially solved by computation as a game of chess, so maybe delegating your job or how to be a better human to ChatGPT isn't as much of a winning strategy as getting the computer to suggest chess moves.
They investigated an open source application specifically advertising carb counting capabilities, replicated its prompts and API calls in a way optimised to collect data from 26000 queries (which is a lot to do using a GUI!). They also note other people have already done [necessarily] smaller scale studies of the commercial AI carb counting apps and been similarly unimpressed by the responses.
This is all in the first few paragraphs of a preprint paper describing the research in considerably more detail, which is linked at the bottom of TFA.
Meta: enjoying nearly half this HN thread being arguments that surely people who care about what's in their food don't ask ChatGPT for comment instead of looking it up properly, and most of the rest of it being people who apparently care what's in a research paper asking HN for comment instead of looking it up :)
The commercial services likely also have frontier model dependencies...
The opening to the actual paper is quite explicit that (i) other studies have already tested commercial apps with unimpressive results and (ii) a popular open source app for carb counting directly relies on API calls to these frontier models, and this research batch tested the images using the exact same models and prompts as the popular open source app.
A carb counting app might use API calls to these frontier models and then do some kind of analysis. It could see if different models agree or not, or multiple calls, and with how much variance.
So it would be more accurate to test the apps rather than the APIs, unless the goal is to warn people who just open ChatGPT and ask there.
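For illustration, the kind of agreement check described above might look something like this. It is only a sketch: the model names, the sample numbers and the 15% threshold are made up, and nothing here is taken from any real app.

```python
from statistics import mean, stdev

def summarise(estimates_by_model, max_cv=0.15):
    """Summarise repeated carb estimates (grams) per model.

    estimates_by_model maps a model name to a list of at least two
    estimates for the same photo. max_cv is an arbitrary, assumed
    threshold on relative spread.
    """
    summary = {}
    for model, values in estimates_by_model.items():
        m = mean(values)
        # Coefficient of variation: run-to-run spread relative to the mean.
        cv = stdev(values) / m if m else float("inf")
        summary[model] = {"mean": m, "cv": cv}

    # Relative gap between the highest and lowest per-model mean.
    means = [s["mean"] for s in summary.values()]
    overall = mean(means)
    spread = (max(means) - min(means)) / overall if overall else float("inf")

    # Only trust the answer if every model is stable and the models agree.
    consistent = spread <= max_cv and all(
        s["cv"] <= max_cv for s in summary.values()
    )
    return summary, spread, consistent


# Made-up numbers purely to exercise the function.
summary, spread, consistent = summarise(
    {"model_a": [40, 42, 41], "model_b": [60, 30, 45]}
)
print(consistent)
```

An app doing this could decline to show a number (or show a range) when `consistent` is false, which is a different product from one that relays a single API response.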
The open source app could in theory do that, but the paper's authors would be able to determine whether it did or not by reading its code, which they evidently did to replicate the API calls it made with their own script.
(And of course it would also be far more tedious to submit each picture 500 times manually using an app and manually log the response than using a script which is designed to collect the data automatically as fast as API rate limits permit)
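A rough sketch of what such a collection script looks like, to show why it beats tapping through a GUI. This is not the paper's script: `fake_estimate_carbs` is a hypothetical stand-in for the real API call, and rate limiting and retries are left out.

```python
import csv
import io
import random

def fake_estimate_carbs(image_name, rng):
    # Hypothetical stand-in for a model API call; returns a noisy
    # number so the loop has something to log.
    return round(rng.gauss(45.0, 8.0), 1)

def collect(images, repeats, estimate, out):
    """Query every image `repeats` times and log each response as CSV."""
    writer = csv.writer(out)
    writer.writerow(["image", "run", "carbs_g"])
    rows = 0
    for image in images:
        for run in range(repeats):
            writer.writerow([image, run, estimate(image)])
            rows += 1
    return rows


rng = random.Random(0)  # seeded so the fake data is repeatable
buf = io.StringIO()     # swap for an open file to keep the log
n = collect(["meal_a.jpg", "meal_b.jpg"], 5,
            lambda img: fake_estimate_carbs(img, rng), buf)
print(n, "responses logged")
```

Scaling `repeats` to 500 is a one-character change here; doing the same by hand in an app means 500 uploads and 500 manually transcribed answers per photo.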
Is your roadmap more "Satsearch for broader range of spacecraft mass classes" or "generate initial mission and spacecraft architecture from simple input parameters"?
Thanks! The goal is for it to be primarily an architecture tool, although right now it leans more towards SatSearch as I think that will get eyes faster and because there needs to be a decently robust component library for the architecture tools to work. The idea is that you can set your primary mission parameters and CONOPs, and then get immediate feedback on spacecraft performance metrics as you trade/change hardware in the Master Equipment List. Maybe the best way to think of it is as "PCPartPicker for spacecraft".
For example if you add/change a radio transmitter, you can see in real-time how that changes your system mass, power, and link margins, or be alerted if your flight computer doesn't support your radio's data interface.
This (hopefully) lets a spacecraft systems engineer iterate through trades more quickly, track performance and margin evolution over a program lifecycle, or quickly develop a baseline for a given mission class.
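As a toy illustration of the radio example above (not how the tool itself is implemented), a mass/power roll-up and an interface check over an equipment list could be sketched like this. The `Component` fields, part names and numbers are all hypothetical, and link margin is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    mass_kg: float
    power_w: float
    interfaces: set  # data interfaces the part supports, e.g. {"UART"}

def rollup(components):
    """Total mass and power for an equipment list."""
    mass = sum(c.mass_kg for c in components)
    power = sum(c.power_w for c in components)
    return mass, power

def incompatible(flight_computer, components):
    """Names of parts sharing no data interface with the flight computer."""
    return [
        c.name
        for c in components
        if not (c.interfaces & flight_computer.interfaces)
    ]


# Made-up parts purely for illustration.
fc = Component("flight_computer", 0.5, 5.0, {"SPI", "UART"})
radio_a = Component("radio_a", 0.2, 8.0, {"UART"})
radio_b = Component("radio_b", 0.3, 12.0, {"SpaceWire"})

print(rollup([fc, radio_a]), incompatible(fc, [radio_a]))
print(rollup([fc, radio_b]), incompatible(fc, [radio_b]))
```

Swapping `radio_a` for `radio_b` changes the totals and flags the interface mismatch in one pass, which is the sort of immediate feedback described above.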
Definitely quite a bit of work to go to get there but feel free to create an account and poke around and break things.
yeah, always struck me as odd that Thiel is more obsessed with identifying candidate antichrists than almost anyone else on the planet, including some people who are actually observant Christians, and yet it doesn't seem to have occurred to him that the most messianic secular figures who treat themselves as above mere laws and the guys making millenarian prophecies about the scale of what they're going to deliver are basically the guys in his rolodex...
It's just the same boring dynamic whereby every accusation is a confession. Come out swinging, and then the obvious parallels between the antichrist and Trump or even Thiel himself fall flat. Basically "no yuo"