
I personally feel that Forth is often overlooked as a solution. It’s great for low-level embedded work… even on complicated x86 hardware. I also think people shy away because the tooling is thin and often DIY, but a Forth exokernel plus a single-purpose app can squeeze more from the hardware.

Tethered Forth programming on small devices is an underrated opportunity, IMO. There is also an opportunity to revisit the OpenBoot project; I remember the days of automating deployments at the BIOS level. It was an amazing tool for mass deployments of Sun T-series systems in telcos.


Not sure if you are aware of this, but LUNA does this already.

https://www.usenix.org/system/files/atc23-zhu-lingjun.pdf


My take on why Google bought Wiz is pretty straightforward. First off, Wiz brings a rock-solid CRM loaded with all those juicy contracts from the top cloud players. Add to that a proven enterprise team that knows exactly how to sell the product, and whom to sell it to, and you’ve got a recipe for success. Every Wiz win is a possible upsell for GCP, especially when GCP isn’t even the market leader in cloud. IMO, it opens the door to a whole lot of sales opportunities and deep-rooted relationships with top-tier cloud customers. To me, that all points to a pretty hefty price tag on the table.


First and foremost, I have no affiliation with any of the authors previously mentioned. However, I would like to pose a question to the community:

Is it feasible to exploit these undocumented HCI commands to develop malicious firmware for the ESP32? Such firmware could potentially be designed to respond to over-the-air (OTA) signals, activating these hidden commands to perform unauthorized actions like memory manipulation or device impersonation.

However, considering that deploying malicious firmware already implies a significant level of system compromise, how does this scenario differ from traditional malware attacks targeting x86 architectures to gain low-level access to servers?


It is feasible to develop malicious firmware for the ESP32 even without these HCI commands. The existence of these undocumented commands doesn't change anything.


As the article states: "These undocumented HCI commands cannot be triggered by Bluetooth, radio signals, or over the Internet, unless there is a vulnerability in the application itself or the radio protocols." Hence I don't think there is any security risk here, assuming the application and radio stack are safe.

It differs in that the attacker must have physical access to the device to flash firmware, I believe. On x86, as you describe, the attacker could mount the attack over a connection to the device/machine.


I agree, hence my comment specifically about malicious firmware… For me, the open question is: can one still write malicious firmware for the ESP32 without the undocumented opcodes?


Yes. You can write whatever malicious firmware you like for hardware you have physical access to, with or without the undocumented opcodes. Not OTA though, unless there's a bug in the radio stack. It's not an open question.


HCI is an interface for the low-level parts of the Bluetooth stack to exchange information with the higher levels. If you assume that the higher-level code is malicious, an OTA vulnerability is straightforward.
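For those not familiar: a wired HCI command is just a small framed packet, and vendor-specific commands live under OGF 0x3F. A rough sketch in C of what building one looks like (the OCF value and parameter layout are up to the caller; nothing here is the actual ESP32 opcode set):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Build a raw HCI command packet for the wired (UART/H4) transport.
     * Opcode = (OGF << 10) | OCF; vendor-specific commands use OGF 0x3F.
     * Only code that already sits on the host side of the HCI link can
     * send this -- it is not reachable over the air by itself. */
    static size_t build_vendor_cmd(uint8_t *buf, uint16_t ocf,
                                   const uint8_t *params, uint8_t len)
    {
        uint16_t opcode = (uint16_t)((0x3F << 10) | (ocf & 0x03FF));
        buf[0] = 0x01;                        /* H4 packet type: HCI command */
        buf[1] = (uint8_t)(opcode & 0xFF);    /* opcode, little-endian       */
        buf[2] = (uint8_t)(opcode >> 8);
        buf[3] = len;                         /* parameter length            */
        memcpy(&buf[4], params, len);
        return 4u + len;
    }

Sending it is just writing those bytes to the HCI UART, which is exactly the kind of local access the article says an attacker would already need.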


What would be the purpose of such firmware? The ESP32 is a complete SoC: the “firmware”, “OS”, and “application” are all the same binary.

So yes, you could write malicious “firmware” without using undocumented commands. But what would be the point? Said firmware already has complete execution privileges on the device, with the ability to read any memory it wants, by virtue of being literally all the software running on the device and owning all of the memory.
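To make that concrete, here's a minimal sketch: on a single-binary SoC, any code in the firmware can read whatever mapped address it likes with an ordinary load, no undocumented command required (the address below is a made-up placeholder, not a documented ESP32 register):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical mapped address -- placeholder, not a real register. */
        volatile uint32_t *addr = (volatile uint32_t *)0x40000000u;

        /* With no MMU/OS isolation between "app" and "system" code, this
         * plain load is all it takes to read arbitrary memory. */
        uint32_t value = *addr;

        printf("read 0x%08lx\n", (unsigned long)value);
        return 0;
    }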


This would require you to have root access to the thing, at that point you might as well write literally any code you like and not bother with the HCI commands.


Just wait until these security researchers learn about the LDR instruction that malicious software could use to read any memory.


No.

It is literally just a debug port exposed over the wired HCI interface.

This gives you absolutely nothing at all that you can't get with a normal UART debug port or JTAG. Everything in the HCI commands already exists in the normal bootloader. If you can get a device into bootloader mode, you can peek and poke flash and memory, along with everything else.

There is absolutely nothing here.

You can create malicious firmware, sure, but it has nothing to do with this HCI thing.


It was feasible even without these commands. If you already had code execution on the host, then you could've already done what you wanted with the device.


If an LLM’s logic is derived primarily from its training phase… essentially, by following patterns it has previously seen… doesn’t that underscore the critical role of training? We invest significantly in reinforcement learning and subsequent processes, so if the paper’s claim is accurate, perhaps we need to explore innovative approaches during the training phase.


Correct, I updated the title of the original paper. Thank you for bringing it up.


When a language model is trained for chain-of-thought reasoning, particularly on datasets with a limited number of sequence variations, it may end up memorizing predetermined step patterns that seem effective but don’t reflect true logical understanding. Rather than deriving each step logically from the previous ones and the given premises, the model might simply follow a “recipe” it learned from the training data. As a result, this adherence to learned patterns can overshadow genuine logical relationships, causing the model to rely on familiar sequences instead of understanding why one step logically follows from another.

In other words, language models are advanced pattern recognizers that mimic logical reasoning without genuinely understanding the underlying logic.

We might need to shift our focus to the training phase for better performance?


> As a result, this adherence to learned patterns can overshadow genuine logical relationships, causing the model to rely on familiar sequences instead of understanding why one step logically follows from another.

To be honest, even humans rarely get above this level of understanding for many tasks. I don't think most people really understand math above the level of following the recipes they learned by rote in school.

Or beyond following the runbook in their IT department's documentation system.

And when the recipe doesn't work, they are helpless to figure out why.


> instead of understanding why one step logically follows from another

There’s currently 0% chance of “understanding” happening at any point with this technology.


I mostly agree, but struggle with saying this with perfect certainty.

Understanding in the "have mental model of the world, apply it, derive thoughts from that model, derive words from thoughts" pattern is a thing they don't do the way we do.

But understanding of some kinds CAN be encoded into tokens and their relationships. They're clearly capable of novel, correct inferences that are not directly contained within their training sets.

I all-but-guarantee my "My fish suffocated when I brought it to space, even though I gave it a space suit filled with pure water, why?" test case is not something it was explicitly trained on, but it correctly inferred "Because fish need oxygenated water"


How do we define understanding?


There are many ways to define it. Taking, for example, the "definition" from Wikipedia here[0], you could say that LLMs are understanding, in a distilled form, because relationships are precisely what they're made of.

--

[0] - https://en.wikipedia.org/wiki/Understanding#Definition - though this feels more like vague musings than a definition proposal.


That claim sounds like a quote from a paper, but it's not from the currently linked paper. The paper itself seems more like an antidote to the problem and does seem to roughly assume the claim.

I like the claim and I'd guess it's true but this seems like a weird way to introduce it.


Isn't that what the study you linked to roughly proposes?


But John Carmack promised me AGI....


I haven't kept up with his tweets, but I got the impression he deliberately chose to not get involved in LLM hype in his own AI research?


This is a fascinating study: it reveals that ant groups significantly improve their problem-solving abilities through effective cooperation, whereas human groups do not show similar enhancements and can even perform worse when communication is limited. This difference is attributed to ants’ simple cognitive structures, which facilitate seamless collaboration, while humans’ complex cognition leads to variations that hinder efficient group performance. It seems that the advantages of collective cognition depend on the underlying cognitive and cooperative mechanisms of the species… I found it a great read.


Ditto on the upvote; the aesthetics are pleasing and I love the transition of the board. Not sure if that was sand or not… but now I’m hunting for the files… it would make a nice gift for a friend who has it all, rated 2000+ in chess.


When I read these articles, I always ask myself whether this is more of a joint OS-ISA issue than just an ISA problem.

Wondering whether a well-defined OS, with strict enforcement of memory boundaries at both the OS level and the application level, where the application sits in a well-defined deterministic execution model, would mitigate some of these unpredictable state transitions.

If one considers a minimalist OS, a microkernel for example, lowering the attack surface, would this not explicitly prevent access to certain microarchitectural states (e.g., by disallowing certain instructions like clflush, or speculative paths)? This could be accomplished with strict memory management jointly at the OS layer and in the binary structure of the application… one where the binary has a well-defined memory boundary, and the OS just ensures it is kept within these limits.


> well defined deterministic execution model would mitigate some of these unpredictable state transitions.

The problem here is that giving a program access to high-resolution (non-virtualized) timers violates deterministic execution. Even without a high-resolution timer, the non-determinism inherent in shared memory parallelism can be exploited to make high-resolution timers. In short, one can use a counter thread to make a very high precision timer.

With high-resolution timers, the timing domain becomes both a potential covert channel and a universal side-channel to spy on other concurrent computations.
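For reference, the counter-thread trick is only a few lines; a sketch using pthreads and C11 atomics (the measured operation is left as a placeholder):

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdio.h>

    static _Atomic uint64_t ticks;     /* the improvised "clock" */
    static _Atomic int running = 1;

    /* Spin incrementing a shared counter; its value then acts as a
     * high-resolution timestamp for any other thread that reads it. */
    static void *counter_thread(void *arg)
    {
        (void)arg;
        while (atomic_load_explicit(&running, memory_order_relaxed))
            atomic_fetch_add_explicit(&ticks, 1, memory_order_relaxed);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, counter_thread, NULL);

        uint64_t start = atomic_load(&ticks);
        /* ... operation whose duration we want to measure ... */
        uint64_t end = atomic_load(&ticks);

        printf("elapsed: %llu ticks\n", (unsigned long long)(end - start));

        atomic_store(&running, 0);
        pthread_join(t, NULL);
        return 0;
    }

Even if the OS only exposes coarse timers, the resolution of this counter is limited only by how fast the spinning core can increment it.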


Good point, but still, you are leaving the user with too much leverage over the underlying architecture, again from the OS’s perspective.

The way I’m considering this is: one could provide virtual time sources, removing the high-resolution timers, so the OS exposes a more coarse-grained timer. Not sure of the implications, but if needed, one could add jitter or randomness (noise) to the virtual timer values…

This would further prevent a thread from running out of sync with the rest of the threads.

Further, one could also add a stack-based shared memory model; LIFO would provide highly predictable behavior from an application perspective. If you make it per process, the shared stack would then be confined to the application. Not sure if this is possible (I haven’t given it deep thought), but the stacks could be confined to specific cache lines, removing the timing differences caused by cache contention…
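Something along these lines is what I have in mind for the fuzzed virtual timer (just a sketch; the granularity and jitter values are arbitrary, and a real implementation would want a better noise source than rand()):

    #include <stdint.h>
    #include <stdlib.h>
    #include <time.h>

    #define GRANULARITY_NS 100000ull   /* 100 microseconds, arbitrary */
    #define MAX_JITTER_NS   50000      /*  50 microseconds, arbitrary */

    /* Return a coarse, noisy timestamp: the real clock rounded down to a
     * coarse granularity plus random jitter, hiding fine-grained timing
     * differences (e.g. cache hit vs. miss) from the caller. */
    static uint64_t fuzzy_time_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        uint64_t now = (uint64_t)ts.tv_sec * 1000000000ull
                     + (uint64_t)ts.tv_nsec;

        uint64_t coarse = (now / GRANULARITY_NS) * GRANULARITY_NS;
        uint64_t jitter = (uint64_t)(rand() % MAX_JITTER_NS);
        return coarse + jitter;
    }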

