
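Special instructions won't help latency. There are prefetch instructions, but they basically just issue loads early. A load that misses on x64 pulls in a whole 64-byte cache line at minimum, and on Intel the adjacent-line prefetcher usually fetches the neighboring line too, so you effectively get 128 bytes (two cache lines) from memory per miss.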
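To make "prefetch instructions basically just do loads" concrete, here is a rough sketch of a software prefetch hint in a gather loop. The function names and the `distance` parameter are made up for illustration, and `__builtin_prefetch` is a GCC/Clang builtin (an assumption about your compiler; other compilers simply skip the hint here). The hint never changes the result, only when the cache line gets requested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline: sum values[indices[i]] with no hints at all.
std::uint64_t gather_sum_plain(const std::vector<std::uint32_t>& values,
                               const std::vector<std::size_t>& indices) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        assert(indices[i] < values.size());
        total += values[indices[i]];
    }
    return total;
}

// Same sum, but ask for the element needed `distance` iterations ahead.
// `distance` is a hypothetical tuning knob; a good value depends on the machine.
std::uint64_t gather_sum_prefetch(const std::vector<std::uint32_t>& values,
                                  const std::vector<std::size_t>& indices,
                                  std::size_t distance = 8) {
    std::uint64_t total = 0;
    const std::size_t n = indices.size();
    for (std::size_t i = 0; i < n; ++i) {
        assert(indices[i] < values.size());
#if defined(__GNUC__) || defined(__clang__)
        if (i + distance < n) {
            // Arguments: address, 0 = prefetch for read, 3 = keep in all cache levels.
            __builtin_prefetch(values.data() + indices[i + distance], 0, 3);
        }
#endif
        total += values[indices[i]];
    }
    return total;
}
```

Whether the second version is any faster depends entirely on the access pattern and the hardware, so benchmark both before keeping the hint.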

The Cell in Sony's PS3 did have explicitly managed memory (each SPE had a small local store filled by DMA rather than a hardware cache, as I recall), and it was notoriously difficult to program for.

The real reason cache isn't handled explicitly, though, is that it isn't necessary. You can get good performance and cache usage at the C++ level; you just have to know how the CPU works and access memory linearly so it can be prefetched. I've tried using prefetch instructions, and beating what the CPU's out-of-order execution and hardware prefetchers already do is actually very difficult.
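As a minimal sketch of what "access memory linearly" means in practice, here are two ways to sum the same row-major matrix. The function names are hypothetical and the matrix is assumed to be stored as one flat `std::vector`. Both return the same total; only the order of memory accesses differs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear walk: consecutive iterations touch consecutive addresses, so every
// byte of each fetched cache line is used and the prefetcher can run ahead.
std::uint64_t sum_row_order(const std::vector<std::uint32_t>& m,
                            std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    std::uint64_t total = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r * cols + c];
        }
    }
    return total;
}

// Strided walk: each step jumps `cols` elements ahead, so for wide matrices
// nearly every access lands on a different cache line.
std::uint64_t sum_column_order(const std::vector<std::uint32_t>& m,
                               std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    std::uint64_t total = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            total += m[r * cols + c];
        }
    }
    return total;
}
```

On matrices much larger than the cache, the row-order version is usually the faster one, but the size of the gap varies by CPU, so time it yourself.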


