Interesting, and the author is clearly an expert on the topic. However, I'm curious how this fits into a sensible program architecture. When do you need such a queue? It seems quite unlikely that production and consumption operate at the exact same rate and with the same CPU usage, such that the queue runs in a steady state of neither full nor empty. It seems much more likely that the queue will just be either full or empty all the time, in which case this algorithm will be terrible, because it busy-waits instead of using condition variables that can wake the consumers or block the producers. By the way, blocking the producers is also known as flow control, which many people believe is a desirable property of software.
If your queue is normally empty, it would be fine for the producing threads to just jump straight into the consuming routine and never mind the queue. If the queue is normally full, it's fine for them to just take a lock and sleep on a condition variable.
Normally what you want a p/c queue for is to paper over short-term variance in a system where, on long time scales, the consumer side is faster than the producer side. In that role, an ordinary queue with a mutex will work fine. If the consumer is stalled for long enough to fill the queue, blocking the producer is exactly what you want.
Of course, some languages favor the use of this pattern. I'm thinking of Go channels. Totally lock-free Go channels would be pretty awesome, but busy-waiting in select would not be awesome.
This is a good question. We've currently integrated the queue into two projects in production: a user-space proxy server and a VPN capturer (a Linux kernel module).
In both cases we enhanced the queue with checks such as whether it is empty or full, so we can drop packets when the consumers are overloaded and there is no sense in putting a packet into the queue.
Secondly, the queue is designed for multi-core environments, which on modern hardware usually means multi-node NUMA systems. So we adjusted both applications so that an administrator can assign a different number of cores to consumers and producers depending on the current workload. This keeps the system balanced, so the queue is rarely empty or full.
And finally, for kernel space we also implemented a lightweight, likewise lock-less, condition wait for the queue (http://natsys-lab.blogspot.ru/2013/08/lock-free-condition-wa...). It also gives us lower CPU consumption when there is no workload (so the system uses less power) and even better performance due to reduced cache bouncing.
…in which case this algorithm will be terrible because it busy-waits instead of having condition variables…
I'm seeing sched_yield() calls in there. It looks like a blocked process will yield its CPU core to available, productive work.
If there isn't enough work to keep all the CPU cores busy, then it will spin around asking "Am I ready?" more often than required, but at that point you have a machine under less than full load and it doesn't really matter (power use aside, and assuming the spinning threads stay in cache).