Distributed Named Pipes

techdragon · on Dec 12, 2016

This is an excellent idea... but I'm a little put off by the reference implementation using DC/OS and Kafka... Its a little bit 'heavy' for a reference implementation, at least that's my personal preferences.

Id like to see a more 'low tech' version, or in the very least, a plan to transition from a 'heavy' reference implementation to a 'lighter' one once one became available since having a concise reference implementation makes porting and compatibility significantly easier.

bjt · on Dec 12, 2016

Yes.

> To try it out yourself, you first need a to install a DC/OS cluster and then Apache Kafka...

When I see that, I know the cost of satisfying my curiosity is going to be a lot higher than I'm willing to pay right now.

mhausenblas · on Dec 12, 2016

Fair point, I suppose. Is https://dcos.io/docs/1.8/administration/installing/local/ an option?

mhausenblas · on Dec 12, 2016

Author here. Thanks for the feedback and I agree, a lighter ref impl would be ideal (it's just that I happen to use DC/OS on a daily basis so for me this was the most natural thing to do)

I'm all open for GitHub issues that capture this request and/or PR ;)

MaBu · on Dec 12, 2016

If you want simple named distributed pipes there is https://github.com/lukasmartinelli/pipecat . Which is basically cat over AMQP.

jalfresi · on Dec 12, 2016

Agreed. I was expecting a low-fi example to point a dnpipe e.g.

$ mkdnpipe /tmp/a.fifo 192.168.1.10:123

With the target host running a dnpipe daemon.

icebraining · on Dec 12, 2016

Of course, one can already do that with socat, the multipurpose relay:

    socat PIPE:myfifo TCP-LISTEN:7766

    socat PIPE:myfifo2 TCP:192.0.2.1:7766

Tepix · on Dec 12, 2016

socat is great. You can also use OPENSSL-LISTEN and OPENSSL to replace TCP-LISTEN and TCP to get an encrypted SSL connection.

jalfresi · on Dec 12, 2016

Interesting! Thanks for the tip

equalunique · on Dec 12, 2016

Agreed - personally would like to see either Arch, a BSD, or NixOS.

brongondwana · on Dec 12, 2016

I'm put off by the domain name :p

dln_eintr · on Dec 12, 2016

The name doesn't really work in this project's favor.

UNIX pipes are stream interfaces, whereas this looks to be message based - that's a fundamental difference.

Named pipes are uniquely identified in a well-defined local namespace, i.e the filesystem, whereas this seems to be an abstraction on top of Kafka with service discovery TBD.

This confused me while reading up on the project.

codys · on Dec 12, 2016

AF_UNIX on linux supports SOCK_DGRAM and SOCK_SEQPACKET in addition to SOCK_STREAM, so having message based interfaces won't be much of a change for some users.

SOCK_DGRAM at least appears to be widely supported (FreeBSD & Illumos & OpenBSD). SOCK_SEQPACKET is at least supported by OpenBSD (I'm going by what's in various docs/man pages).

cle · on Dec 12, 2016

How is it a fundamental difference? Unix streams are character messages, so this is basically a superset of that functionality.

(I don't know much about Unix streams, so correct me if I'm wrong.)

michaelmior · on Dec 12, 2016

Generally when a protocol is referred to as message-based it's because there's some framing around the payload. This can make it impractical as compared to a streaming protocol that just passes data back in forth. In some real-time cases, you can't tolerate the latency of waiting to bundle multiple items in a single payload but you also can't tolerate the overhead of framing many small messages.

linsomniac · on Dec 12, 2016

The most interesting thing I got from this was the DC/OS link, I hadn't heard of it before but I think I'll look at it to see how it might fit with our future direction with/instead of Kubernetes. https://dcos.io/

tlrobinson · on Dec 12, 2016

So... a message queue?

rohan_ · on Dec 12, 2016

I'm just as confused. What does this have to do with named pipes?

pdkl95 · on Dec 12, 2016

http://beej.us/guide/bgipc/output/html/singlepage/bgipc.html...

"A message queue works kind of like a FIFO but supports some additional functionality."

rohan_ · on Dec 12, 2016

So it's a bare-bones message queue? What's the advantage then? Performance?

jsjohnst · on Dec 12, 2016

Apologies, but what does this provide that Kafka doesn't already?

Tepix · on Dec 12, 2016

It uses the file paradigm just like local named pipes. That makes this tech accessible to pretty much every UNIX command.

Animats · on Dec 12, 2016

Not really. Each message has to be an atomic item, not a stream of bytes, since there can be multiple readers consuming a single queue.

The interface specification has problems. "A pull does not remove a message from a dnpipes, it merely delivers its content to the consumer." So if the same consumer pulls the same dnpipe again, does it get the same message? Do messages ever get removed from dnpipes without a reset? Unclear.

Does pull block, support async completions, or just return an error when no data is available?

Reset, rather than sending an EOF which passes through the queue, implies that shutdown is either drastic or requires external coordination to empty the queue and stop the sending end before the reset.

mhausenblas · on Dec 12, 2016

> ... if the same consumer pulls the same dnpipe again, does it get the same message

No. Each message is delivered at most once. But good point, need to make that clearer!

> Do messages ever get removed from dnpipes without a reset.

No. Again, something I need to clarify as it seems.

> Does pull block, support async completions, or just return an error when no data is available?

It blocks.

> Reset, rather than sending an EOF which passes through the queue, implies that shutdown is either drastic or requires external coordination to empty the queue and stop the sending end before the reset.

I don't follow. Reset empties the underlying queue.

In general, since the consumers start to consume not from the beginning of time (as in Kafka's `--from-beginning`) but wherever they happen to be (that is, in Kafka terminology from `latest` index) this shouldn't be a problem.

I tried to model as close as possible and as it makes sense after the semantics of (local) named pipes. I might have fudged up here but I'm not 100% clear on where :)

Animats · on Dec 12, 2016

> Do messages ever get removed from dnpipes without a reset.

No. Again, something I need to clarify as it seems.

So what happens after the system has been running for a while? Why doesn't the dnpipe system fill up with old messages that will never be read again? If new subscribers don't see them, and old subscribers have read them, why are they not removed? It would seem that once every subscriber who subscribed before a message was sent has received that message, the message is dead and can be removed. Why keep the history? Did you really mean that?

Also, what happens if one of many subscribers stops making PULL requests? Maybe it's blocked on something, or hung. Do the queues start to build up? Does PUSH eventually block?

> Reset, rather than sending an EOF which passes through the queue, implies that shutdown is either drastic or requires external coordination to empty the queue and stop the sending end before the reset.

I don't follow. Reset empties the underlying queue.

On most queuing systems, when a publisher wants to shut down, they close the channel's sending end or send and EOF message. When all subscribers have read up to the EOF, they close their receiving end. The last messages get processed, and then the subscribers stop. If the only shutdown mechanism is a reset, that can lose messages not yet received. That's OK if the intent is just "kill everything and terminate", but not if you need a clean shutdown.

(ROS, the Robot Operating System, has a publish/subscribe system something like this. It's a soft real time system, so old data is discarded and you never want to block a publisher. They chose to lose messages if a subscriber isn't reading often enough. That's appropriate to a robotics use case. It probably wouldn't be for a containerized web backend. See [1].)

[1] https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_patt...

jsjohnst · on Dec 13, 2016

Exactly my thought when reading his reply!

mbrock · on Dec 12, 2016

The example doesn't show any files, just stdout/stdin based interaction.

You could make actual named pipes and connect them to the "dnpipes" tool, I guess...

pram · on Dec 12, 2016

I don't get it either, your example is indistinguishable from a normal Kafka producer/consumer ... ?

mhausenblas · on Dec 12, 2016

Define 'normal' :)

Joking aside: since I've used Kafka for the reference implementation it naturally looks very similar, yes but it's really about the semantics. For example, does your 'normal' Kafka consumer start from the beginning or use latest index, etc.

pram · on Dec 12, 2016

The 'normal' Kafka consumer starts from the last offset