
> not when kubernetes is sitting right there.

Enjoy spending the rest of your life trying to get etcd to cooperate. If you think operating Kubernetes at scale is a cakewalk, you don't have the scale problems you think you do.

I'll take consul over etcd ten million times out of ten.



Realistically you're either deploying consul on top of 1-N kubernetes clusters, or Nomad.

If deploying on kubernetes, you now have all the problems that come with kubernetes, plus additional problems of trying to get consul working.

I spent a week just trying to stand up a federated multi-datacenter deployment of consul on EKS before my company decided it was too much hassle.


The OP was suggesting that it's just obvious to use Kubernetes instead of Nomad.

I was saying that anyone who operates large scale Kubernetes knows that you will forever be dealing with tuning etcd and fighting to keep etcd alive. It's an underpinning service of Kubernetes.

Roblox's outage was related to the intricacies of running consul and mistakes that they made.

The point I was making was that I would rather, at this scale of operation, be running Nomad and optionally Consul and optionally dealing with the intricacies of Consul than running Kubernetes and being _forced_ to deal with what a miserable pain in the ass etcd is.

I was saying that running Kubernetes is _not_ the obvious choice -- at least once you're at 10^5+ systems.


Once you have 3 days of downtime and end up on a special build of consul that nobody else has, while nearly decimating your company's stock price, reputation, and employee morale, the alternatives start looking better. The point I'm making is that it doesn't handle their scale today and that projects with larger communities, support, and usage exist today. No one said k8s was a silver bullet; it's just an alternative to an already failing infrastructure.


If you truly read and understood the Roblox post-mortem you would understand that the problems that they had with Consul were partly and unintentionally self-inflicted and partly due to BoltDB. The "special build" was just early access to Hashi's already-ongoing work to replace BoltDB with bbolt, which has long since shipped.

I'm just going to ignore the hyperbole in the description of the harm done to Roblox here. Roblox is still popular. The company still has the reputation of a rock-solid engineering department. Roblox's stock isn't doing much differently than the rest of the technology companies on the market, and the "damage" you mention is vastly overstated anyway. In fact, Roblox's stock literally had a huge rally after the outage and PEAKED within two weeks of it.


I was interested in Nomad in the past, but Kubernetes is open source, so it has that going for it.


Sad but true.



