Hacker News | robotadam's comments

Opening the site in Firefox avoided the redirect to login issue for me.


I agree with you; a lot of the reason it took us a while to get out of EC2 was my lack of comfort. If I had pushed that earlier many things would have been easier. There are so many great things about EC2, but it pushes the need to think about scaling earlier in many cases, which means that you're thinking about scaling when you still need to be concentrating 100% on product/market fit. We were very fortunate that the first product we built fit the market well, but if we had been wrong the delay could have cost us.


I definitely could have been clearer on that. Cassandra has a lot of great properties, but we made the decision to use Postgres for the large dataset in question shortly after 0.7 was released, and it took a while for Cassandra to stabilize.


The Heroku guys are doing some seriously awesome things with Postgres. I can't wait to see where they go with it; the whole community will benefit, I'm sure.


We make heavy use of queue and internal messaging systems. This discussion was solely about the data storage layers, which always cause us more headaches than the messaging/queuing systems.


You should check out redis if you haven't yet.

It's ideal for the counter use-case, and with a bit of smart thinking you can fit a surprising amount of data into a 32G or 64G machine (and then you can always shard).
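To make the counter-plus-sharding idea concrete, here's a hypothetical Python sketch. A plain dict stands in for each Redis node so it runs standalone; with a real client you'd call `r.incr(key)` / `INCRBY` on the chosen node instead:

```python
import hashlib

# Sketch only: sharding counter keys across several "Redis" nodes.
# Each dict stands in for one node; a real setup would hold one
# redis client per shard and call incr() on it.

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str) -> dict:
    """Pick a shard deterministically from a hash of the key,
    so the same counter always lands on the same node."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return shards[h % NUM_SHARDS]

def incr(key: str, amount: int = 1) -> int:
    """Equivalent of Redis INCRBY against the owning shard."""
    shard = shard_for(key)
    shard[key] = shard.get(key, 0) + amount
    return shard[key]

for _ in range(3):
    incr("pageviews:home")
incr("pageviews:about")
print(incr("pageviews:home"))  # 4
```

Since the shard choice is a pure function of the key, adding capacity is just a matter of re-hashing (or, more carefully, consistent hashing).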


Okay, that makes more sense. It just sounds like there's a lot of touching going on. So, what % of your notifications require synchronous db writes?


Sorry about that -- I couldn't speak as much about our HBase experience, but I can speak to the toll it's taken on our ops team so far, which has been very high, with lots of crashes.


To clarify, we've had lots of region server crashes, mostly due to our own data model and generally not as a result of any intrinsic fault of HBase. To my knowledge, not many of these (any?) have actually resulted in a total HBase failure. The system generally degrades as it should.


You are very right; I should have tried to have us move to physical hardware before we did. Definitely one of the things that I would have done differently, in hindsight.

The larger EC2 instances (especially for always-on systems like primary databases) do get quite a bit cheaper with the 1-year reserved instance reservations, so if you are on EC2 be sure to get those as soon as you're at a somewhat stable point.
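As a back-of-the-envelope check (the rates below are made-up placeholders, not actual AWS pricing), the reserved-instance break-even math for an always-on box looks like:

```python
# Illustrative break-even for a 1-year reserved instance.
# All dollar figures are hypothetical placeholders, NOT real AWS rates.

HOURS_PER_YEAR = 24 * 365

on_demand_hourly = 0.68    # assumed on-demand rate, $/hour
reserved_upfront = 1820.0  # assumed one-time reservation fee
reserved_hourly = 0.24     # assumed discounted hourly rate

def yearly_cost(upfront, hourly, utilization=1.0):
    """Total yearly cost at the given fraction of hours running."""
    return upfront + hourly * HOURS_PER_YEAR * utilization

on_demand = yearly_cost(0.0, on_demand_hourly)
reserved = yearly_cost(reserved_upfront, reserved_hourly)
print(round(on_demand), round(reserved))
```

The key variable is utilization: a primary database runs at ~100%, so the upfront fee amortizes fully, which is why reservations pay off for always-on systems first.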


I think the introduction of the SSD is causing site owners to "leave the cloud". I haven't seen any cloud hosting that lets you choose SSD-based disks yet, probably because it's low demand and high expense. The throughput of a single sata3 SSD blows away any kind of raid setup you can accomplish on EC2.

Also, I don't think you even need $10k in hardware. Sounds like you could do just fine with a $3k 1U server. $3k can still get you 16 cores w/ 32GB memory and SSD drives.


SSDs make a stunning difference in performance for large, complex joins that won't fit in RAM. However, I imagine that in 5-10 years time 'the cloud' will use them too; even if only as part of a larger storage pool.


Ehhh, those SLC drives are still mighty pricey; if you want a decent number of them raided, you could probably eat up nearly $10k. And the nice Xeons are expensive.


You're not factoring in a load balancer, a redundant slave, etc.


Thanks for the kind words, everyone. When I was writing the talk I was worried it'd be too tied to our/my experience, but at least people found it a little entertaining. The video should be posted soon; the Q&A was interesting and brought up some pieces that I glossed over during writing.


We made the switch before 1.8 was released, and made the decision during the 1.6 cycle, give or take. Auto-sharding never worked reliably in our tests. I think schmichael does a good job explaining the new features that would have helped -- sparse indexes, in particular -- but the core issues still remain (lock contention, no non-blocking online compaction).


(speaker here)

Adam is exactly right. I wish I could say I mentioned those details in the talk, but I'm afraid I ran out of time.

Just to re-iterate:

* Most slides refer to 1.4

* We evaluated auto-sharding around 1.5.x & 1.6.0 and it did not work reliably.

* We now use 1.8 for a few remaining small MongoDB datasets.


For 2.0, there is some major work on locking being done (concurrency will be an ongoing theme in 2011) and online compaction has been implemented (although you have to kick it off, it's not yet automatic).


According to the docs, this is a blocking operation -- so it's in-place, but I wouldn't consider it online in the way PostgreSQL's vacuum or Cassandra's major compactions are: http://www.mongodb.org/display/DOCS/compact+Command


Let's also consider that MongoDB's entire I/O architecture and data structure layout is built around a global lock and in-place updates, which makes implementing non-blocking compaction nearly impossible without either a major overhaul of the codebase or somehow rigging up the replication system to internalize the advocated compact-on-a-replica-set-slave-and-then-rotate method.

PostgreSQL has an automatic cost-based vacuum and sophisticated adaptive free space and visibility maps. With log-then-checkpoint writes and MVCC, there's no interruption of the database at all to reclaim page space used by deleted rows.

Cassandra also has a write-ahead log, and flushes its in-memory representation of data to immutable files on disk, which was a pretty brilliant decision. Dead rows are removed during a minor compaction, which can also merge data files together. The compaction is very fast because the files are immutable and in key-sorted order, making it all lock-free, sequential I/O.
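A toy sketch of that merge step (my own illustration, not Cassandra's actual code): take immutable, key-sorted runs, let the newer file win on key collisions, and drop tombstoned keys while writing out a new sorted file:

```python
# Hypothetical sketch of a minor compaction over immutable sorted runs.
# A value of None marks a tombstone (a deleted row).

TOMBSTONE = None

def compact(*sstables):
    """Merge key-sorted (key, value) runs; later arguments are newer
    files, so their values overwrite older ones."""
    merged = {}
    for table in sstables:          # apply oldest-to-newest
        for key, value in table:
            merged[key] = value
    # Drop tombstoned keys and emit a new key-sorted run.
    return [(k, v) for k, v in sorted(merged.items()) if v is not TOMBSTONE]

old = [("a", 1), ("b", 2), ("c", 3)]
new = [("b", 20), ("c", TOMBSTONE)]
print(compact(old, new))  # [('a', 1), ('b', 20)]
```

Because the input files never change and are already sorted, the real thing can do this as a streaming merge with purely sequential reads and writes, no locks required.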

In theoryland(tm), in-place updates seem better because you're only writing data once. In realityworld(R), the need for compaction, cost of random I/O, data corruption problems, and locking make it suck.


In addition, some ELBs are seeing failures with routing, so that wouldn't help, either.

