
I think a huge issue he somewhat glossed over is that he refused to use auto-sharding, which is a pretty large benefit MongoDB provides. He also refused to upgrade his year-old build (1.5, now at 1.8) -- no explanation of why, even though he was willing to switch the entire stack to PostgreSQL.


We made the switch before 1.8 was released, and made the decision during the 1.6 cycle, give or take. Auto-sharding never worked reliably in our tests. I think schmichael does a good job explaining the new features that would have helped -- sparse indexes, in particular -- but the core issues still remain (lock contention, no non-blocking online compaction).


(speaker here)

Adam is exactly right. I wish I could say I mentioned those details in the talk, but I'm afraid I ran out of time.

Just to reiterate:

* Most slides refer to 1.4

* We evaluated auto-sharding around 1.5.x & 1.6.0 and it did not work reliably.

* We now use 1.8 for a few remaining small MongoDB datasets.


For 2.0, there is some major work on locking being done (concurrency will be an ongoing theme in 2011) and online compaction has been implemented (although you have to kick it off, it's not yet automatic).


According to the docs, this is a blocking operation -- so it's in-place, but I wouldn't consider it to be online in the way PostgreSQL's vacuum or Cassandra's major compactions are: http://www.mongodb.org/display/DOCS/compact+Command


Let's also consider that MongoDB's entire I/O architecture and data structure layout is built around a global lock and in-place updates. That makes implementing non-blocking compaction nearly impossible without either a MAJOR overhaul of the codebase or somehow rigging up the replication system to internalize the advocated compact-on-a-replica-set-slave-then-rotate method.

PostgreSQL has an automatic cost-based vacuum and sophisticated adaptive free space and visibility maps. With log-then-checkpoint writes and MVCC, there's no interruption of the database at all to reclaim page space used by deleted rows.
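A toy sketch of the MVCC idea (this is an illustrative model, not PostgreSQL's actual implementation): each row version carries the transaction id that created it (xmin) and, once deleted, the one that deleted it (xmax). Deletes never block readers, and dead versions linger until a vacuum pass reclaims them:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    value: str
    xmin: int                   # txid that created this version
    xmax: Optional[int] = None  # txid that deleted it (None = live)

def visible(row: RowVersion, snapshot_txid: int) -> bool:
    """A version is visible if it was created at or before our snapshot
    and not yet deleted (or deleted after our snapshot)."""
    if row.xmin > snapshot_txid:
        return False
    return row.xmax is None or row.xmax > snapshot_txid

def vacuum(table, oldest_snapshot: int):
    """Reclaim versions deleted before every live snapshot; readers
    holding older snapshots are never interrupted."""
    return [r for r in table if r.xmax is None or r.xmax > oldest_snapshot]

table = [
    RowVersion("a", xmin=1),
    RowVersion("b", xmin=2, xmax=5),  # deleted by txid 5, not yet vacuumed
]

print([r.value for r in table if visible(r, 4)])        # ['a', 'b']
print([r.value for r in table if visible(r, 6)])        # ['a']
print([r.value for r in vacuum(table, oldest_snapshot=6)])  # ['a']
```

The point is that deletion is just metadata on the old version, so reclaiming space is a background sweep rather than a stop-the-world rewrite.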

Cassandra also has a write-ahead log, and flushes its in-memory representation of data to immutable files on disk, which was a pretty brilliant decision. Dead rows are removed during a minor compaction, which can also merge data files together. The compaction is very fast because the files are immutable and in key-sorted order, making it all lock-free, sequential I/O.
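To see why that's cheap, here's a simplified sketch (my own model, not Cassandra's actual on-disk format): compacting immutable, key-sorted files is a single sequential merge pass, with the newest version of each key winning and tombstones (deletion markers) dropped along the way:

```python
import heapq

TOMBSTONE = None  # stand-in deletion marker

def compact(*sstables):
    """sstables: oldest-to-newest lists of (key, value) pairs, each
    sorted by key with unique keys per file. Returns one sorted list
    with the newest value per key; tombstoned keys are dropped."""
    # Tag entries with -file_index so newer files sort first on key ties.
    streams = [[(key, -i, val) for key, val in t]
               for i, t in enumerate(sstables)]
    out = []
    last_key = object()
    for key, _, val in heapq.merge(*streams):  # sequential k-way merge
        if key == last_key:
            continue  # older, shadowed version of a key already handled
        last_key = key
        if val is not TOMBSTONE:
            out.append((key, val))
    return out

old = [("a", 1), ("b", 2), ("c", 3)]
new = [("b", TOMBSTONE), ("c", 30), ("d", 4)]
print(compact(old, new))  # [('a', 1), ('c', 30), ('d', 4)]
```

Because every input is already sorted and never modified, the merge needs no locks and reads each file front to back exactly once -- that's the whole trick.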

In theoryland(tm), in-place updates seem better because you're only writing data once. In realityworld(R), the need for compaction, cost of random I/O, data corruption problems, and locking make it suck.




