Things I wish we had known about scaling (kleppmann.com)
19 points by martinkl on March 26, 2014 | hide | past | favorite | 4 comments


> In a database like PostgreSQL or MySQL, each client connection to the database is handled by a separate unix process

This isn't correct. They're often handled by threads, but not necessarily 1:1.

> Every connection adds overhead, so the entire database slows down, even if those connections aren’t actively processing queries.

I've never known this to be true, even in large production systems. Can anyone cite?

> Partitioning (sharding) and read replicas probably won’t help you with your connection limit, unless you can somehow load-balance requests so that all the requests for a particular partition are handled by a particular server instance

Sharding and read replicas are two very different ways of handling data; the issues cited as problems only affect sharding, not read replicas.

> That’s all doable, but it doesn’t seem a particularly valuable use of your time when you’re also trying to iterate on product features.

If you can't scale your database, then adding more functionality is a bad thing. Software engineering doesn't stop once you make an API call to someone else's software.

> in order to set up a new replica, you have to first lock the leader to stop all writes and take a consistent snapshot (which may take hours on a large database)

    mysqldump --single-transaction --master-data
You can even gzip this on the fly and stream it over SSH to the new server, so the dump's disk I/O doesn't compete with the live database, or run mysqldump remotely from the new server to avoid the SSH overhead entirely.


To add to this: Cache and cache invalidation.

Without proper caching and a good invalidation strategy, your databases will get pounded. Use redis and memcache to cache everything possible; don't even connect to the database unless you have to.

Ensure that you can invalidate any cache entry easily, and keep things atomic so you don't run into race conditions. Use locking so that when a cache entry expires, the database doesn't get dog-piled with multiple copies of the same query.

You'd think the query cache in your database of choice would be just as efficient, but trust me, it is not even close. You can also cache higher-level objects, not just simple queries.
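To make the dog-pile point concrete, here's a minimal cache-aside sketch. It uses an in-process dict as a stand-in for redis/memcache, and `query_fn` is a hypothetical placeholder for the expensive database query; the per-key lock is what keeps an expired entry from being recomputed by many callers at once:

```python
import threading

_cache = {}          # stand-in for redis/memcache
_locks = {}
_locks_guard = threading.Lock()

def _lock_for(key):
    # One lock per key, so only one caller recomputes a missing entry.
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())

def cached_query(key, query_fn):
    """Cache-aside read: check the cache first; on a miss, let only one
    thread run the underlying query (avoiding a dog-pile)."""
    val = _cache.get(key)
    if val is not None:
        return val
    with _lock_for(key):
        # Re-check: another thread may have filled it while we waited.
        val = _cache.get(key)
        if val is None:
            val = query_fn()      # the expensive database query
            _cache[key] = val
        return val

def invalidate(key):
    # Invalidation: drop the entry so the next read recomputes it.
    _cache.pop(key, None)
```

With a real cache you'd also set a TTL on each entry; the double-check inside the lock is the piece that stops the stampede.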

Depending on your reliability requirements, you may even consider treating your cache as writeback and doing batched database writes in the background. These are generally more efficient than individual writes due to a variety of factors.
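A sketch of that write-behind idea (the names `WriteBehindBuffer` and `flush_fn` are made up for illustration, and a real version would need durability and error handling — buffered writes are lost if the process dies):

```python
import queue
import threading
import time

class WriteBehindBuffer:
    """Accept writes immediately, flush them to the database in batches."""

    def __init__(self, flush_fn, batch_size=100, interval=0.05):
        self._q = queue.Queue()
        self._flush = flush_fn          # e.g. one batched INSERT/UPDATE
        self._batch_size = batch_size
        self._interval = interval       # max seconds to wait filling a batch
        threading.Thread(target=self._run, daemon=True).start()

    def write(self, record):
        # Caller returns immediately; the DB write happens later, in batch.
        self._q.put(record)

    def _run(self):
        while True:
            batch = [self._q.get()]     # block until there is work
            deadline = time.monotonic() + self._interval
            while len(batch) < self._batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(self._q.get(timeout=timeout))
                except queue.Empty:
                    break
            self._flush(batch)          # one batched write instead of many
```

The win is that N caller-facing writes turn into one database round trip, at the cost of a small window where acknowledged data exists only in memory.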

I've worked on several top-200 ranking sites and this has always been one of the main go-to strategies for scaling. Databases suck - Avoid querying them.


Since MySQL 5.6, most schema changes (e.g. adding or dropping a column) are possible online as well:

http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-o...


"Database connections are a real limitation"

I repeat

"Database connections are a real limitation"

This should be one point, and the others should each be their own point. No?



