Good point, sounds like I intend to keep it that way for this particular databas...

gizzlon · on June 23, 2011

When sharding, do you do all joins in code, or just the ones that span several shards?

gtuhl · on June 23, 2011

If I need to go cross-shard, I do the join in code. If you knew both shards were on the same box, you could do cross-shard joins in SQL when the shards are schemas like this, but you would need some potentially tricky logic to determine whether all the shards involved are on the same machine.
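A minimal sketch of what "doing the join in code" can look like, assuming each shard has already been queried separately and the rows come back as dicts. The function name `join_in_code` and the sample rows are made up for illustration; this is not the actual code behind the setup described here.

```python
from collections import defaultdict


def join_in_code(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on `key`, entirely in application code."""
    # Build an in-memory index of the right-hand rows, grouped by join key.
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    joined = []
    for left in left_rows:
        # .get() avoids creating empty entries for keys with no match.
        for right in index.get(left[key], []):
            # Left-hand columns win if both sides share a column name.
            joined.append({**right, **left})
    return joined


# Hypothetical rows, as if fetched with one query per shard.
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"user_id": 1, "total": 30},
    {"user_id": 1, "total": 12},
    {"user_id": 3, "total": 5},
]

result = join_in_code(users, orders, "user_id")
```

This is a plain hash join, so it only works while the smaller side fits in memory, which is one reason to keep cross-shard joins rare.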

Thankfully, most of the joins happen within a shard (I hash and shard on something like a user_id); the exceptions are various analysis and aggregation queries.
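Roughly, routing by user_id could look like the sketch below. The shard count, the `shard_NNNN` schema naming, and the choice of MD5 are all assumptions for the example, not details of the real deployment; any stable hash would do.

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count, purely illustrative


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> str:
    """Map a user_id to a (hypothetical) schema name such as 'shard_0007'."""
    # Use a stable hash: Python's built-in hash() is not guaranteed to be
    # the same across processes for all types.
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % num_shards
    return f"shard_{bucket:04d}"


def orders_query(user_id: int) -> str:
    """Build a schema-qualified query; `orders` is a made-up table name."""
    # The schema name is generated by our own code, never from user input,
    # so interpolating it is safe. The user_id stays a bound parameter.
    return f"SELECT * FROM {shard_for(user_id)}.orders WHERE user_id = %s"
```

Because every row for a given user_id lands in the same schema, joins keyed on the user stay inside one shard.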

Using PostgreSQL's schemas is admittedly not too different from just using many DBs in MySQL or something else, but in practice I've found that the extra layer of organization helps keep things neater. I can back up, move, or delete a specific schema/shard, or I can back up, move, etc. all shards on a machine by operating on the containing database.
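As a sketch of the two granularities, the helpers below build `pg_dump` command lines for a single schema versus the whole containing database. The database name, schema name, and output paths are placeholders; `--schema` and `--file` are standard `pg_dump` options, and `--schema` takes a pattern, so an exact schema name selects just that shard.

```python
def dump_shard_cmd(database: str, schema: str, outfile: str) -> list:
    """Command line to dump one schema (one shard) from a database."""
    return ["pg_dump", f"--schema={schema}", f"--file={outfile}", database]


def dump_all_shards_cmd(database: str, outfile: str) -> list:
    """Command line to dump the whole database, i.e. every shard on the box."""
    return ["pg_dump", f"--file={outfile}", database]


# Placeholder names; pass the list to subprocess.run(cmd, check=True) to execute.
one_shard = dump_shard_cmd("shards_db", "shard_0003", "shard_0003.sql")
everything = dump_all_shards_cmd("shards_db", "shards_db.sql")
```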