MySQL foreign key cascade operations finally hit the binary log

TexanFeller · 2026-02-14T00:38:39 1771029519

One of the main reasons I use Postgres is I've rarely (never?) seen an article like this posted about it. Every time I've touched MySQL I've found a new footgun.

ahofmann · 2026-02-14T01:25:15 1771032315

MySQL is the PHP of databases. It was free, easy to set up and had the spotlight at the right time. The bad decisions that are baked into MySQL are plenty and really sad (like the botched utf8 defaults, the MyISAM storage engine, strange stuff around replication and much more).

evanelias · 2026-02-14T15:45:33 1771083933

InnoDB became the default storage engine over 15 years ago. MyISAM is barely used by anyone today.

What "strange stuff around replication" are you referring to?

ahofmann · 2026-02-14T20:41:00 1771101660

I don't have all the details anymore. But one of the non-obvious things for me was that foreign key cascades were not in the binlogs. I also think that some changes in the database layout could lead to strange things on the replicas.

evanelias · 2026-02-14T22:58:42 1771109922

> foreign key cascades were not in the binlogs.

True, but that's finally solved now.

> I also think that some changes in the database layout could lead to strange things on the replicas.

I've been using MySQL for 23 years and have no idea what you're referring to here, sorry. But it's not like other DBs have quirk-free replication either. Postgres logical replication doesn't handle DDL at all, for example.

derekperkins · 2026-02-21T22:39:43 1771713583

I appreciate you carrying the torch for MySQL here, since most opinions are based on setups over a decade old, with little to no bearing on how it runs today

booi · 2026-02-14T03:57:06 1771041426

It still blows my mind that they called that crappy partial buggy character set “utf8”. Then later came out with actual utf8 and called it “utf8mb4”. Makes no sense.

evanelias · 2026-02-14T05:03:56 1771045436

They should have addressed it much earlier, but it makes way more sense in historical context: when MySQL added utf8 support in early 2003, the utf8 standard still permitted up to 6 bytes per char. This had excessive storage implications, and emoji weren't in widespread use at all. 3 bytes were sufficient to store the majority of chars in use at that time, so that's what they went with.

And once they made that choice, there was no easy fix that was also backwards-compatible. MySQL avoids breaking binary data compatibility across upgrades: aside from a few special cases like fractional time support, an upgrade doesn't require rebuilding any of your tables.

nofriend · 2026-02-14T05:14:16 1771046056

Your explanation makes it sound like an incredibly stupid decision. I imagine what you're getting at is that 3 bytes were/are sufficient for the basic multilingual plane, which is incidentally also what can be represented in a single utf-16 byte pair. So they imposed the same limitation as utf-16 had on utf-8. This would have seemed logical in a world where utf-16 was the default and utf-8 was some annoying exception they had to get out of the way.
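
To make the byte counts concrete, here is a small Python check (an illustration only, not anything from MySQL's code). Characters in the basic multilingual plane need at most 3 bytes in UTF-8 and a single 2-byte UTF-16 code unit, while anything above U+FFFF needs 4 bytes in UTF-8 and a surrogate pair in UTF-16. The sample characters are arbitrary.

```python
# Byte lengths of sample characters in UTF-8 and UTF-16.
# The sample set is arbitrary and chosen only for illustration.
samples = {
    "A": "ASCII",
    "\u00e9": "Latin-1 Supplement",
    "\u20ac": "euro sign, inside the BMP",
    "\uffff": "last BMP code point",
    "\U0001F600": "emoji, outside the BMP",
}

def encoded_lengths(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    utf8 = len(ch.encode("utf-8"))
    utf16 = len(ch.encode("utf-16-le"))  # -le avoids the 2-byte BOM
    return utf8, utf16

for ch, label in samples.items():
    utf8, utf16 = encoded_lengths(ch)
    print(f"U+{ord(ch):04X} {label}: utf-8={utf8} bytes, utf-16={utf16} bytes")
```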

evanelias · 2026-02-14T05:20:47 1771046447

OK, but that makes perfect sense given utf-16 was actually quite widespread in 2003! For example, Windows APIs, MS SQL Server, JavaScript (off the top of my head)... these all still primarily use utf-16 today even. And MySQL also supports utf-16 among many other charsets.

There wasn't a clear winner in utf-8 at the time, especially given its 6-byte-max representation back then. Memory and storage were a lot more limited.

And yes while 6 bytes was the maximum, a bunch of critical paths (e.g. sorting logic) in old MySQL required allocating a worst-case buffer size, so this would have been prohibitively expensive.

booi · 2026-02-26T23:11:14 1772147474

This still makes no sense. The UTF-8 standard was adopted really in 1998-ish and the standard was already variable using 1 to 4 bytes. MySQL 4.1, which introduced the utf8 charset, was released in 2004.

Even if there were no codepoints in the 4-byte range yet, they could and should have implemented it anyway. It literally does not take any more storage because it is a variable width encoding.

evanelias · 2026-02-28T00:19:36 1772237976

> The UTF-8 standard was adopted really in 1998-ish and the standard was already variable using 1 to 4 bytes.

No, it was 1 to 6 bytes until RFC 3629 (Nov 2003). AFAIK development of MySQL 4.1 began prior to that, despite the release not happening until afterwards.

Again, they absolutely should have addressed it sooner. But people make mistakes, especially as we're talking about a venture-funded startup in the years right after the dot-com crash.

> It literally does not take any more storage because it is a variable width encoding.

I already addressed that in my previous comment: in old versions of MySQL, a number of critical code paths required allocating worst-case buffer sizes, or accounting for worst-case value lengths in indexes, etc. So if a charset allows 6 bytes per character, that means multiplying max length by 6, in order to handle the pathological case.
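
A minimal sketch of that arithmetic in Python, assuming a VARCHAR(255) column purely as an example. It does not model MySQL's actual allocation logic, only the "max length times max bytes per character" multiplication described above.

```python
# Worst-case byte budget for a column of max_chars characters,
# assuming every character hits the charset's maximum width.
MAX_BYTES_PER_CHAR = {
    "3-byte utf8 (MySQL's original choice)": 3,
    "4-byte UTF-8 (RFC 3629)": 4,
    "6-byte UTF-8 (before RFC 3629)": 6,
}

def worst_case_bytes(max_chars, bytes_per_char):
    """Bytes needed if every character is as wide as the charset allows."""
    return max_chars * bytes_per_char

for name, width in MAX_BYTES_PER_CHAR.items():
    print(f"VARCHAR(255), {name}: {worst_case_bytes(255, width)} bytes")
```

With these assumptions the 6-byte case needs twice the worst-case budget of the 3-byte case for the same declared column length.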

jbonatakis · 2026-02-14T06:56:57 1771052217

This is excellent. In the past when replicating via Debezium from a system making heavy use of cascade deletes I’ve had to write a layer that infers these deletes by introspecting the database schema, building a graph of all cascades (sometimes several layers) and identifying rows that should have corresponding delete records. These can then be excluded in whatever downstream system via an anti-join. It works but it will be better to not have to do that and instead have first class support for cascades.
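
A rough Python sketch of the graph-building step described above, not the actual layer. The table names and the CASCADE_FKS list are hypothetical; in practice the foreign key list would be read from the database's own schema metadata.

```python
from collections import deque

# Hypothetical schema: (child_table, parent_table) for every foreign key
# declared ON DELETE CASCADE.
CASCADE_FKS = [
    ("orders", "customers"),
    ("order_items", "orders"),
    ("shipments", "orders"),
    ("shipment_events", "shipments"),
]

def build_cascade_graph(fks):
    """Map each parent table to the child tables its deletes cascade into."""
    graph = {}
    for child, parent in fks:
        graph.setdefault(parent, []).append(child)
    return graph

def tables_affected_by_delete(graph, root):
    """Breadth-first walk: every table a delete on `root` can cascade into."""
    affected = []
    visited = {root}
    queue = deque([root])
    while queue:
        table = queue.popleft()
        for child in graph.get(table, []):
            if child not in visited:  # guards against cyclic FK chains
                visited.add(child)
                affected.append(child)
                queue.append(child)
    return affected

graph = build_cascade_graph(CASCADE_FKS)
print(tables_affected_by_delete(graph, "customers"))
```

Given a delete record on a parent table, the returned list says which downstream tables may hold rows that were removed without their own delete records, which is where the anti-join would be applied.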

XCSme · 2026-02-14T00:16:09 1771028169

I always end up disabling bin log for single-db setups, and simply run backup jobs. Using bin log drastically reduces performance. Am I crazy?

evanelias · 2026-02-14T05:15:05 1771046105

The performance impact depends substantially on whether you've configured it to fsync the binlog on every group commit.

Also, it's important to consider that replication and backups serve different purposes. Backups alone are insufficient for high availability, change data capture, point-in-time recovery / undoing a bad change, etc.

XCSme · 2026-02-14T10:04:40 1771063480

In my case it's for analytics, so I am ok with some data loss in case of a failure.

How do I set the fsync stuff? Does it have to be turned off?

evanelias · 2026-02-14T15:24:24 1771082664

The sync_binlog server variable controls this behavior. The default of 1 means to fsync every time, which is best for durability but worst for performance. See https://dev.mysql.com/doc/refman/8.4/en/replication-options-...
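
To get a feel for why syncing on every write costs more than syncing rarely, here is a standalone Python sketch that appends records to a file with different fsync frequencies. It is not MySQL's binlog code; the fsync_every parameter, record size and record count are made up for illustration, and the timings will vary a lot by disk and filesystem.

```python
import os
import tempfile
import time

def append_records(path, records, fsync_every):
    """Append records to path, calling fsync after every `fsync_every` writes.

    fsync_every=1 syncs after each record; fsync_every=0 never syncs
    explicitly and leaves flushing to the OS.
    """
    start = time.perf_counter()
    with open(path, "ab") as f:
        for i, rec in enumerate(records, start=1):
            f.write(rec)
            if fsync_every and i % fsync_every == 0:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to persist to disk
    return time.perf_counter() - start

records = [b"x" * 128] * 200  # arbitrary record size and count

with tempfile.TemporaryDirectory() as tmp:
    for n in (1, 100, 0):
        path = os.path.join(tmp, f"log_{n}.bin")
        elapsed = append_records(path, records, fsync_every=n)
        print(f"fsync_every={n}: {elapsed:.4f}s, {os.path.getsize(path)} bytes")
```

All three runs write the same bytes; only the number of fsync calls differs, which is the trade-off between durability and throughput.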

toast0 · 2026-02-14T03:13:37 1771038817

Turning it off cause you're not using it seems reasonable, but I'm surprised it has a big effect on performance. Sequential appends to a file are pretty easy as long as you're not doing so many writes per second that there's contention on the write.

XCSme · 2026-02-14T09:59:40 1771063180

I am doing many writes (inserts/updates) per second.

xhcuvuvyc · 2026-02-14T05:30:28 1771047028

Fsync

XCSme · 2026-02-14T10:03:05 1771063385

Do you mean fsync can lead to poor performance?

ahofmann · 2026-02-14T01:20:39 1771032039

I've used bin log for almost decades and never experienced a big performance impact. This even holds for write heavy MySQL instances in ancient times where servers had spinning disks.

hobs · 2026-02-14T01:58:09 1771034289

Nonetheless it's not a WAL, and if you are not replicating it idk why you'd have it enabled, and if you are replicating you must.

XCSme · 2026-02-14T10:01:19 1771063279

Thinking back about it, I think the biggest issue was the size, not performance. For a write-intensive app, the bin log quickly got to tens of GBs and filled the entire disk, which was a problem when running the app on smaller VPSs.

evanelias · 2026-02-14T15:36:31 1771083391

You can tune binlog_expire_logs_seconds to control how long old binlog files stay around. The default is 2592000 seconds (30 days) which is often too long.
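
A back-of-the-envelope Python sketch for picking a value, assuming a steady binlog write rate. The 50 KiB/s rate and the 3-day alternative are made-up numbers, and real usage will differ since old files are only removed once they fall outside the window.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def days_to_seconds(days):
    """Convert a retention window in days to seconds."""
    return days * SECONDS_PER_DAY

def retained_binlog_bytes(bytes_per_second, expire_seconds):
    """Rough steady-state binlog size: write rate times the retention window."""
    return bytes_per_second * expire_seconds

default_expire = days_to_seconds(30)  # 2592000, the default cited above
write_rate = 50 * 1024                # assumed: 50 KiB/s of binlog traffic

for label, expire in [("30 days", default_expire), ("3 days", days_to_seconds(3))]:
    gib = retained_binlog_bytes(write_rate, expire) / 1024**3
    print(f"{label} ({expire} s): about {gib:.1f} GiB retained")
```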

garaetjjte · 2026-02-14T17:41:58 1771090918

>Authors: Marcelo Altmann

Is there any LLM with that name?