Page MenuHomePhabricator

Maybe fix various statements that MySQL statement-based replication gets upset about
Open, NormalPublic

Description

We do a lot of stuff that MySQL replication does not really like. We should probably not do that stuff, or at least have a consistent story about why we're doing that stuff.

Stuff it's complaining about:

  • INSERT ... ON DUPLICATE KEY UPDATE on tables with multiple unique keys.
  • DELETE statements with LIMITs (?)

Repository Status

160414 20:50:06 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. INSERT... ON DUPLICATE KEY UPDATE  on a table with more than one UNIQUE KEY is unsafe Statement:

INSERT INTO `repository_statusmessage`
          (repositoryID, statusType, statusCode, parameters, epoch)
          VALUES ...

General Cache

INSERT INTO `cache_general`
              (cacheKeyHash, cacheKey, cacheFormat, cacheData,
                cacheCreated, cacheExpires) VALUES ...

Most Garbage Collection

[Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. The statement is unsafe because it uses a LIMIT clause. This is unsafe because the set of rows included cannot be predicted. Statement:

DELETE FROM `metamta_receivedmail` WHERE dateCreated < 1452890862 LIMIT 100

Event Timeline

Alternatively it seems like we can switch to MIXED mode and recommend that, but some installs probably won't be able to and I'm not sure what the tradeoffs are.

It looks like the DELETE statements can not be expressed in a way which doesn't upset the "unsafeness detector", even when explicitly ordered by primary key. I think this is the relevant MySQL issue:

http://bugs.mysql.com/bug.php?id=42415

However, these are all contained in the GC logic and we could reasonably express these as SELECT statements, followed by DELETE ... WHERE id IN (...) statements.


The "multiple unique keys" issue is much trickier, and resolving it is not nearly as easy.

I'm pretty sure MySQL's "unsafeness detector" is getting these wrong, too -- although the tables have multiple unique keys, one key is an autoincrement column and not specified in the query, so the statement can not possibly be ambiguous or resolve differently on the slave.

I'm going to switch to MIXED mode for now, which seems to be the general recommendation, and we can deal with this in greater detail if/when installs that use statement-based replication and are unwilling or unable to switch arise. It's possible that everyone who knows what they're doing is already using MIXED replication and this is moot.

epriestley renamed this task from Fix various statements that MySQL replication gets upset about to Maybe fix various statements that MySQL statement-based replication gets upset about.Apr 14 2016, 9:37 PM