"INSERT INTO ... SELECT" queries require more database locks than "SELECT" queries
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	epriestley
	Jan 27 2021, 10:30 PM

Description

See PHI1954.

When updating the members of a project, Phabricator executes a query in this form to materialize parent project members:

mysql> INSERT IGNORE INTO edge (...) SELECT ... FROM edge ...

On the instance in PHI1954, this query is currently timing out while the instance is under load:

[2021-01-27 21:56:33] EXCEPTION: (AphrontLockTimeoutQueryException) #1205: Lock wait timeout exceeded; try restarting transaction
  at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:339]
...

This is a complex issue and my understanding of MySQL locking/transaction/replication behavior is imperfect, and the documentation doesn't seem to spell out the parameters of this scenario explicitly, but from related reading it seems like SELECT statements in this form must do additional locking to support replication at REPEATABLE READ isolation.

For example, see:

"INSERT INTO … SELECT Performance with Innodb tables" (Percona)
https://www.percona.com/blog/2006/07/12/insert-into-select-performance-with-innodb-tables/

One possible remedy is to reduce isolation to READ COMMITTED. This is probably fine in the context of Phabricator, but is a large change with potential far-reaching effects.

In cases where replication is not enabled at all (as here), the innodb_locks_unsafe_for_binlog option can be set to hint to MySQL that the operation does not need to be made safe for replication. However, this isn't a general solution and Phabricator expects most high-load installs to use replication, even though hosted instances currently do not.

A more narrow remedy is to reduce isolation to READ COMMITTED for this statement only. This still feels like playing with fire, since there are currently no other situations where we want to make live modifications to transaction levels, and the transaction isolation level is stateful (so it needs primitives to track the state and return to the previous isolation level).

A more crude approach is to simply decompose this into SELECT + INSERT. The edit sequence here is always: start transaction; modify edge table; materialize into edge table; commit transaction, so I think this is likely safe even in theory. The database state also converges to the correct state across edits, so even if this is vulnerable to some theoretical race, it's not likely to have a particularly significant effect on correctness.

Revisions and Commits

rP Phabricator
	D21527	rP6af8756e798d (stable) Execute project membership materialization as "SELECT" + "INSERT", not…
	D21527	rP9cbbbe2a8712 Execute project membership materialization as "SELECT" + "INSERT", not "INSERT .

Related Objects

Mentioned In: 2021 Week 5 (Very Late January)
T13478: Fix various policy issues with project member materialization around interactions between "Project Members" policies and Subprojects/Milestones

Event Timeline

epriestley triaged this task as Normal priority.Jan 27 2021, 10:30 PM

epriestley created this task.

epriestley added a revision: D21527: Execute project membership materialization as "SELECT" + "INSERT", not "INSERT ... SELECT".Jan 27 2021, 11:20 PM

Repeatable read is required if statement based replication is used - if row replication is used - repeatable read is no longer a requirement for replication consistency...

We internally run phabricator with READ COMMITTED isolation - world didn't end yet :)

epriestley mentioned this in T13478: Fix various policy issues with project member materialization around interactions between "Project Members" policies and Subprojects/Milestones.Jan 28 2021, 5:03 PM

epriestley added a commit: rP9cbbbe2a8712: Execute project membership materialization as "SELECT" + "INSERT", not "INSERT ..Jan 28 2021, 5:04 PM

epriestley added a commit: rP6af8756e798d: (stable) Execute project membership materialization as "SELECT" + "INSERT", not….Jan 28 2021, 5:14 PM

This change appears to have resolved the issue in PHI1954.

epriestley mentioned this in 2021 Week 5 (Very Late January).Feb 1 2021, 4:35 PM

"INSERT INTO ... SELECT" queries require more database locks than "SELECT" queriesClosed, ResolvedPublicActions

Description

Revisions and Commits

Related Objects

Event Timeline

"INSERT INTO ... SELECT" queries require more database locks than "SELECT" queries
Closed, ResolvedPublic
Actions