Page MenuHomePhabricator

Support an "overlay" database connection mode where multiple applications share a single connection
Open, LowPublic

Description

See T11044 for some context.

Now that applications can be partitioned, we can get rid of one-connection-per-application. This will provide a slight performance improvement and a significant reduction in the number of connections we need to create and hold open. These benefits are greatest for installs running on toasters and gameboys, since serious installs should have low web/database latency and little difficulty handling connection counts, especially with the persistent flag (see D16913).

The major blocker here is that everywhere Phabricator references a table when constructing a query it needs to reference a full database_name.table_name instead, because the connection it acts on may be connected to a different database.

Something like ~60-ish% of these can be fixed with a couple of updates to PolicyAwareQuery and LiskDAO, but there will be a long tail of cases where we dropped down to direct database access and the fix isn't quite as clean.

I think the implementation pathway is something like this:

  • Define a new PhutilDatabaseTableReferenceInterface which provides getDatabaseName() / getTableName() methods.
  • Have LiskDAO implement that.
  • Have qsprintf() accept it with some new %R (or whatever) and produce a full database + table reference.
  • Maybe have a generic, concrete version of it for ad-hoc uses? Might simplify them?
  • Find all %T in the codebase, convert it to %R.
  • Implement an overlay flag into cluster.databases.

This will also let us pass table objects directly to %R instead of needing to do $thing->getTableName(), which is a small convenience benefit.

Event Timeline

The next step here is to find (almost) every instance of %T in the codebase and convert it to %R. In most cases, the conversion is trivial:

@@ -1,4 +1,4 @@
 queryfx(
   $conn,
-  'SELECT * FROM %T ...',
+  'SELECT * FROM %R ...',
-  $object->getTableName());
+  $object);

In cases where the table is referenced with a constant, the change should usually look like this:

queryfx(
  $conn,
-  'SELECT * FROM %T ...',
+  'SELECT * FROM %R ...',
-  SomeClass::TABLE_XYZ,
+  SomeClass::newXYZDatabaseTableRef());

We might need a helper for defining newXYZDatabaseTableRef(), like LiskDAO->newRefForTableOnSameDatabase(...). Then we'd get something like:

final class DiffusionPathChange {

...

final public static functioon newPathChangeDatabaseTableRef() {
  return id(new PhabricatorRepository())
    ->newDatabaseTableRefOnSameDatabase(self::TABLE_NAME);
}

...

}

In cases where this is impractical or the %T is in a migration or something, a ref can be built inline/manually with:

new AphrontDatabaseTableRef('database_name', 'table_name');

For migrations, we may also need a mode to force connections not to overlay. If we can reasonably convert almost all of them to %R we're fine, but if this is a huge untestable mess it may be simpler to just say "no overlay for migrations". There is no practical benefit to supporting overlay during migrations so this would be a small loss (just some additional complexity because the mode would need to exist).

Elsewhere, I compacted Phacility free instances onto a small number of database servers. I attempted to "fill up" these servers, and succeeded, sort of, but maybe too much -- they eventually fall over and need to be rebooted.

I suspect this is swapping to death, although I don't have much time to keep an eye on this stuff these days and haven't actually observed it firsthand.

These hosts have 8G physical and 4G swap, and innodb_buffer_pool_size=6144MB.

Here's a recently rebooted host:

$ uptime
 20:04:51 up 1 day, 17 min,  1 user,  load average: 0.56, 0.46, 0.48

$ mysql ...
Your MySQL connection id is 4767963

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           7948        6327         114           0        1506        1329
Swap:          4095        3277         818

$ ps auxwww 
...
mysql       3631 37.8 70.6 10939908 5751608 ?    Ssl  Mar31 550:05 /usr/sbin/mysqld
...

That's about what I expect, i.e. mysqld is supposed to use about 75% of the physical memory and is correctly using about 75% of the physical memory.

Here's a less-recently rebooted host:

$ uptime
 20:06:38 up 9 days, 22:00,  1 user,  load average: 0.37, 0.38, 0.35

$ mysql ...
Your MySQL connection id is 33071573

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           7948        7600         122           0         225          99
Swap:          4095        4095           0

$ ps auxwww
mysql       1134 19.8 91.3 13493752 7438992 ?    Ssl  Mar22 2838:11 /usr/sbin/mysqld

This host seems not-so-great.

From here I have no clue what I'm doing, but:

Unhealthy Server
mysql> select EVENT_NAME Event, ROUND(CURRENT_NUMBER_OF_BYTES_USED/(1024*1024), 0) 'Current Usage (MB)' from memory_summary_global_by_event_name ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC LIMIT 10;
+-----------------------------------------------------------------------------+--------------------+
| Event                                                                       | Current Usage (MB) |
+-----------------------------------------------------------------------------+--------------------+
| memory/innodb/buf_buf_pool                                                  |               6279 |
| memory/sql/THD::main_mem_root                                               |               3001 |
| memory/innodb/hash0hash                                                     |                167 |
| memory/performance_schema/table_io_waits_summary_by_index_usage             |                107 |
| memory/sql/dd::objects                                                      |                101 |
| memory/performance_schema/events_errors_summary_by_thread_by_error          |                 86 |
| memory/performance_schema/table_shares                                      |                 80 |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name |                 72 |
| memory/performance_schema/memory_summary_by_thread_by_event_name            |                 55 |
| memory/performance_schema/file_instances                                    |                 52 |
+-----------------------------------------------------------------------------+--------------------+
10 rows in set (0.02 sec)
Healthy Server
mysql> select EVENT_NAME Event, ROUND(CURRENT_NUMBER_OF_BYTES_USED/(1024*1024), 0) 'Current Usage (MB)' from memory_summary_global_by_event_name ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC LIMIT 10;
+-----------------------------------------------------------------------------+--------------------+
| Event                                                                       | Current Usage (MB) |
+-----------------------------------------------------------------------------+--------------------+
| memory/innodb/buf_buf_pool                                                  |               6279 |
| memory/sql/THD::main_mem_root                                               |                400 |
| memory/innodb/hash0hash                                                     |                167 |
| memory/performance_schema/table_io_waits_summary_by_index_usage             |                107 |
| memory/sql/dd::objects                                                      |                100 |
| memory/performance_schema/table_shares                                      |                 80 |
| memory/performance_schema/file_instances                                    |                 50 |
| memory/performance_schema/events_errors_summary_by_thread_by_error          |                 49 |
| memory/innodb/memory                                                        |                 47 |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name |                 41 |
+-----------------------------------------------------------------------------+--------------------+
10 rows in set (0.02 sec)

So it looks like whatever memory/sql/THD::main_mem_root is has grown from 400MB to 3,000MB over ~9 days, and everything else is basically stable and identical between the two hosts.

Googling "memory/sql/THD::main_mem_root" isn't terribly useful. "THD" is probably "Thread".

This might be related to setting max_connections to a fairly ridiculous value (16384), since that cascades into a bunch of thread-related stuff, but I can't immediately find a smoking gun explaining exactly what's happening.

(Out of curiosity, I set max_connections to 1024 on db-0568738a.phacility.net and rebooted it.)

One mitigation for this is likely to implement this task and overlay all the logical connections from the daemons, one per instance.

Another is to maybe just sunset free instances explicitly. Almost no one uses these anyway and almost no one notices when they break, I just maintain them forever.

With max_connections at 1024, the server above still looks fine after ~14 days:

mysql> select EVENT_NAME Event, ROUND(CURRENT_NUMBER_OF_BYTES_USED/(1024*1024), 0) 'Current Usage (MB)' from memory_summary_global_by_event_name ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC LIMIT 10;
+-----------------------------------------------------------------------------+--------------------+
| Event                                                                       | Current Usage (MB) |
+-----------------------------------------------------------------------------+--------------------+
| memory/innodb/buf_buf_pool                                                  |               6279 |
| memory/innodb/hash0hash                                                     |                167 |
| memory/performance_schema/table_io_waits_summary_by_index_usage             |                107 |
| memory/performance_schema/table_shares                                      |                 80 |
| memory/performance_schema/file_instances                                    |                 50 |
| memory/innodb/memory                                                        |                 41 |
| memory/performance_schema/events_statements_summary_by_digest               |                 40 |
| memory/performance_schema/events_errors_summary_by_thread_by_error          |                 37 |
| memory/sql/TABLE_SHARE::mem_root                                            |                 32 |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name |                 31 |
+-----------------------------------------------------------------------------+--------------------+
10 rows in set (1.37 sec)

...so I'm going to deploy this everywhere.