Page MenuHomePhabricator

Feedback for Schema Adjustments
Closed, ResolvedPublic

Description

A recent change to Phabricator causes it to apply a massive schema migration on upgrade.

  • Find history on this change in T1191.
  • Find a discussion of this new process in Managing Storage Adjustments.
  • Report issues with adjustment here.
  • Read the rest of this task for a general overview and introduction.

Motivation: In general, this migration changes columns with utf8 collation to use utf8mb4 (on most systems). The primary goal of this change is to enable support for 4-byte unicode characters, including emoji. Beyond emoji not working in comments, one major issue this currently causes is difficulty importing repositories containing emoji or other 4-byte characters anywhere in a commit message. For history and context, see T1191.

What to Expect: After applying patches, bin/storage upgrade will now apply schema adjustments. The first time you run the script it will apply these adjustments to almost the entire database, which may take some time (it took about 35 minutes on this install). You should plan for some downtime. You can learn more about the process in Managing Storage Adjustments.

This high-cost adjustment workflow is a one-time migration. The adjustment workflow will continue to run going forward, but should normally take only a few seconds after this first major adjustment.

Adjustment is Mostly Resumable: If adjustment takes way longer than you expect or you run into issues (like a surprise server restart or running out of disk space), it's generally safe to cancel it (or resume it if the cancel was forced). You can fix the issues and resume where you left off safely.

It's also possible to cancel adjustment and start Phabricator without completing the process, but this is not recommended and carries a low risk that something weird will happen. (It should broadly be safe, but we aren't testing it and can't guarantee it works properly.)

Encountering Issues? If you're hitting issues, let us know by leaving a comment here. We've adjusted this install and run with adjustments for several weeks, and a handful of installs have voluntarily run adjustment ahead of general release without any real issues, so we have reasonable confidence that this process works in common cases. However, there may be edge cases we haven't seen. The adjustment process also depends on the version of MySQL you're using, so we might not deal with some versions correctly.

After Adjustment: After adjustment, emoji will work properly. ?

Event Timeline

epriestley raised the priority of this task from to Normal.
epriestley updated the task description. (Show Details)
epriestley added a project: Infrastructure.
epriestley added a subscriber: epriestley.

Adjust has been run on our db and /config/database/ reports 'No Schema Issues'. However, if I try to create a ticket with a cute title like ? ? ? ? ? ? ? ? ? I get:

Unhandled Exception ("AphrontQueryException")
#1366: Incorrect string value: '\xF0\x9F\x98\xBA \xF0...' for column 'title' at row 1

Running with last upstream commit at rP2ac987714bcda4ac636291aa852e1cdf5a29381e

$ mysql --version
mysql  Ver 14.14 Distrib 5.6.17, for Linux (x86_64) using  EditLine wrapper

We narrowed this down in IRC to the call to set_charset('utf8mb4') failing with this error:

Can't initialize character set utf8mb4 (path: /usr/share/mysql/charsets/)

Specifically, the utf8mb4 character set is missing on the client.

We can detect this in a setup check by examining the character_set_* variables, but it's not yet clear what we should tell users to do about it.

Hi, my adjustment hangs. Wiith --trace I got

Found 321 issues(s) with schemata, detailed above.

You can review issues in more detail from the web interface, in Config > Database Status. To better understand the adjustment workflow, see "Managing Storage Adjustments" in the documentation.

MySQL needs to copy table data to make some adjustments, so these migrations may take some time.


    Fix these schema issues? [y/N] y

Dropping caches, for faster migrations...
>>> [325] <exec> $ '/var/www/phabricator/bin/cache' purge --purge-all
Purging remarkup cache...done.
Purging changeset cache...done.
Purging general cache...done.
<<< [325] <exec> 477,833 us
Fixing schema issues...
>>> [326] <query> SET SESSION sql_mode = 'STRICT_ALL_TABLES'
<<< [326] <query> 646 us
>>> [327] <query> ALTER TABLE `phabricator_repository`.`repository_path` COLLATE = 'utf8mb4_bin'

Thanks.

If you have a large amount of repository data, the migration may take a very long time. How long is it hanging for?

Equivalently, if you run this command manually from MySQL, does it complete quickly?

ALTER TABLE phabricator_repository.repository_path COLLATE = 'utf8mb4_bin'

like 2 hours.
I'm running this command from MySQL. How long is it expected to finish?

It depends on the size of the table.

What does this show?

phabricator/ $ ./bin/storage probe

The part of phabricator_repository:

phabricator_repository            6,551.8 MB   97.5%
    repository_branch                 0.0 MB    0.0%
    repository_badcommit              0.0 MB    0.0%
    edgedata                          0.0 MB    0.0%
    repository_summary                0.0 MB    0.0%
    repository_coverage               0.0 MB    0.0%
    repository_statusmessage          0.0 MB    0.0%
    repository_arcanistproject        0.0 MB    0.0%
    repository_symbol                 0.0 MB    0.0%
    repository_vcspassword            0.0 MB    0.0%
    repository_pushevent              0.0 MB    0.0%
    repository_mirror                 0.0 MB    0.0%
    repository_auditrequest           0.0 MB    0.0%
    repository_lintmessage            0.1 MB    0.0%
    repository                        0.1 MB    0.0%
    repository_pushlog                0.1 MB    0.0%
    repository_refcursor              0.3 MB    0.0%
    edge                              0.3 MB    0.0%
    repository_transaction            0.3 MB    0.0%
    repository_parents                0.4 MB    0.0%
    repository_commitdata             2.6 MB    0.0%
    repository_commit                 3.2 MB    0.0%
    repository_filesystem             5.0 MB    0.1%
    repository_pathchange           129.0 MB    1.9%
    repository_path               6,410.0 MB   95.4%
This comment was removed by kylejao.

Looks like that table has 6.5GB of data in it, so it could take MySQL quite a while to ALTER it. I would not normally expect it to take 2 hours, but if you have a slow machine or disk maybe it's just legitimately taking that long.

The machine shouldn't be slow. Let me wait longer. Thanks.

so is there anything I can do with this?

epriestley claimed this task.

@kylejao, I don't know why mysql would hang indefinitely on an ALTER TABLE statement. I can't reproduce that, and it seems like an issue with MySQL, not Phabricator. I don't know if there are more granular tools available to help understand why ALTER TABLE isn't working or not. You could try googling or looking on StackOverflow. It looks like Percona might have some tools, although I haven't used them.

Conceivably, you could try something like dumping the table to disk with MySQL dump, manually changing the engine in the dump file, and then loading it back in.

I'm going to resolve this in general, since it seems like we've addressed all the other issues that users encountered.

Okay, I ran adjustment without interrupting and it eventually proceed. However, I got this now.

$ ./phabricator/bin/storage adjust

Database                 Table              Name       Issues
phabricator_differential differential_draft key_unique Missing Key

Found 1 issues(s) with schemata, detailed above.

You can review issues in more detail from the web interface, in Config > Database Status. To better understand the adjustment workflow, see "Managing Storage Adjustments" in the documentation.

MySQL needs to copy table data to make some adjustments, so these migrations may take some time.


    Fix these schema issues? [y/N] y

Dropping caches, for faster migrations...
Purging remarkup cache...done.
Purging changeset cache...done.
Purging general cache...done.
Fixing schema issues...
Done.

Target                                                 Error
phabricator_differential.differential_draft.key_unique #1062: Duplicate entry 'PHID-DREV-m7oqagw6vu3bl4no6aeq-PHID-USER-azdooqmjjftwzuovycj2-di' for key 'key_unique'

Failed to make some schema adjustments, detailed above.
For help troubleshooting adjustments, see "Managing Storage Adjustments" in the documentation.

But I don't see a field named key_uniquein the differential_draft table

mysql> describe differential_draft;
+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| id           | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| objectPHID   | varbinary(64)    | NO   |     | NULL    |                |
| authorPHID   | varbinary(64)    | NO   |     | NULL    |                |
| draftKey     | varchar(64)      | NO   |     | NULL    |                |
| dateCreated  | int(10) unsigned | NO   |     | NULL    |                |
| dateModified | int(10) unsigned | NO   |     | NULL    |                |
+--------------+------------------+------+-----+---------+----------------+

Can you advice how to fix this?

Thanks again.

Run:

phabricator/ $ ./bin/storage upgrade -f --apply phabricator:20141106.uniqdrafts.php

Then run:

phabricator/ $ ./bin/storage adjust -f

Took almost 6 hours to apply all adjustment patches (our database is quite huge).
Some patches failed with Row size too large (> 8126) MySQL error. Increasing innodb_log_file_size parameter to 1G helped.
Then failed on phabricator_conduit.conduit_methodcalllog.method with #1406: Data too long for column 'method' at row ... message. Since it is log table, truncating all data solved the issue.