UTF8 exception occurring during Harbormaster build
Closed, Resolved (Public)

Description

I'm seeing this exception during a build:

exception 'AphrontCharacterSetQueryException' with message 'Attempting to construct a query using a non-utf8 string when utf8 is expected. Use the `%B` conversion to escape binary strings data.' in /srv/phabricator/libphutil/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:331
Stack trace:
#0 /srv/phabricator/libphutil/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php(10): AphrontBaseMySQLDatabaseConnection->validateUTF8String('      API.Funct...')
#1 /srv/phabricator/libphutil/src/xsprintf/qsprintf.php(170): AphrontMySQLiDatabaseConnection->escapeUTF8String('      API.Funct...')
#2 /srv/phabricator/libphutil/src/xsprintf/xsprintf.php(63): xsprintf_query(Object(AphrontMySQLiDatabaseConnection), 'UPDATE harborma...', 61, '      API.Funct...', 110)
#3 /srv/phabricator/libphutil/src/xsprintf/qsprintf.php(64): xsprintf('xsprintf_query', Object(AphrontMySQLiDatabaseConnection), Array)
#4 [internal function]: qsprintf(Object(AphrontMySQLiDatabaseConnection), 'UPDATE harborma...', '      API.Funct...', '      API.Funct...', '591772')
#5 /srv/phabricator/libphutil/src/xsprintf/queryfx.php(5): call_user_func_array('qsprintf', Array)
#6 /srv/phabricator/phabricator/src/applications/harbormaster/storage/build/HarbormasterBuildLog.php(140): queryfx(Object(AphrontMySQLiDatabaseConnection), 'UPDATE harborma...', '      API.Funct...', '      API.Funct...', '591772')
#7 /srv/phabricator/phabricator/src/applications/harbormaster/step/HarbormasterCommandBuildStepImplementation.php(98): HarbormasterBuildLog->append('      API.Funct...')
#8 /srv/phabricator/phabricator/src/applications/harbormaster/worker/HarbormasterTargetWorker.php(52): HarbormasterCommandBuildStepImplementation->execute(Object(HarbormasterBuild), Object(HarbormasterBuildTarget))
#9 /srv/phabricator/phabricator/src/infrastructure/daemon/workers/PhabricatorWorker.php(91): HarbormasterTargetWorker->doWork()
#10 /srv/phabricator/phabricator/src/infrastructure/daemon/workers/storage/PhabricatorWorkerActiveTask.php(156): PhabricatorWorker->executeTask()
#11 /srv/phabricator/phabricator/src/infrastructure/daemon/workers/PhabricatorTaskmasterDaemon.php(19): PhabricatorWorkerActiveTask->executeTask()
#12 /srv/phabricator/libphutil/src/daemon/PhutilDaemon.php(91): PhabricatorTaskmasterDaemon->run()
#13 /srv/phabricator/libphutil/scripts/daemon/exec/exec_daemon.php(111): PhutilDaemon->execute()
#14 {main}

I'm pretty sure the move to utf8mb4 was supposed to fix this (and our database is using utf8mb4), so I don't know what's going wrong.

Event Timeline

hach-que assigned this task to epriestley.
hach-que raised the priority of this task from to Needs Triage.
hach-que updated the task description.
hach-que added a project: Harbormaster.
hach-que added a subscriber: hach-que.

HarbormasterBuildLog uses %s when writing to the log chunk table. %s means "UTF8 string", and the value is validated (the string must really be UTF8). Build output can contain arbitrary bytes, so use %B to write it as binary data. This is only loosely related to T1191.
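As a rough illustration of the difference between a validated string conversion and a binary one, here is a minimal standalone sketch. The `sketch_*` helpers are hypothetical stand-ins written for this example; they are not libphutil's real functions, and real escaping against a database connection is left out entirely.

```php
<?php

// Hypothetical helpers for illustration only; NOT libphutil's real API.

/** Returns true if $value is well-formed UTF-8. */
function sketch_is_utf8(string $value): bool {
  // With the /u modifier, PCRE validates the subject string and
  // preg_match() returns false (rather than 0 or 1) on invalid UTF-8.
  return preg_match('//u', $value) === 1;
}

/** Stand-in for a validated "%s"-style conversion. */
function sketch_check_utf8_string(string $value): string {
  if (!sketch_is_utf8($value)) {
    throw new InvalidArgumentException(
      'Non-UTF8 string where UTF8 is expected; treat it as binary instead.');
  }
  return $value;
}

/** Stand-in for a "%B"-style conversion: any bytes pass, no validation. */
function sketch_check_binary_string(string $value): string {
  return $value;
}

// Build output containing stray bytes that are not valid UTF-8.
$log_output = "API.Function \xEF\xEF\xEF";

try {
  sketch_check_utf8_string($log_output);
  $result = 'accepted';
} catch (InvalidArgumentException $ex) {
  $result = 'rejected';
}
```

In this sketch the string path refuses the log output, while the binary path passes the same bytes through unchanged, which is the behavior a raw log chunk column wants.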

The fix is to change queries like this:

        'INSERT INTO harbormaster_buildlogchunk '.
        '(logID, encoding, size, chunk) '.
        'VALUES '.
-         '(%d, %s, %d, %s)',
+         '(%d, %s, %d, %B)',

When using Lisk, this is handled automatically if the column is listed in the CONFIG_BINARY config key, but for manual queries you need to do it yourself.

The major motivator for this behavior was to defuse a truncation attack: registering with an email address like evil@evil.com\xEF\xEF\xEF@company.com could bypass domain whitelisting if MySQL was configured to silently truncate data on insert (which is the default).
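The attack can be modeled in a few lines. This is only a sketch under assumptions: `sketch_domain_allowed` is a hypothetical, deliberately naive whitelist check, and `sketch_silently_truncate` assumes the store keeps the longest well-formed UTF8 prefix of a value, which approximates silent truncation rather than reproducing MySQL's exact behavior.

```php
<?php

// Hypothetical helpers for illustration only; not Phabricator code.

/** Naive whitelist check: does the address end in "@<domain>"? */
function sketch_domain_allowed(string $email, string $domain): bool {
  $suffix = '@'.$domain;
  return substr($email, -strlen($suffix)) === $suffix;
}

/**
 * Models a store that silently keeps only the longest well-formed UTF-8
 * prefix of a value (an assumption about the truncation behavior).
 */
function sketch_silently_truncate(string $value): string {
  for ($len = strlen($value); $len > 0; $len--) {
    $prefix = substr($value, 0, $len);
    if (preg_match('//u', $prefix) === 1) {
      return $prefix;
    }
  }
  return '';
}

$email = "evil@evil.com\xEF\xEF\xEF@company.com";

// The application-level check sees the trusted domain at the end...
$passes_whitelist = sketch_domain_allowed($email, 'company.com');

// ...but the stored value ends where the invalid bytes begin.
$stored = sketch_silently_truncate($email);

// Validating up front detects the problem before anything is written.
$is_valid_utf8 = (preg_match('//u', $email) === 1);
```

Here the address passes the whitelist check, yet the value that would be stored is the attacker's own address. Rejecting the non-UTF8 input before building the query closes that gap.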

A secondary motivation is that easy-to-fix issues like this are preferable to having a lot of mixed-encoding data in the database without knowing about it.