
Unhandled EXCEPTION: (AphrontCharacterSetQueryException) when running "scripts/repository/reparse.php"
Closed, Duplicate | Public

Description

I'm having trouble importing a Mercurial repository into Diffusion. The import is stuck at 99.98%.

I found a comment by @epriestley suggesting this command:

$ ./scripts/repository/reparse.php --message --change --owners --herald --force rPa57db1b3fce9eedbdcd06bf8087d4cda9b6fe5a2 --trace

but it eventually failed with

>>> [690] <query> SELECT path, id FROM `repository_path` WHERE pathHash IN ('703006a84176af41dc35745c4a58fd2a', '481c149ad01092c9665755b2dbf53c0b', '989e18b7d2c3d0e6bf3c0a9f9d18b9e0', '62fda553150ae50d6fc1c44652c46c1d', 'd974a3f5410856eee9e65a56e5d9e2b1', '8b27dfcb605e927dd3704478e9b5bbdf', '08bf5dc95c1f4c5bc31f5109e0b97e39', '7d09242eb742703c4cb2e4d55a385f39')
<<< [690] <query> 342 us
[2014-10-15 14:53:35] EXCEPTION: (AphrontCharacterSetQueryException) Attempting to construct a query containing characters outside of the Unicode Basic Multilingual Plane. MySQL will silently truncate this data if it is inserted into a `utf8` column. Use the `%B` conversion to escape binary strings data. at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:346]
  #0 AphrontBaseMySQLDatabaseConnection::validateUTF8String(string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:10]
  #1 AphrontMySQLiDatabaseConnection::escapeUTF8String(string) called at [<phutil>/src/xsprintf/qsprintf.php:170]
  #2 xsprintf_query(AphrontMySQLiDatabaseConnection, string, integer, string, integer) called at [<phutil>/src/xsprintf/xsprintf.php:63]
  #3 xsprintf(string, AphrontMySQLiDatabaseConnection, array) called at [<phutil>/src/xsprintf/qsprintf.php:64]
  #4 qsprintf(AphrontMySQLiDatabaseConnection, string, string, string) called at [<phabricator>/src/applications/repository/worker/commitchangeparser/PhabricatorRepositoryCommitChangeParserWorker.php:58]
  #5 PhabricatorRepositoryCommitChangeParserWorker::lookupOrCreatePaths(array) called at [<phabricator>/src/applications/repository/worker/commitchangeparser/PhabricatorRepositoryMercurialCommitChangeParserWorker.php:255]
  #6 PhabricatorRepositoryMercurialCommitChangeParserWorker::parseCommitChanges(PhabricatorRepository, PhabricatorRepositoryCommit) called at [<phabricator>/src/applications/repository/worker/commitchangeparser/PhabricatorRepositoryCommitChangeParserWorker.php:30]
  #7 PhabricatorRepositoryCommitChangeParserWorker::parseCommit(PhabricatorRepository, PhabricatorRepositoryCommit) called at [<phabricator>/src/applications/repository/worker/PhabricatorRepositoryCommitParserWorker.php:44]
  #8 PhabricatorRepositoryCommitParserWorker::doWork() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:87]
  #9 PhabricatorWorker::executeTask() called at [<phabricator>/scripts/repository/reparse.php:281]

The logs in /var/tmp/phd/log/daemons.log are filled with the same exception.

If this is a problem with the repository, it would be great if more information were logged.
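The exception message covers two distinct conditions: bytes that are not UTF-8 at all, and valid UTF-8 containing characters outside the Basic Multilingual Plane. A minimal Python sketch for telling them apart on a given path is below. It is not Phabricator's own check; `classify_path` and the sample inputs are hypothetical names invented for illustration.

```python
def classify_path(raw: bytes) -> str:
    """Classify raw path bytes into the cases the exception message mentions."""
    try:
        text = raw.decode("utf-8")  # strict decoding: raises on invalid bytes
    except UnicodeDecodeError:
        return "not valid UTF-8"
    # Code points above U+FFFF lie outside the Basic Multilingual Plane and
    # need 4 bytes in UTF-8, which a MySQL `utf8` (3-byte) column cannot hold.
    if any(ord(ch) > 0xFFFF for ch in text):
        return "valid UTF-8, but contains characters outside the BMP"
    return "valid UTF-8, BMP only"


if __name__ == "__main__":
    # Illustrative inputs only, not taken from the repository in question.
    samples = [
        b"docs/readme.txt",
        "docs/\U0001F600.txt".encode("utf-8"),
        b"docs/\xff\xfe.txt",
    ]
    for raw in samples:
        print(raw, "->", classify_path(raw))
```

Running something like this over the paths touched by a failing commit would show which condition is actually being hit.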

Event Timeline

danbst raised the priority of this task to Needs Triage.
danbst updated the task description.
danbst added a project: Diffusion.
danbst added a subscriber: danbst.

It seems I've found the problem. There are filenames in CP1251 encoding, but Phabricator tries to parse them as UTF-8:

/distribution/Amazon/performance tests/Тестирование серверов.xlsm
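To illustrate the mismatch, here is a small Python sketch using the filename above. It assumes the name really is stored as CP1251 bytes, as diagnosed in this comment; `is_valid_utf8` is a hypothetical helper, not part of Phabricator.

```python
# Filename from the comment above. CP1251 as the stored encoding is the
# commenter's diagnosis; the rest of this snippet is illustrative.
NAME = "Тестирование серверов.xlsm"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw_cp1251 = NAME.encode("cp1251")  # one byte per Cyrillic letter
raw_utf8 = NAME.encode("utf-8")     # two bytes per Cyrillic letter

if __name__ == "__main__":
    print("CP1251 bytes valid as UTF-8:", is_valid_utf8(raw_cp1251))
    print("UTF-8 bytes valid as UTF-8: ", is_valid_utf8(raw_utf8))
    # Decoding with the right codec recovers the original name.
    print("Round trip via cp1251 ok:   ", raw_cp1251.decode("cp1251") == NAME)
```

The CP1251 bytes are a legitimate filename on the system that created them. They only become a problem when something downstream insists on UTF-8.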

fucking Windows


I've got the very same issue here.

Is there a way to specify the encoding?

What can I do to fix the commits listed by ./bin/repository importing E?

This looks similar to T6433, which is fixed by T6350.
I tried upgrading, but no, it's still the same issue.

Actually, I don't have the same exception:

[2014-12-03 11:18:45] EXCEPTION: (AphrontCharacterSetQueryException) Attempting to construct a query using a non-utf8 string when utf8 is expected. Use the `%B` conversion to escape binary strings data. at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:331]
  #0 AphrontBaseMySQLDatabaseConnection::validateUTF8String(string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:10]
  #1 AphrontMySQLiDatabaseConnection::escapeUTF8String(string) called at [<phutil>/src/xsprintf/qsprintf.php:170]
  #2 xsprintf_query(AphrontMySQLiDatabaseConnection, string, integer, string, integer) called at [<phutil>/src/xsprintf/xsprintf.php:63]
  #3 xsprintf(string, AphrontMySQLiDatabaseConnection, array) called at [<phutil>/src/xsprintf/qsprintf.php:64]
  #4 qsprintf(AphrontMySQLiDatabaseConnection, string, string, string) called at [<phabricator>/src/applications/repository/worker/commitchangeparser/PhabricatorRepositoryCommitChangeParserWorker.php:58]
  #5 PhabricatorRepositoryCommitChangeParserWorker::lookupOrCreatePaths(array) called at [<phabricator>/src/applications/repository/worker/commitchangeparser/PhabricatorRepositoryMercurialCommitChangeParserWorker.php:255]
  #6 PhabricatorRepositoryMercurialCommitChangeParserWorker::parseCommitChanges(PhabricatorRepository, PhabricatorRepositoryCommit) called at [<phabricator>/src/applications/repository/worker/commitchangeparser/PhabricatorRepositoryCommitChangeParserWorker.php:30]
  #7 PhabricatorRepositoryCommitChangeParserWorker::parseCommit(PhabricatorRepository, PhabricatorRepositoryCommit) called at [<phabricator>/src/applications/repository/worker/PhabricatorRepositoryCommitParserWorker.php:44]
  #8 PhabricatorRepositoryCommitParserWorker::doWork() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:91]
  #9 PhabricatorWorker::executeTask() called at [<phabricator>/scripts/repository/reparse.php:281]

I'll probably create another issue.

Your error looks more like T6228 and T6433

Well, actually it is the same exception; the message was changed slightly about 26 days ago, see rPHU0135e57181a91e1f342338d62782754e0bd31e59
OK, so let's sum this up:

Some commits cannot be imported:

rd@ly-phabricator:~/phabricator$ ./bin/repository importing E
rEfa3b374825f1f99d53bad22a1d4c52834342db93 Change, Owners, Herald
rE85369cdf3eb47d88cb39d11079b9dbf8b925371d Change, Owners, Herald
rE1ffcd1eea76daf11ff5b25605f8bc630af646227 Change, Owners, Herald
rE65718876b5791a548e5a0c6f6c136d38616842c4 Change, Owners, Herald
rEb46a1d9527c67a7b64993775e46d92f1d0608f2c Change, Owners, Herald
rd@ly-phabricator:~/phabricator$

When reparsing one of them with this command line:

rd@ly-phabricator:~/phabricator$ ./scripts/repository/reparse.php --message --change --owners --herald rEb46a1d9527c67a7b64993775e46d92f1d0608f2c

I got the exception quoted in my previous message:

[2014-12-03 11:18:45] EXCEPTION: (AphrontCharacterSetQueryException) Attempting to construct a query using a non-utf8 string when utf8 is expected. Use the `%B` conversion to escape binary strings data. at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:331]

I added some echo/print_r calls to troubleshoot what was going on. Here are my findings:

  • We're currently inside lookupOrCreatePaths from PhabricatorRepositoryCommitChangeParserWorker.php
  • There are some $missing_paths

However, as with @danbst, at least one file has an invalid character in its filename. Mine is:

Facture-Avoir N° 094169.PDF

Which displays fine in Windows Explorer.
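For anyone trying to locate such filenames, a rough Python sketch follows. It takes raw path bytes and reports those that are not valid UTF-8, together with a best-guess decoding. The `cp1252` guess is an assumption (the actual encoding of this file was not stated), and `report_non_utf8` is a hypothetical helper, not a Phabricator or Mercurial API.

```python
from typing import Iterable, List, Tuple


def report_non_utf8(paths: Iterable[bytes],
                    guess: str = "cp1252") -> List[Tuple[bytes, str]]:
    """Return (raw bytes, guessed text) for every path that is not valid UTF-8."""
    bad = []
    for raw in paths:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            # The guessed codec may be wrong; undecodable bytes are replaced
            # with U+FFFD rather than raising a second error.
            bad.append((raw, raw.decode(guess, errors="replace")))
    return bad


if __name__ == "__main__":
    # The middle entry assumes the degree sign was stored as the single
    # byte 0xB0, as a Windows-1252 system would write it.
    paths = [
        b"ok.txt",
        b"Facture-Avoir N\xb0 094169.PDF",
        "Facture-Avoir N° 094169.PDF".encode("utf-8"),
    ]
    for raw, guessed in report_non_utf8(paths):
        print(raw, "->", guessed)
```

On Linux, the paths could be collected from a working copy with `os.walk(b".")`, which yields names as bytes when given a bytes argument.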

This comment was removed by PhoneixS.

@TiTi were you able to get around this issue?

I still have the issue, even after upgrading Phabricator.

So I did:

./bin/repository mark-imported <repository>

This flags the repository as imported, so it keeps working.
However, I still see the bad commits when running:

./bin/repository importing <repository>

But I don't care that much, because they are old commits.