Page MenuHomePhabricator

Import-time diff generation fails for particular commits in cluster mode
Closed, ResolvedPublic

Description

A cluster instance reported a commit that is not importing. This is the immediate issue in the daemon log:

bin/phd log
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000] [2016-08-24 10:58:30] EXCEPTION: (PhutilProxyException) Error while executing Task ID 161495. {>} (PhutilProxyException) Host returned HTTP/200, but invalid JSON data in response to a Conduit method call. {>} (PhutilJSONParserException) Parse error on line 1 at column 0: Expected one of: 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[' at [<phutil>/src/parser/PhutilJSONParser.php:32]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000] arcanist(head=stable, ref.master=b32149495b85, ref.stable=e78618ce225d), libcore(), phabricator(head=stable, ref.master=62e129d7a6d1, ref.stable=b88348a55e35), phutil(head=stable, ref.master=e8eaafe04d64, ref.stable=b6f4e866fdb2), services(head=stable, ref.master=ee889d4725ab, ref.stable=7065cb4fe5fd)
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #0 <#3> PhutilJSONParser::parse(string) called at [<phutil>/src/utils/utils.php:1154]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #1 <#3> phutil_json_decode(string) called at [<phutil>/src/conduit/ConduitFuture.php:48]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #2 <#2> ConduitFuture::didReceiveResult(array) called at [<phutil>/src/future/FutureProxy.php:58]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #3 <#2> FutureProxy::getResult() called at [<phutil>/src/future/FutureProxy.php:35]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #4 <#2> FutureProxy::resolve() called at [<phutil>/src/conduit/ConduitClient.php:64]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #5 <#2> ConduitClient::callMethodSynchronous(string, array) called at [<phabricator>/src/applications/diffusion/query/DiffusionQuery.php:83]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #6 <#2> DiffusionQuery::callConduitWithDiffusionRequest(PhabricatorUser, DiffusionGitRequest, string, array) called at [<phabricator>/src/applications/differential/engine/DifferentialDiffExtractionEngine.php:45]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #7 <#2> DifferentialDiffExtractionEngine::newDiffFromCommit(PhabricatorRepositoryCommit) called at [<phabricator>/src/applications/differential/engine/DifferentialDiffExtractionEngine.php:212]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #8 <#2> DifferentialDiffExtractionEngine::updateRevisionWithCommit(DifferentialRevision, PhabricatorRepositoryCommit, array, PhabricatorDaemonContentSource) called at [<phabricator>/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php:243]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #9 <#2> PhabricatorRepositoryCommitMessageParserWorker::updateCommitData(DiffusionCommitRef) called at [<phabricator>/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php:42]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #10 <#2> PhabricatorRepositoryCommitMessageParserWorker::parseCommit(PhabricatorRepository, PhabricatorRepositoryCommit) called at [<phabricator>/src/applications/repository/worker/PhabricatorRepositoryCommitParserWorker.php:51]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #11 <#2> PhabricatorRepositoryCommitParserWorker::doWork() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:122]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #12 <#2> PhabricatorWorker::executeTask() called at [<phabricator>/src/infrastructure/daemon/workers/storage/PhabricatorWorkerActiveTask.php:171]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #13 <#2> PhabricatorWorkerActiveTask::executeTask() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorTaskmasterDaemon.php:22]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #14 PhabricatorTaskmasterDaemon::run() called at [<phutil>/src/daemon/PhutilDaemon.php:184]
Daemon 2010 STDE [Wed, 24 Aug 2016 10:58:31 +0000]   #15 PhutilDaemon::execute() called at [<phutil>/scripts/daemon/exec/exec_daemon.php:127]

This is probably because of non-UTF8 data in the diff, although I'll confirm this.

If so, the best fix may be to reorient this API to be file-based, as described in T10423. This will also fix behavior for very large diffs. The previous similar changes in connection with T10186 to improve handling of large individual files seem to have worked well.

This is also on the pathway through T11482.

There may be further issues here in the vein of T4045 / T5955, but those are more involved to attack. Switching to a file-based API would sidestep them.

Event Timeline

The next step I want to take is figuring out what the actual problematic response body looks like, but that's slightly tricky right now because there's no way to make Conduit calls from the CLI.

It's also sort of tricky to write a bin/conduit call since there are a few different ways it might reasonably want to route the request (in-process via ConduitCall, same-host with asymmetric keys, route-into-cluster which isn't completely trivial, use a token but there's a lot of overlap with arc call-conduit at that point). I think there's no generic way to send an asymmetric key request "into the cluster" since I believe we assume the client routes them? I may be misremembering that.

This isn't too important since I think I only need "same host", but I didn't see an obvious way to navigate that and produce a useful bin/conduit call.

Here's the little hacky script I ended up with:

<?php

require_once 'scripts/__init_script__.php';

$method = idx($argv, 1);
if (!$method) {
  throw new Exception(pht('Specify method to call!'));
}

echo tsprintf(
  "%s\n",
  pht('Reading method payload from stdin...'));

$payload = file_get_contents('php://stdin');
$payload = phutil_json_decode($payload);

$domain = id(new PhutilURI(PhabricatorEnv::getURI('/')))->getDomain();

$public_path = AlmanacKeys::getKeyPath('device.pub');
$public_key = Filesystem::readFile($public_path);
$private_path = AlmanacKeys::getKeyPath('device.key');
$private_key = Filesystem::readFile($private_path);
$private_key = new PhutilOpaqueEnvelope($private_key);


$client = id(new ConduitClient('http://repo005.phacility.net/'))
  ->setHost($domain)
  ->setSigningKeys($public_key, $private_key);

$result = $client->callMethodSynchronous($method, $payload);

echo id(new PhutilJSON())->encodeFormatted($result);

Not too surprisingly, the response is empty. AphrontResponse->encodeJSONForHTTPResponse() uses json_encode() and does not detect UTF8-related encoding failures, so the chance that this is a UTF8 issue is pretty high. We should also probably fix that so fail earlier and more explicitly (and don't emit HTTP 200).

I extracted the response body prior to encoding as a serialized()'d object so I can narrow down what's causing the encoding to fail and it looks reasonable, looking like it's almost certainly a UTF8 issue now.

The particular issue is that that the change removed a file containing the bytes ... 0xE5 0x76 .... The 0xE5 is probably an å in Latin 1.

This is pretty much what I expected, and I think T10423 is probably the right fix for this. This is a simplified reproduction case:

https://github.com/epriestley/poems/commit/4bdefe8d8f950094f172c07ccf0d4d3c56f03fd6

This went out a while ago and I confirmed the fix in production.