Page MenuHomePhabricator

Arc is using pirate-speak and throwing an error due to unrecognized heading
Closed, DuplicatePublic

Description

I turned on the Pirate language in our phabricator installation, and I noticed that pretty recently when I make a Diff using arc diff that I get a "Spies:" header in the message template where others in my organization are getting "Subscribers:".

Screen Shot 2016-07-12 at 11.04.11 AM.png (173×200 px, 9 KB)

This is fine, except for when I try to submit the diff:

Commit message has errors:

  • Error parsing field "Reviewers": The objects you have listed include objects which do not exist (Spies:).

Version:

phabricator dc9283b85d713a9e8530ae3a3a2007e97da054a3 (Wed, Jun 29)
arcanist 4d4d16f25985f133501f20fdddd183e525f00341 (Tue, Jun 28)
phutil e9e79fd8d7f51c483063d810fdef9750edd27cec (Tue, Jun 28)

Arc version on my server:

arcanist 4d4d16f25985f133501f20fdddd183e525f00341 (28 Jun 2016)
libphutil dde2f74f2793f0216f0d76618ed335a9c802cfec (1 Jul 2016)

Event Timeline

I don't think this is about "magic words" - those are things like Fixes T123. This is about "regular" l18n: Differential is producing a form it can't read.

That's correct, there are no magic words in my diff message. The template I'm given when I create a new diff contains text (I assume because of the Pirate language setting I've chosen) that causes Arcanist/Differential to choke when the message is committed, as "Spies" isn't a valid heading in the Diff message.

We tend to combine similar issues into one so they are resolved at the same time by the same diff. The correct solution I believe here is the same as "magic words". That we need to maintain a list of words that are localized that trigger actions. "Spies" should get picked up and parsed as "Subscribers", just like "Fixin'" should be picked up and parse as "Fixes".

"Spies" should get picked up and parsed as "Subscribers", just like "Fixin'" should be picked up and parse as "Fixes".

Right; I forgot they were both Fields now...

This is sort of very vaguely related to T5586, although I don't know if the fix will end up being the same or not.

The fields we read are not translated:

public function getCommitMessageLabels() {
  return array(
    'CC',
    'CCs',
    'Subscriber',
    'Subscribers',
  );
}

One of the reasons this isn't translated is the same as the reason magic words in T5586 aren't translated: the number of values may differ between languages. For example, a language with several plural forms may want to provide a list like this instead:

Subscriber-one, Subscribers-few, Subscribers-many

If we just put pht() on those strings, languages would be forced to provide exactly as many translations as English does.

The field label we write is translated:

public function getFieldName() {
  return pht('Subscribers');
}

Some other general problems:

  • Maybe commit messages should be generated in a single language in a repository? It seems undesirable to mix Emoji, Spanish, Russian, Pirate, and ALL CAPS in the same repository, for example.
  • When reading a commit message, we have no explicit cues about which language it is written in.
  • It's possible that the same word (say, "Abracadabreur") might identify two different fields in two different languages (say, it means "Subscriber" in Russian and "Reviewer" in German). In this case, we could not parse things unambiguously without knowing the language.
  • If we parse fields in any language, it makes the problems in T11085 worse (we'll get a lot more false positives).

It might be reasonable to un-pht() the field names in the short term until this gets solved for real, although it would be nice to see a use case with a legitimate translation rather than Pirate.