Add an optional pht() callback for collecting string frequencies
ClosedPublic

Authored by epriestley on Nov 8 2016, 2:49 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T5267. For translations, it's useful to have frequency data so the most frequently used strings can be prioritized for translation -- common UI strings are much more important to translate than big blocks of setup/configuration text or obscure error messages.

Add an optional callback so we can collect which strings are actually translated at runtime and build a dataset for prioritizing strings for translation.
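A minimal sketch of what an optional "will translate" hook could look like. SketchTranslator is a hypothetical stand-in, not the actual PhutilTranslator code; only the setter name setWillTranslateCallback() comes from this revision. When no callback is set, translate() does no extra work.

```php
<?php

// Hypothetical stand-in for a translator; NOT the real PhutilTranslator.
final class SketchTranslator {
  private $translations = array();
  private $willTranslateCallback;

  public function setTranslations(array $translations) {
    $this->translations = $translations;
    return $this;
  }

  // Optional hook: leave unset to skip frequency collection entirely.
  public function setWillTranslateCallback($callback) {
    $this->willTranslateCallback = $callback;
    return $this;
  }

  public function translate($text) {
    // Report the untranslated string before looking up a translation.
    if ($this->willTranslateCallback !== null) {
      call_user_func($this->willTranslateCallback, $text);
    }
    if (isset($this->translations[$text])) {
      return $this->translations[$text];
    }
    return $text;
  }
}

// Count how often each source string is requested.
$frequency = array();
$translator = new SketchTranslator();
$translator
  ->setTranslations(array('Hello' => 'Hola'))
  ->setWillTranslateCallback(function ($text) use (&$frequency) {
    if (empty($frequency[$text])) {
      $frequency[$text] = 0;
    }
    $frequency[$text]++;
  });

$translator->translate('Hello');
$translator->translate('Hello');
$translator->translate('Goodbye');
```

The callback receives the source string rather than the translated one, so counts are comparable across locales.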

Test Plan

Applied this simple hack to phabricator/:

diff --git a/src/infrastructure/env/PhabricatorEnv.php b/src/infrastructure/env/PhabricatorEnv.php
index 31fd116..9e08a21 100644
--- a/src/infrastructure/env/PhabricatorEnv.php
+++ b/src/infrastructure/env/PhabricatorEnv.php
@@ -59,6 +59,8 @@ final class PhabricatorEnv extends Phobject {
   private static $readOnly;
   private static $readOnlyReason;
 
+  public static $frequency = array();
+
   const READONLY_CONFIG = 'config';
   const READONLY_UNREACHABLE = 'unreachable';
   const READONLY_SEVERED = 'severed';
@@ -166,7 +168,8 @@ final class PhabricatorEnv extends Phobject {
 
       PhutilTranslator::getInstance()
         ->setLocale($locale)
-        ->setTranslations($override + $translations);
+        ->setTranslations($override + $translations)
+        ->setWillTranslateCallback('PhabricatorEnv::willTranslate');
 
       self::$localeCode = $locale_code;
     } catch (Exception $ex) {
@@ -174,6 +177,13 @@ final class PhabricatorEnv extends Phobject {
     }
   }
 
+  public static function willTranslate($text) {
+    if (empty(self::$frequency[$text])) {
+      self::$frequency[$text] = 0;
+    }
+    self::$frequency[$text]++;
+  }
+
   private static function buildConfigurationSourceStack($config_optional) {
     self::dropConfigCache();
 
diff --git a/webroot/index.php b/webroot/index.php
index 59e5b71..d57dd35 100644
--- a/webroot/index.php
+++ b/webroot/index.php
@@ -15,6 +15,14 @@ try {
   try {
     PhabricatorStartup::beginStartupPhase('run');
     AphrontApplicationConfiguration::runHTTPRequest($sink);
+
+    $freq = PhabricatorEnv::$frequency;
+    asort($freq);
+    Filesystem::writeFile(
+      '/tmp/pht_frequency.json',
+      id(new PhutilJSON())
+        ->encodeFormatted($freq));
+
   } catch (Exception $ex) {
     try {
       $response = new AphrontUnhandledExceptionResponse();

Got a datafile like this:

{
  "Piece of Eight": 1,
  "Haypence": 1,
  "Disable DarkConsole": 1,
  "Yellow Medal": 1,
  "Doubloon": 1,
  "Mountain of Wealth": 1,
  "Baby Tequila": 1,
  "Evil Spooky Haunted Tree": 1,
  "Pterodactyl": 1,
  ...
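The hack above appears to write only the current request's counts, so building a dataset means combining counts from many requests. A rough sketch of that merge step, assuming each request yields a string => count map; sketch_merge_frequencies() is a hypothetical helper, not part of libphutil or phabricator/.

```php
<?php

// Hypothetical helper: fold one request's counts into a running total.
function sketch_merge_frequencies(array $total, array $request) {
  foreach ($request as $text => $count) {
    if (!isset($total[$text])) {
      $total[$text] = 0;
    }
    $total[$text] += $count;
  }
  return $total;
}

$total = array();
$total = sketch_merge_frequencies(
  $total,
  array('Doubloon' => 1, 'Disable DarkConsole' => 1));
$total = sketch_merge_frequencies(
  $total,
  array('Disable DarkConsole' => 2));

// Most frequently used strings first, so they can be translated first.
arsort($total);
```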

Diff Detail

Repository
rPHU libphutil
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

epriestley retitled this revision to Add an optional pht() callback for collecting string frequencies.
epriestley updated this object.
epriestley edited the test plan for this revision.
epriestley added a reviewer: chad.

We sampled this at FB, like Multimeter, to see which strings showed up the most in real usage.

Yeah, I'm planning to do something similar here, although I'll probably just run it at 100% on this install for a week or something to start with. I don't think we'll get much better/different data by doing something super complicated with the Phacility cluster, although we could do that eventually.

That is, I'll sample in phabricator/ on a per-request basis like this:

if (mt_rand() % 12345 == 0) {
  $translator->setWillTranslateCallback(...);
}

If we sample in libphutil/, we'd have a bunch of overhead on every call to pht().
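A sketch of the per-request decision, under assumptions: SketchSampledTranslator and both sketch_* functions are hypothetical names, and mt_rand(1, $one_in) === 1 is swapped in as a roughly equivalent form of the modulo check. The point is that the random draw happens once per request, so unsampled requests never install the callback and pay nothing per pht() call.

```php
<?php

// Hypothetical stand-in; only the setter name matches this revision.
final class SketchSampledTranslator {
  private $callback;

  public function setWillTranslateCallback($callback) {
    $this->callback = $callback;
    return $this;
  }

  public function hasWillTranslateCallback() {
    return $this->callback !== null;
  }
}

// One random draw per request; $one_in = 1 samples every request.
function sketch_should_sample($one_in) {
  return mt_rand(1, $one_in) === 1;
}

// Install the callback only on sampled requests.
function sketch_configure_sampling($translator, $one_in, $callback) {
  if (sketch_should_sample($one_in)) {
    $translator->setWillTranslateCallback($callback);
    return true;
  }
  return false;
}

$translator = new SketchSampledTranslator();
$sampled = sketch_configure_sampling(
  $translator,
  12345,
  function ($text) {
    // Counting would happen here, as in willTranslate() from the test plan.
  });
```

Passing 1 as $one_in corresponds to running at 100%.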

chad edited edge metadata.
This revision is now accepted and ready to land.
Nov 8 2016, 3:18 PM
This revision was automatically updated to reflect the committed changes.