Page MenuHomePhabricator

Add Badges to UserCache
ClosedPublic

Authored by chad on Mar 16 2017, 3:18 AM.
Tags
None
Referenced Files
F14082716: D17503.diff
Sat, Nov 23, 12:58 AM
Unknown Object (File)
Wed, Nov 20, 3:28 AM
Unknown Object (File)
Tue, Nov 19, 4:20 PM
Unknown Object (File)
Sat, Nov 16, 9:14 PM
Unknown Object (File)
Sat, Nov 16, 9:53 AM
Unknown Object (File)
Thu, Nov 14, 7:13 AM
Unknown Object (File)
Wed, Nov 13, 12:08 AM
Unknown Object (File)
Tue, Nov 12, 6:08 PM
Subscribers

Details

Summary

Ref T12270. Builds out a BadgeCache for PhabricatorUser, primarily for Timeline, potentially feed? This should still work if we later let people pick which two, just switch query in BadgeCache.

Test Plan

Give out badges, test timeline for displaying badges from handles and without queries. Revoke a badge, see cache change.

Diff Detail

Repository
rP Phabricator
Branch
badges-cache (branched from master)
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 16011
Build 21230: Run Core Tests
Build 21229: arc lint + arc unit

Event Timeline

  • try to clear cache on new/revoke ... doesn't work?

This is written correctly, minus one cache issue.

I'm slightly unsure about the performance tradeoffs. If we wanted to put badges on Hovecards again we can already use AwardQuery directly and get reasonable performance (although Hovercards support bulk generation, I believe we never actually do this today, except in the test UI), and I can't immediately come up with any other cases where we'd want to use them. Did you have potential ideas offhand beyond Hovercards? Basically, I'm worried we may be slowing down every page a little bit (even if Badges is uninstalled) since we load user handles pretty much everywhere, but only making timeline rendering better, and if there's only one "real" use case we'd be better off with a more narrow fix that doesn't slow down Handles everywhere else.

But we can also just do this for now and then revisit it after T9006 and see if it still makes sense, since it should be pretty easy to throw this cache away if it isn't seeing use.


I think there's a cache issue if you do this:

  • Award badge "Niec Hat" to Alice because she has a nice hat.
  • View a comment by Alice (so the cache fills), mouse over her badge, notice that you misspelled it.
  • Fix the badge spelling and pick a different icon.
  • View the comment again. Badge is still misspelled and uses the old icon, because the spelling and icon are cached.

You can fix this by storing only PHIDs (which never change) or by calling PhabricatorUserCache::clearCacheForAllUsers() after any edit to any badge. The latter is simpler, but gives us more cases where pages are really slow because they're doing a huge cache fill on hundreds of handles.

src/applications/people/cache/PhabricatorUserBadgesCacheType.php
55

Prefer phutil_json_encode(), which throws if the input is somehow invalid instead of emitting a warning and returning false.

This revision now requires changes to proceed.Mar 16 2017, 5:55 PM

The only other solution I could think of for Timeline was reducing the query to a tighter PhabricatorUser query instead of an overbroad BadgeAward query, if we don't want this on handles.

At least, querying a handful of Users seems cheaper than unlimited badge awards.

Hrrm, let me check if a huge ... UNION ALL ... query can work for now -- it's dumb but might be fine. And badges are always visible to everyone so we don't have to worry about weird policy stuff.

Something like this seems to "work":

diff --git a/src/view/phui/PHUITimelineView.php b/src/view/phui/PHUITimelineView.php
index 57b07a94bf..2c263e2fe9 100644
--- a/src/view/phui/PHUITimelineView.php
+++ b/src/view/phui/PHUITimelineView.php
@@ -243,30 +243,69 @@ final class PHUITimelineView extends AphrontView {
       return;
     }
 
+    // This is an unusual query: we want to select a set number of awards for
+    // each of a potentially large number of users. Query classes don't
+    // normally support this kind of query. Get the results in a single query
+    // by merging a lot of individual queries using UNION ALL.
+
+    $award_table = new PhabricatorBadgesAward();
+    $badge_table = new PhabricatorBadgesBadge();
+    $conn = $award_table->establishConnection('r');
+
+    $limit = 2;
+
+    $parts = array();
+    $idx = 0;
+    foreach ($user_phids as $user_phid) {
+      $parts[] = qsprintf(
+        $conn,
+        'SELECT * FROM (
+          SELECT award.recipientPHID, award.badgePHID
+            FROM %T award JOIN %T badge
+            ON award.badgePHID = badge.phid
+            WHERE award.recipientPHID = %s AND badge.status IN (%Ls)
+            ORDER BY award.id DESC LIMIT %d) awards_%d',
+        $award_table->getTableName(),
+        $badge_table->getTableName(),
+        $user_phid,
+        array(
+          PhabricatorBadgesBadge::STATUS_ACTIVE,
+        ),
+        $limit,
+        $idx++);
+    }
 
-    $awards = id(new PhabricatorBadgesAwardQuery())
-      ->setViewer($this->getViewer())
-      ->withRecipientPHIDs($user_phids)
-      ->withBadgeStatuses(array(PhabricatorBadgesBadge::STATUS_ACTIVE))
-      ->execute();
+    $rows = queryfx(
+      $conn,
+      '%Q',
+      implode(' UNION ALL ', $parts));
+
+    if (!$rows) {
+      return;
+    }
 
-    $awards = mgroup($awards, 'getRecipientPHID');
+    $badge_phids = array();
+    $badge_map = array();
+    foreach ($rows as $row) {
+      $recipient_phid = $row['recipientPHID'];
+      $badge_phid = $row['badgePHID'];
 
-    foreach ($events as $event) {
+      $badge_phids[$badge_phid] = $badge_phid;
+      $badge_map[$recipient_phid][] = $badge_phid;
+    }
 
-      $author_awards = idx($awards, $event->getAuthorPHID(), array());
+    $badges = id(new PhabricatorBadgesQuery())
+      ->setViewer($viewer)
+      ->withPHIDs($badge_phids)
+      ->execute();
+    $badges = mpull($badges, null, 'getPHID');
 
-      $badges = array();
-      foreach ($author_awards as $award) {
-        $badge = $award->getBadge();
-        $badges[$award->getBadgePHID()] = $badge;
-      }
+    foreach ($events as $event) {
 
-      // TODO: Pick the "best" badges in some smart way. For now, just pick
-      // the first two.
-      $badges = array_slice($badges, 0, 2);
+      $author_awards = idx($badge_map, $event->getAuthorPHID(), array());
+      $author_badges = array_select_keys($badges, $author_awards);
 
-      foreach ($badges as $badge) {
+      foreach ($author_badges as $badge) {
         $badge_view = id(new PHUIBadgeMiniView())
           ->setIcon($badge->getIcon())
           ->setQuality($badge->getQuality())

...which produces queries which look like this:

SELECT * FROM (
  SELECT award.recipientPHID, award.badgePHID
    FROM `badges_award` award
    JOIN `badges_badge` badge ON award.badgePHID = badge.phid
    WHERE award.recipientPHID = 'PHID-USER-cvfydnwadpdj7vdon36z' AND badge.status IN ('open')
    ORDER BY award.id DESC LIMIT 2) awards_0
UNION ALL
SELECT * FROM (
  SELECT award.recipientPHID, award.badgePHID
    FROM `badges_award` award
    JOIN `badges_badge` badge ON award.badgePHID = badge.phid
    WHERE award.recipientPHID = 'PHID-USER-6otqc3truc7u43cy2i34' AND badge.status IN ('open')
    ORDER BY award.id DESC LIMIT 2) awards_1

However, I'm pretty sure that will hit the "maximum number of joins in a single query" limit at 32 or 64 users, and it's gross.

The way to do this with T9006 would be to create a new table, like PhabricatorBadgesDisplay, which would have columns like this:

  • recipientPHID
  • badgePHID
  • awardPHID
  • order

Then write to that table when a badge is awarded, revoked, disabled, enabled, or (in the future) a user reorders their badges. We only write the maximum number of rows that we'd ever use for display, and number them 1, 2, 3, 4, 5 in order.

Then the timeline query is:

  • SELECT * FROM badge_display WHERE recipientPHID IN (%Ls) AND order <= 2

If we want to show 5 on hovercards, we:

  • SELECT * FROM badge_display WHERE recipientPHID IN (%Ls) AND order <= 5

The logic to update that table -- especially for archiving/enabling a badge -- would be kind of a pain, but it's OK for it to be a bit messier/slower since it won't be happening all over the place, only when edits are made.

I'm also basically OK with just unbeta'ing without fixing this and dealing with it later -- it will take a while for any install to accumulate enough awards that this is an issue, and I think T9006 is a reasonable pathway forward when it does become one.

You don't want to use UserCache at all?

I'm okay with either approach.

The code in HEAD will eventually hit a scalability issue, but it's easy to understand and measure and well-contained. The user cache will scale better, but it's more complicated, and harder to measure and understand (the costs are spread out across every page, and some pages pay heavy costs to do big cache fills while others don't). If we had T9006, I'd prefer a dedicated table to the user cache, since I think that's the best of both worlds (scalable and self-contained).

If we do the user cache, we should fix the name/icon caching issue so the behavior of the cache is correct, though.

So up to you if you want to move forward and fix the cache behavior, or just punt this until T9006 happens. Or some version of that "UNION ALL" mess would probably work, but I'd basically rather just implement T9006 now than bring that upstream since it's so complicated and gross.

I can walk it off the Handle, but leave it on PhabricatorUser as a cache. This will let us not need weird award queries and we can swap in the new table when that's built.

or is having it on PhabricatorUser also a performance hit globally

Oh, that's a good compromise. UserCaches merely existing doesn't impact performance, so using UserCache but not Handles addresses my concerns.

maybe you should be CTO with that kind of thinking

chad edited edge metadata.
  • try to clear cache on new/revoke ... doesn't work?
  • Revert to just user cache, no handles

applyFinalEffects still doesn't seem to kick in here, should it be in the editor itself?

I guess it should be moved up anyways, in case of name change, archiving, etc.

  • fix cache resetting on badge transactions
chad edited the summary of this revision. (Show Details)

I think this is pretty close.

src/applications/badges/editor/PhabricatorBadgesEditor.php
127–141 ↗(On Diff #42105)

This block overwrites $user_phids for each transaction, but should add them to a list instead.

It's possible to apply transactions of both types at the same time (for example, in Conduit, apply both an "award" transaction and a "name" transaction in one request, or "award" one user and "revoke" another at the same time).

If you do, whichever one applies later wins, and the $user_phids for the other transaction get overwritten. If you award to "A" and revoke from "B" at the same time, only "A" or "B" will get their cache cleared.

143–147 ↗(On Diff #42105)

Prefer PhabricatorUserCache::clearCaches($key, $user_phids) which can execute in one query instead of one query per affected user.

src/view/phui/PHUITimelineView.php
255

It's possible that $users[$user_phid] will not exist. For example:

  • Write a comment as user @alice.
  • Destroy @alice with bin/remove destroy @alice.
  • View the object that @alice commented on.

We should just skip the user in this case, not fatal.

This revision now requires changes to proceed.Mar 17 2017, 11:38 AM
chad edited edge metadata.
  • fix arrays?

I am bad with arrays, so feel free to point out the right way to merge two and de-dupe keys.

I spent like 30 minutes trying to figure out why cache was failing and it was from me not using clearCaches instead of clearCache........ ....... :(

This gets award + name right but I think not award + revoke. I'd probably write it like this:

$user_phids = array();
$clear_everything = false;

foreach ($xactions as $xaction) {
  switch ($xaction->getTransactionType()) {
    case PhabricatorBadgesBadgeAwardTransaction::TRANSACTIONTYPE:
    case PhabricatorBadgesBadgeRevokeTransaction::TRANSACTIONTYPE:
      foreach ($xaction->getNewValue() as $user_phid) {
        $user_phids[] = $user_phid;
      }
      break;
    default:
      $clear_everything = true;
      break;
  }
}

if ($clear_everything) {
  $awards = id(new PhabricatorBadgesAwardQuery())
    ->setViewer($this->getActor())
    ->withBadgePHIDs(array($badge_phid))
    ->execute();
  foreach ($awards as $award) {
    $user_phids[] = $award->getRecipientPHID();
  }
}

if ($user_phids) {
  PhabricatorUserCache::clearCaches(
    PhabricatorUserBadgesCacheType::KEY_BADGES,
    $user_phids);
}

While writing this, I moved the "default" case below the switch because I realized that we'll do multiple AwardQuery calls otherwise (for example, if you update name + icon + rarity, we load all awards three times if it's inside the loop).

In most cases, you don't need to go out of your way to make sure stuff is unique. Here, not bothering to make sure results are unique probably only ends up with a couple extra PHIDs in the eventual DELETE * FROM ... WHERE userPHID IN (%Ls) query, which isn't a big deal.

If you did want to make sure they're unique, you could replace the $user_phids[] = $some_user_phid; lines with $user_phids[$some_user_phid] = $some_user_phid;, or do $user_phids = array_unique($user_phids); at the end, but I usually don't bother with this since it's rarely important and makes the code a bit more complicated.

  • cut and paste evans code

haha okay

well this is obviously flawless now, then

This revision is now accepted and ready to land.Mar 17 2017, 5:37 PM

i still had to set badgePHID

This revision was automatically updated to reflect the committed changes.

You still have 20k awards in your test install?

I think I have 20K badges but only like 100 awards.

bin/lipsum generate badges.award

BTW the email not verified status thing is a little strange. What prompted the change? I assumed this dot was always for Calendar.

Prior to T12268, a grey dot meant either "disabled" (since D1429, in January 2012) or "unapproved" (since D7576, in November 2013).

I think the primary product role of the dot is to indicate "this user may not respond". All of the causes of the dot support this, with different levels of severity: they might not respond because their account has been disabled, or hasn't been approved yet, or they're away on vacation, or they're in a meeting. After the email verification stuff, they might not respond because they're not receiving email.

I think we see the grey dots far more than real installs do, since we have test accounts locally and a lot of inactive installs on this install.

Ah yeah, thanks for the color. I’ll think about a better UI treatment maybe.