Page MenuHomePhabricator

Improve use of keys when rebuilding repository summary table
AbandonedPublic

Authored by epriestley on Sep 4 2018, 9:45 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Nov 27, 4:29 PM
Unknown Object (File)
Fri, Nov 22, 4:54 PM
Unknown Object (File)
Fri, Nov 22, 7:35 AM
Unknown Object (File)
Oct 29 2024, 4:29 AM
Unknown Object (File)
Oct 13 2024, 10:02 PM
Unknown Object (File)
Oct 13 2024, 9:41 PM
Unknown Object (File)
Oct 9 2024, 10:03 AM
Unknown Object (File)
Oct 2 2024, 6:53 AM
Subscribers
Restricted Owners Package

Details

Summary

See PHI842. Ref T13195. If we discover commits have been removed from the repository, we rebuild the entire summary table.

This currently uses a query with two MAX(...) fields on different columns, and an importStatus & flag which can't be satisfied with a key.

Instead:

  • Denormalize unreachability into a separate, keyable column.
  • Add appropriate keys.
  • Execute the query as two halves with UNION ALL.

Also, remove an ancient rebuild_summaries.php script which has a copy of this logic and isn't referenced anywhere. This should become bin/repository <something> if we have some need for it. It currently gets the wrong result (it isn't aware of unreachable commits).

Test Plan
  • Changed the discovery engine to force the summary to always rebuild.
  • Ran bin/repository update ... --trace to see the rebuild happen.
  • Verified rebuild comes out properly.
  • Used EXPLAIN on the new query and saw a better (entirely key-based) query plan.

Diff Detail

Repository
rP Phabricator
Branch
unreach1
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 20740
Build 28199: Run Core Tests
Build 28198: arc lint + arc unit

Event Timeline

Owners added a subscriber: Restricted Owners Package.Sep 4 2018, 9:45 PM

I'm not thrilled about this change since it requires us to do a bunch of ALTER TABLE to commit to improve a very edge-case query which only takes ~1 second to run even in the bad case. The outcome is probably better, sort of? And rebuild_summaries.php should get nuked no matter what. But I'm not sure I'll actually land this.

  • Actually remove the "rebuild_summaries.php" script.
  • (But, I'm going to just do that part and not the rest of it for now.)

At least for now, let's just do D19643 (remove the outdated script). It sounds like the rest of this runs very rarely and probably isn't worth all these schema changes to optimize. We can keep an eye out for other queries which would benefit, or other motivations.