
Improve performance of repository discovery in repositories with >65K refs
ClosedPublic

Authored by epriestley on Jan 26 2021, 7:10 PM.
Tags
None
Referenced Files
F13096210: D21521.diff
Thu, Apr 25, 3:18 PM
Subscribers
None

Details

Summary

Ref T13593. The commit cache in this Engine has a fixed maximum size (currently 65,535 entries).

If we execute discovery in a repository with more refs than this (e.g., 180K), only the first 65,535 refs get fast cached lookups; each remaining ref falls back to a slow single-get.

Instead, divide the refs into chunks no larger than the cache size, and perform an explicit cache fill before each chunk is processed.
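The chunked approach can be sketched roughly as follows. This is an illustrative Python model, not the code in the patch (Phabricator itself is PHP): `chunk_refs`, `discover`, `bulk_load`, and `process` are hypothetical names, and the cache is modeled as a plain dictionary rebuilt for each chunk.

```python
from typing import Callable, Dict, Iterator, List, Sequence

CACHE_SIZE = 65535  # cache limit quoted in the summary


def chunk_refs(refs: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of refs, each at most `size` entries long."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(refs), size):
        yield list(refs[start:start + size])


def discover(
    refs: Sequence[str],
    bulk_load: Callable[[List[str]], Dict[str, str]],  # hypothetical bulk query
    process: Callable[[str, str], None],               # hypothetical per-ref work
    cache_size: int = CACHE_SIZE,
) -> int:
    """Process refs chunk by chunk, filling the cache before each chunk.

    Returns the number of cache fills performed.
    """
    fills = 0
    for chunk in chunk_refs(refs, cache_size):
        cache = bulk_load(chunk)  # one explicit fill covering the whole chunk
        fills += 1
        for ref in chunk:
            process(ref, cache[ref])  # every lookup in this chunk is a cache hit
    return fills
```

Because no chunk is larger than the cache, every ref in a chunk is covered by the fill that precedes it, so no lookup in this model falls through to a single-get.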

Test Plan
  • Created a repository with 1K refs. Set cache size to 256. Ran discovery.
    • Before patch: saw one large cache fill and then ~750 single-gets.
    • After patch: saw four large cache fills.
  • Compared bin/repository discover ... --verbose output before and after patch for overall effect; saw no differences.
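The counts in the test plan can be checked with a little arithmetic. The sketch below is a simplified model, not a measurement: it assumes each ref needs exactly one lookup and that the single pre-patch fill covers the first `cache_size` refs. The function names are hypothetical.

```python
import math
from typing import Tuple


def lookups_before(total_refs: int, cache_size: int) -> Tuple[int, int]:
    """Model of the old behavior: one fill, then a single-get per uncached ref."""
    fills = 1 if total_refs > 0 else 0
    single_gets = max(0, total_refs - cache_size)
    return fills, single_gets


def lookups_after(total_refs: int, cache_size: int) -> Tuple[int, int]:
    """Model of the new behavior: one fill per chunk and no single-gets."""
    fills = math.ceil(total_refs / cache_size)
    return fills, 0


# Test plan scenario: 1K refs with the cache size set to 256.
print(lookups_before(1000, 256))  # (1, 744)
print(lookups_after(1000, 256))   # (4, 0)
```

Under these assumptions the model gives 744 single-gets before the patch, in line with the "~750" observed, and four fills afterwards. For the 180K-ref example from the summary with the 65,535-entry cache, the same model gives three fills.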

Diff Detail

Repository
rP Phabricator
Branch
discovery1
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 25037
Build 34545: Run Core Tests
Build 34544: arc lint + arc unit