Page MenuHomePhabricator

Implement a git blame cache
ClosedPublic

Authored by epriestley on Sep 28 2014, 3:58 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Dec 18, 2:58 PM
Unknown Object (File)
Tue, Dec 17, 8:39 AM
Unknown Object (File)
Mon, Dec 16, 7:33 PM
Unknown Object (File)
Tue, Dec 10, 7:38 PM
Unknown Object (File)
Mon, Dec 9, 1:03 AM
Unknown Object (File)
Fri, Dec 6, 3:07 AM
Unknown Object (File)
Thu, Dec 5, 7:18 PM
Unknown Object (File)
Thu, Dec 5, 6:59 PM
Tokens
"Love" token, awarded by fabe.

Details

Summary

Ref T2450. Ref T2453. Add a repository_blamecache table and cache git blame information

Test Plan

View files in Diffusion with enabled blame

Diff Detail

Repository
rP Phabricator
Branch
arcpatch-D10600_1
Lint
Lint Passed
Unit
Test Failures
Build Status
Buildable 3311
Build 3318: [Placeholder Plan] Wait for 30 Seconds

Unit TestsFailed

TimeTest
0 mstestCelerityMaps
0 mstestApplicationsInstalled
0 mstestAuthorName
0 mstestBranchFilter
0 mstestConduitMethods
View Full Test Results (1 Failed · 17 Passed)

Event Timeline

fabe retitled this revision from to Implement a git blame cache as described in T2450.
fabe updated this object.
fabe edited the test plan for this revision. (Show Details)
fabe edited edge metadata.
  • use commitIDs instead of the identifiers for the cache

When i look at the blame xhr request in the dark console i can still see a query selecting all the commits by their identifiers instead of the id's (which take a long time).
I suspect this is for showing the connected revisions to the commits. I couldn't quite find where that is executed. Maybe someone could push me to the right file?
However changing this behavior might need an conduit api change (adding an extra field with the revision information)

  • moar performance improvement

blame for phutil_map now takes ~6s instead of ~20s
There's still some weird stuff going on with $handlers / $author_phids in the controller but that won't affect performance

  • add new db schema and key config
  • populate blame cache on reparse
  • replaced some code with an already existing conduit call
  • add new db schema config
  • populate blame cache on reparse
  • replaced some code with an already existing conduit call
  • rebasing and fix db validity issues
  • add new db schema config
  • populate blame cache on reparse
  • replaced some code with an already existing conduit call
  • rebasing and fix db validity issues
  • reabsing
joshuaspence retitled this revision from Implement a git blame cache as described in T2450 to Implement a git blame cache.Feb 25 2015, 10:02 AM
joshuaspence updated this object.

rebase and fix visibility

epriestley added a reviewer: fabe.

I'm finally doing some Diffusion work, let me see if I can bring this the last mile. Thanks for the patch!

Sure :)
I remember that the biggest performance gain was coming from using PHIDs when selecting all the commit info in the FileController instead of the commit identifiers.
Showing the blame info on the library_map file was a good benchmark.

Hmm, interesting. I'll look at that, I wouldn't have expected it.

You were right, MySQL was running the old query really inefficiently. I think D14962 should fix it.

epriestley edited edge metadata.
  • Use lastmodifiedquery (which is a little more tailored/efficient) instead of historyquery.
  • Use standard cache infrastructure (this is slightly less efficient but much simpler and feels like a good place to start).
  • Don't try to do the reparse/prefill stuff yet -- we could pursue that later.

With other optimizations, this drops __phutil_library_map__.php in blame + no color mode from about 3.5 seconds to about 800ms (when we hit cache) for me locally.

chad edited edge metadata.
This revision is now accepted and ready to land.Jan 7 2016, 1:57 AM
This revision was automatically updated to reflect the committed changes.