Fixes T4414. Currently, when we discover a new repository, we do something like this:
foreach (branch) { foreach (commit on this branch) { do_something(); } }
In cases where there are a lot of branches which mostly just branch from master, this leads to us doing roughly O(branches * commits) work, because the shared history is walked again for every branch.
We have a commitCache to prevent this, but it has two problems:
- It is only filled from the DB, and during discovery we do this whole walk before writing anything to the DB, so the cache is empty exactly when we need it.
- It has a fixed size (2048) and on initial discovery we're usually dealing with way more commits than that.
Instead of relying on the cache alone, also stop walking a branch as soon as we hit a commit which is already known.
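
A rough sketch of the idea, in Python rather than the actual discovery code. The names `discover`, `branch_heads`, `parents` and `stop_at_known` are made up for illustration, and the `known` set stands in for whatever "known already" means in the real implementation:

```python
def discover(branch_heads, parents, stop_at_known=True):
    """Walk every branch from its head; return how many commits were visited.

    branch_heads: dict of branch name -> head commit id (hypothetical shape)
    parents:      dict of commit id -> list of parent commit ids
    """
    known = set()   # commits already handled during this discovery pass
    visits = 0
    for head in branch_heads.values():
        stack = [head]
        seen_on_branch = set()  # avoids revisiting merges within one branch
        while stack:
            commit = stack.pop()
            if commit in seen_on_branch:
                continue
            if stop_at_known and commit in known:
                # Already handled via another branch, so its ancestors
                # were handled too; stop walking in this direction.
                continue
            seen_on_branch.add(commit)
            known.add(commit)
            visits += 1  # stands in for do_something()
            stack.extend(parents.get(commit, []))
    return visits


# Toy repository: 1000 commits on master, plus 50 branches that each add
# a single commit on top of the master head.
parents = {f"c{i}": [f"c{i - 1}"] for i in range(1, 1000)}
parents["c0"] = []
branch_heads = {"master": "c999"}
for j in range(50):
    parents[f"b{j}"] = ["c999"]
    branch_heads[f"branch{j}"] = f"b{j}"

naive_visits = discover(branch_heads, parents, stop_at_known=False)
early_stop_visits = discover(branch_heads, parents, stop_at_known=True)
```

On this toy input the naive walk visits 51050 commits (1000 for master plus 1001 for each of the 50 branches), while the early-stop walk visits 1050, one per distinct commit.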