
In Diffusion UI, satisfy "Refs" query from the database, not the API
Open, Normal, Public

Description

See PHI1992. A hosted instance with an unusual access pattern overwhelmed repository shard resources with a modest set of requests. A likely source of load was --contains queries.

This issue is entangled with a larger set of problems associated with T9898, but one narrow issue is immediately fixable:

DiffusionCommitController calls diffusion.refsquery, an API method that selects refs which point at a commit. This dates from T1130, which didn't actually care about refs, only tags and branches. Those are shown separately in the modern UI, but the "Tags" and "Branches" fields show tags and branches which contain the commit, rather than only those which point at it.

The ideal UI here is probably something like:

Refs: [:] [branch] master
      [*] [tag] release-1.2.3
      [*] [ref] refs/pull/123
      [*] [ref] refs/epriestley/temp-123

...where [*] is an icon that means "points-at", [:] is an icon that means "contains", and [tag], [branch], and [ref] are icons that mean those things. Thus, the UI would show all refs which contain or point at the commit in a single view.
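As a rough illustration of the mapping described above, here is a minimal shell sketch that turns a relation and a full ref name into one line of that view. The function name render_ref_line and the relation labels "points-at" and "contains" are made up for this sketch; the real UI would presumably render icons in PHP rather than print text.

```shell
#!/bin/sh
# Hypothetical helper: render_ref_line RELATION REFNAME
#   RELATION is "points-at" or "contains" (labels chosen for this sketch).
#   REFNAME is a full ref name such as refs/heads/master.
render_ref_line() {
  relation=$1
  refname=$2

  # Relation icon: [*] means "points-at", [:] means "contains".
  case $relation in
    points-at) icon='[*]' ;;
    contains)  icon='[:]' ;;
    *)         icon='[?]' ;;
  esac

  # Type icon and display name: branches and tags are shortened,
  # any other ref keeps its full name.
  case $refname in
    refs/heads/*) kind='[branch]'; name=${refname#refs/heads/} ;;
    refs/tags/*)  kind='[tag]';    name=${refname#refs/tags/} ;;
    *)            kind='[ref]';    name=$refname ;;
  esac

  printf '%s %s %s\n' "$icon" "$kind" "$name"
}
```

For example, `render_ref_line contains refs/heads/master` prints `[:] [branch] master`, matching the first row of the mockup.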

Because of the performance implications of --contains queries (see T9898), this isn't entirely straightforward to switch to. Currently, we get this information by running:

# Branches (Contains)
$ git branch --verbose --no-abbrev --contains <commit> -- <possible patterns>

# Tags (Contains)
$ git tag -l --contains <commit> --

# Refs (Points-At)
$ git log -n 1 --format=%d <commit> --

These are sort of silly, particularly the "refs" query. In theory, a single git for-each-ref command can satisfy all of them:

$ git for-each-ref [--contains <commit> | --points-at <commit>] -- <possible patterns>
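A sketch of how the two modes might be combined into one listing, assuming a Git new enough that for-each-ref supports --contains and --points-at. The function names and the "RELATION REFNAME" output format are invented for illustration. This runs for-each-ref twice (once per mode) rather than once, and says nothing about whether doing so is fast enough.

```shell
#!/bin/sh
# dedupe_relations: reads "RELATION REFNAME" lines on stdin and keeps only
# the first line seen for each ref name, preserving input order.
# Ref names cannot contain spaces, so the second field is the whole name.
dedupe_relations() {
  awk '!seen[$2]++'
}

# list_refs_for_commit COMMIT [PATTERN...]
# Prints one "RELATION REFNAME" line per matching ref. A ref that points
# directly at the commit also contains it, so points-at results are
# emitted first and the later "contains" duplicate is dropped.
list_refs_for_commit() {
  commit=$1
  shift
  {
    git for-each-ref --format='points-at %(refname)' --points-at "$commit" "$@"
    git for-each-ref --format='contains %(refname)' --contains "$commit" "$@"
  } | dedupe_relations
}
```

Inside a repository, `list_refs_for_commit <commit> refs/heads refs/tags` would restrict the listing to branches and tags; with no patterns, all refs are considered.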

However, in the specific case referenced by PHI1992, git tag -l is substantially faster than git for-each-ref!

$ time git for-each-ref --contains <commit> -- refs/tags > /dev/null

real	0m0.872s
...
$ time git tag -l --contains <commit> -- > /dev/null

real	0m0.135s
...

There's some variance here, but the fastest git for-each-ref invocation is never faster than the slowest git tag -l invocation, i.e. git tag is always better, and often several times better.
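To make a comparison like this repeatable, something along these lines could collect the fastest and slowest run of each command. This is only a sketch: bench is a made-up helper, it assumes GNU date (for %N nanoseconds), and it measures wall-clock time only, so results depend on cache state and machine load.

```shell
#!/bin/sh
# bench RUNS COMMAND [ARG...]
# Runs COMMAND RUNS times, discarding its output, and prints the fastest
# and slowest wall-clock times in milliseconds as "min_ms=N max_ms=N".
# Assumes GNU date, which supports %N (nanoseconds).
bench() {
  runs=$1
  shift
  min=
  max=
  i=0
  while [ "$i" -lt "$runs" ]; do
    start=$(date +%s%N)
    "$@" > /dev/null 2>&1 || true   # ignore failures; we only time the run
    end=$(date +%s%N)
    elapsed=$(( (end - start) / 1000000 ))
    if [ -z "$min" ] || [ "$elapsed" -lt "$min" ]; then min=$elapsed; fi
    if [ -z "$max" ] || [ "$elapsed" -gt "$max" ]; then max=$elapsed; fi
    i=$((i + 1))
  done
  printf 'min_ms=%s max_ms=%s\n' "$min" "$max"
}
```

Usage would look like `bench 10 git for-each-ref --contains <commit> -- refs/tags` followed by `bench 10 git tag -l --contains <commit> --`. The observation above corresponds to the max_ms of the git tag run being no larger than the min_ms of the git for-each-ref run.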

This seems like total nonsense (and may be fixed in a newer version of Git; the host currently has git version 2.29.0). But, without a deeper understanding of the issue, I'm very hesitant to switch from git tag -l to git for-each-ref.

Still, there's a smaller step which can be made easily here. This is certainly silly:

$ git log -n 1 --format=%d <commit> --

This can be expressed as git for-each-ref --points-at <commit> --, and both commands are quick. But, in modern Phabricator, it can also be expressed by querying the RefCursor table instead of hitting the API at all.

Event Timeline

epriestley triaged this task as Normal priority. Feb 15 2021, 8:17 PM
epriestley created this task.