Page MenuHomePhabricator

Update ArcanistMercurialAPI to support getting the current commit ref
ClosedPublic

Authored by cspeckmim on Sat, Sep 4, 4:56 AM.

Details

Summary

Mercurial does not have an implementation for querying commit symbol hardpoints, which is what the "arc amend" workflow uses.

This provides an implementation for Mercurial as well as updating ArcanistMercurialAPI to specify the current working directory symbol as ..

Additionally removed an erroneous early return in ArcanistAmendWorkflow which prevents a check against uncommitted changes.

Fixes T13665

Test Plan
  1. I created a diff on a Mercurial revision.
  2. I updated the revisions summary in phabricator.
  3. I ran arc amend and it successfully amended the local commit with the updated commit message.
  4. I modified a file in the repository and left the change uncommitted.
  5. I ran arc amend and verified that it reported an error due to uncommited commits.

I ran the following commands to verify that they resolved to the correct commits

  1. arc inspect --explore -- 'commit(674492bb460)' properly matched the right commit as a commit hash prefix
  2. arc inspect --explore -- 'commit(674492bb4606666d5321feb38d2a467a8733c786)' properly matched the right commit as a full commit hash
  3. arc inspect --explore -- 'commit(master)' properly matched the right commit as a bookmark
  4. arc inspect --explore -- 'commit(tip)' properly matched the right commit as a tag
  5. arc inspect --explore -- 'commit(.)' properly matched the right commit as the working directory
  6. arc inspect --explore -- 'commit(cafe)' properly matched the right commit as a commit hash prefix
  7. I created a 'cafe' bookmark on a changeset
  8. arc inspect --explore -- 'commit(cafe)' properly matched the right commit as a bookmark
  9. arc inspect --explore -- 'commit(67449)' properly matched the right commit as a revision number
  10. arc inspect --explore -- 'commit(2147483648)' properly did not match any revision (no python exception)
  11. arc inspect --explore -- 'commit(0)' properly matched the first commit

Diff Detail

Repository
rARC Arcanist
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

The desired/intended approach is to implement ArcanistMercurialCommitSymbolCommitHardpointQuery, following ArcanistGitCommitSymbolCommitHardpointQuery. Alas, this is a significantly more complicated change (but might make life easier down the road).

With the simpler approach in this diff, you'll lose out on caching behavior and support for parallel execution via HardpointEngine, which may make workflows run significantly slower. You'll also get different reference/aliasing behavior in Git vs Mercurial (two calls return the same object in Git, and will return different objects in Mercurial), although this likely doesn't matter. Finally, the workflow will break if CommitHardpointQuery is updated to load more data but this method doesn't get a corresponding update.

None of these are really very big deals, but I think doing this the right way isn't so horribly complicated that it justifies doing an end-run around HardpointEngine (unless it turns out to be much more involved than I think).

To implement ArcanistMercurialCommitSymbolCommitHardpointQuery:

  • Copy ArcanistGitCommitSymbolCommitHardpointQuery.
  • Change it to extend ArcanistWorkflowMercurialHardpointQuery.
  • Change loadHardpoint() to execute hg log --template ... --rev 1 --rev 2 --rev 3 ... -- or similar, as an analog to git cat-file --batch-check, but see below.
  • arc liberate, etc.
  • Test narrowly with:
$ arc inspect --explore -- 'commit(.)'
  • arc inspect --explore ... currently throws an exception in Mercurial, but shouldn't. If things are working, the behavior will be more similar to this, in Git:
$ arc inspect --explore -- 'commit(HEAD)'
+ [ArcanistCommitSymbolRef] Commit Symbol "HEAD"
 * [ArcanistScalarHardpoint] ref.symbol.object
  + [ArcanistCommitRef] Commit "82016c00e132c12d761d49756a2fee059f194732"
   * [ArcanistScalarHardpoint] message
    > "Name extension as \"arc-hg\", not \"arg-hg\"\n\nSummary: Ref T13659. I"
   * [ArcanistScalarHardpoint] upstream
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
    > map<string, wild>
  • Then test arc amend.

On "but see below" from hg log ... above, one issue is that hg log --rev 1 --rev 2 ... will fail if any symbol can't be resolved. So if the user is trying to look up 9 valid symbols and 1 bogus symbol, the whole thing will fail. The expected behavior is that the query returns all valid results. Since . will always resolve, you could just punt this with a comment for now. Or you could fall back to individual hg log --rev 1 commands if the big hg log fails, or find some other fancier way to get Mercurial to resolve all valid symbol names from a list of valid and invalid symbol names.

This revision now requires changes to proceed.Sat, Sep 4, 7:39 PM

The desired/intended approach is to implement ArcanistMercurialCommitSymbolCommitHardpointQuery, following ArcanistGitCommitSymbolCommitHardpointQuery. Alas, this is a significantly more complicated change (but might make life easier down the road).

I did have a suspicion about a missing hardpoint query for Mercurial but the only one I was tracing through was HARDPOINT_REVISIONREFS which I think maps to (or, I mapped to) ArcanistMercurialWorkingCopyRevisionHardpointQuery which looks to be appropriately implemented and mostly matches the Git variant. Though I don't have a solid mental model of "hardpoints".

I'll look at making the changes you suggested.

This might or might not be helpful, but the rough model here is that "hardpoints" are an open slot that some particular data may be loaded into. The physical analogy is a "hardpoint" on fighter aircraft where a particular missile or bomb may be attached.

A similar concept is common in Phabricator today, where data is "attachable". For example, a Commit has methods like attachRepository() and getRepository() (which throws if a Repository is not attached), and some Query classes have methods like needSuchAndSuch(true), meaning the query should also load and attach some "SuchAndSuch" related object as it queries the database. Hardpoints are partly a more generalized formalization of "attachment" points.

In this case, the object is a CommitSymbolRef, which represents a symbol in the sense of "string of characters we got from somewhere", like "HEAD^" or "1234" or "ab3fern4" or ".", and has a "hardpoint" for a CommitRef.

The CommitRef has some actual details about the commit, e.g., its full identity hash in the underlying VCS. So, fully attached, the object tree might look like this:

- `CommitSymbolRef` (The symbol "HEAD".)
  - `CommitRef` (An object with a property like `fullVCSHash = abcd1234`).

So you build a list of CommitSymbolRef objects ("HEAD", "master") and hand the list to HardpointEngine, telling it to load in the data on the "COMMIT" hardpoint. This is just 35 layers of abstraction on top of this:

foreach ($symbols as $symbol) {
  $hash = `hg log --rev $symbol`;
}

The justification for the 35 layers of abstraction is:

  • HardpointEngine has a significant ability to do what might generally be called "query planning" and may be able to get you the information many times faster than the bare loop by building a graph of Futures and using yield, caching, and executing all possible parallel operations in parallel; and
  • HardpointEngine can load hardpoints in a completely modular way, so extensions can replace or supplement query behavior very flexibly.
  • Hardpoints have 175 extra layers of abstraction and are themselves introspect-able, mutable, extensible, etc.

Today, this is a lot of complexity for little real effect. In theory, it can solve some hard problems elsewhere in Arcanist and eventually generalize to Phabricator, although it's a bit difficult to imagine that ever actually happening.

Provide implementation of ArcanistMercurialCommitSymbolCommitHardpointQuery to enable hardpoint queries for commit symbols.

Not yet handling the case where an invalid symbol results in failing to lookup any symbols

Not yet handling the case where an invalid symbol results in failing to lookup any symbols

I didn't see any obvious way of doing this through revsets but I'll try some more experiments. An extension that adds a template keyword, or just a new command for this, might also be an option

Hmm could maybe do something like hg log --rev 'bookmark() or tag() or .' to just get every commit that has a potential symbol and then process those. It would end up getting more results than necessary in every case. The query runs relatively quickly (~230ms) on our ~128k commit repository but it only has ~550 tags which I'm guessing is low -- if we tagged every build we'd have ~15k.

Ooh actually it looks like this will work

$ hg log --revset "bookmark('re:^symbol$') or tag('re:^symbol$')"

Doing that doesn't fail if the symbol doesn't exist

Update the revsets used to work around potentially invalid symbols from preventing all symbols from resolving

More-correct formatting for the revset

The "symbol" here could, in the absolute-most-general case, be a short commit hash (abcd1234) or symbolic commit (tip^^^) -- not just a bookmark or tag name -- so I think this trick (though quite clever) may run into trouble some day. This looks like a reasonable fix to the immediate issue, though, and we can cross this bridge if/when we come to it.

(Now that you bring it up, moving the logic to an extension as the desired eventual end state seems pretty attractive to me, too.)

This revision is now accepted and ready to land.Sun, Sep 5, 3:29 PM

The "symbol" here could, in the absolute-most-general case, be a short commit hash (abcd1234) or symbolic commit (tip^^^) -- not just a bookmark or tag name -- so I think this trick (though quite clever) may run into trouble some day.

Oh yea... hmm I think it's possible to include support for commit hashes (or prefixes) as well as revision ID -- I'll try out a change real quick to see if it's simple to include support for those symbols at least. Any relativity modifiers ^^ I think are effectively impossible to handle without resorting to individual hg log commands or an extension

Include support for revision number and hash-prefix symbols

Fix not being able to lookup commit by symbol 0

Don't bother trying to lookup bookmark/tag names when the symbol is a number as Mercurial disallows this. Also fix the revision number mapping which somehow worked? But still does now...

I played around with this some more and this feels pretty solid for resolving most common symbols. I'll poke around some of the mercurial source a bit later this week and see if I can find an easy way for resolving revset/symbols easily though a custom extension.

src/ref/commit/ArcanistMercurialCommitSymbolCommitHardpointQuery.php
162

oof this is php8-only. I'll get a fix up that uses strpos() instead.