Page MenuHomePhabricator

Update templates used with mercurial to remove '--debug'
ClosedPublic

Authored by cspeckmim on Wed, Jul 7, 1:00 AM.

Details

Summary

Refer to discussion on D21677#275541

Refs D21681 (arcanist changes)

Phabricator has several uses of the --debug flag being used with Mercurial. Use of this flag causes additional output to be added which Phabricator needs, however the behavior of --debug is not guaranteed to be stable, and in newer versions of Mercurial there have been additional output that has caused Phabricator to choke on parsing the output. This change removes several uses of --debug in favor of using --template with the hg log or hg annotate commands in combination with the {p1.node} or {p2.node} template format.

The use of {p1node} format in templates was added in Mercurial 2.4 (2012). This format was deprecated in Mercurial 4.9 (2019) in favor of using {p1.node} format which is unclear when this new format was added (presumably earlier than Mercurial 4.9).

The use of --template with hg annotate is only officially supported in Mercurial 4.6 (2018), though does appear to work in 4.5 but is not documented.

Since the {p1node} format was introduced in 2.4 this bumps the required version of hg to 2.4 (from 1.9). Since the annotate --template feature wasn't added until 4.6 (which is still fairly recent), the use of it is gated on a capability test, but still preferred for use where possible to avoid extraneous output from --debug flag.

Test Plan

I verified I could do the following in a mercurial repository, while having mercurial 5.8 installed:

  1. Navigate and view files in Diffusion under e.g. /source/test-repo/.
  2. While viewing a file in Diffusion verified that I could view the blame of the file and the history/annotations looked accurate for the files I was browsing.
  3. From the blame sidebar, select to view a commit which loaded and displayed changes properly.
  4. View the history of the repository under e.g. /source/test-repo/history/default/. I verified the history looked correct and the tree-like structure showing relationship of commits also looked accurate.

I setup mercurial to run version 4.4, created a new repository, added some commits, and verified all the above behavior still works properly.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
cspeckmim edited the summary of this revision. (Show Details)

Updated the required version from 4.5 to 4.6 which is the officially supported version that allows hg annotate --template

be consistent with using --template %s instead of --template=%s

cspeckmim edited the test plan for this revision. (Show Details)

Okay I have these changes in and verified that things seem to be working fine with mercurial 5.8, though I'm not sure your thoughts on requiring mercurial 4.6 as the minimum version. If we restore the hg annotate --debug then the minimum version required would only jump to 2.4.

epriestley requested changes to this revision.EditedThu, Jul 8, 6:14 PM

Ah, I didn't realize annotate --template arrived so recently. Phacility production has Mercurial 2.8.2 and I can't easily fix that at the moment, so I don't want to bring version bump all the way to 4.6 upstream today. I'm okay with the jump to 2.4 upstream, and also okay with leaving annotate --debug without filtering upstream (since no one has complained in all this time, and not addressing it doesn't make things worse). So I'd propose:

  1. Bring the 2.4 version bump upstream, without directly addressing annotate --debug.
  2. Deal with annotate --debug by either:
    1. bumping the minimum version to 4.6 in your fork and using --template; and/or
    2. defining another Mercurial version capability test for annotate --template and selecting that in versions of Mercurial which support it (this can come upstream).

Your call on whether you want to do the legwork on (2B).

The rest of this looks generally good to me, couple of minor thoughts inline but the 4.6 bump is the only real issue.

src/applications/diffusion/conduit/DiffusionHistoryQueryConduitAPIMethod.php
154–156

The loop below might become more readable now if the pattern is just {node} {p1node} {p2node} with a single explode on spaces, but this is a quibble.

164

Not your doing, but %C ("raw, unescaped comment") is unsafe in the general case. This particular use is safe, but it's vaguely desirable to use only as much %C as necessary, and it isn't strictly necessary here.

When %Ls was first implemented, it did not accept empty lists, so it could not be used in this situation. It was later updated to emit nothing if the value passed is an empty array, so this could be rewritten more safely to use %Ls now:

$path_args = array();
if (strlen($path)) {
  $path_args[] = $path;
}

// ...

'log ... -- %Ls',
...,
$path_arg);
173–174

(Per below, this is slightly cleaner as \z.)

src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php
10

I had a lot of difficulty figuring this out, too. I think that in the x % 'y' syntax, {z} means {current item being iterated.z} or something like that, and the existence of line is described in hg help --verbose annotate:

And each entry of "{lines}" provides the following sub-keywords in
addition to "{date}", "{node}", "{rev}", "{user}", etc.

line          String. Line content.
lineno        Integer. Line number at that revision.
path          String. Repository-absolute path of the file at that
              revision.

So I think this is just kind of hard-to-observe, not actually magical? But it took me literally 20 tries to figure out {lines} % '{node}' was the right basic syntax.

src/applications/diffusion/query/lowlevel/DiffusionLowLevelParentsQuery.php
68

This is extremely minor, but prefer \z (end of input) over $ (end of input, or newline and then end of input) when you do not expect to match a newline.

This almost never matters, but using \z is more precise, and using $ can occasionally cause input validation issues. I think it once caused a minor-but-material bug somewhere in auth where something like "user@domain.com\n" was accepted as a properly formed email address. (I can't dig up the original issue offhand, it was a long time ago.)

This revision now requires changes to proceed.Thu, Jul 8, 6:14 PM

Requiring very recent Mercurial may also run into issues described here:

https://discourse.phabricator-community.org/t/prefer-old-mercurial-2-8-2-to-avoid-various-problems/3431

I don't think I ever fully triaged these, but my vague recollection is that Mercurial changed how parameters get passed over HTTP (headers became GET params, or vice versa, or something?), and I made it two or three patches down the rabbit hole and it didn't seem like I'd need to dig much deeper, but I never finished these changes up for whatever reason.

This specific issue is probably the relevant one, the other thing linked from there looks like it's just a --debug problem, which this patch should fix.

https://discourse.phabricator-community.org/t/error-cloning-mercurial-repository-http-after-first-install/1200/2

  1. Bring the 2.4 version bump upstream, without directly addressing annotate --debug.
  2. Deal with annotate --debug by either:
    1. bumping the minimum version to 4.6 in your fork and using --template; and/or
    2. defining another Mercurial version capability test for annotate --template and selecting that in versions of Mercurial which support it (this can come upstream).

Your call on whether you want to do the legwork on (2B).

I'll plan to do (2B) now that I'm more familiar/comfortable with the version-checking approach. Bumping the required version to something so much newer isn't really justified for this, especially considering the versions that are available on official distribution packages (centos7 currently has mercurial-2.6). Using --template for newer versions will be less error-prone though so I'll do the legwork. I'll try to get something up later tonight.

This specific issue is probably the relevant one, the other thing linked from there looks like it's just a --debug problem, which this patch should fix.

https://discourse.phabricator-community.org/t/error-cloning-mercurial-repository-http-after-first-install/1200/2

Oh I was looking into this post and details a little bit recently and have it in my notes to get to at some point. I might eventually poke around with this but it's bottom-of-my-list. We don't serve vcs over http but that's probably more common for publicly accessible repositories, or as a means to "simplify" CI integrations.

Do capability-check for using hg annotate --template vs --debug, set the required version to 2.4, address other comments

forgot to submit my comments~

src/applications/diffusion/conduit/DiffusionHistoryQueryConduitAPIMethod.php
154–156

I'll try coming up with something

164

Will do!

src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php
10

Aha! I didn't realize that annotate's templating provided more features than the standard hg help template. Also that using --verbose with help commands would provide more help.

src/applications/diffusion/query/lowlevel/DiffusionLowLevelParentsQuery.php
68

Ah interesting, I'll update these

Run arc liberate for the new test file

Something isn't quite right here. Looking at the history of a repo, the graph rendering doesn't match what it was previously doing

Previously

Screen Shot 2021-07-08 at 9.22.34 PM.png (2×334 px, 122 KB)

Now

Screen Shot 2021-07-08 at 9.23.18 PM.png (2×1 px, 188 KB)

Found one issue, which seems to fix the improper graph structure but now it's rendering disjointly

Screen Shot 2021-07-08 at 9.29.25 PM.png (2×354 px, 127 KB)

src/applications/diffusion/conduit/DiffusionHistoryQueryConduitAPIMethod.php
173–174

accidentally removed the + when swapping in the \z

%C ("raw, unescaped comment")

err: raw, unescaped command rather


I'm not immediately sure how this patch could cause the disjoint rendering, miiiiiight be an unrelated display issue.

src/applications/diffusion/conduit/DiffusionHistoryQueryConduitAPIMethod.php
173–174

This case might be impossible now, I think it was a trick of the behavior of {parents}?

175

This array_filter() is possibly unnecessary -- I think both {p2node} and {p2.node} always emit a string with 000... for empty, right?

src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php
27–28

Although --template probably overrides --changeset, it would be cleaner to move it to the --debug branch, I think (it just affects --debug output).

47
49–50

I think it would also be fine to just drop this so we can delete the method -- no one has ever noticed issues with annotate that I can recall, and this should converge toward correct behavior as users upgrade -- but up to you.

55

We'll enter this branch if : is not present on the line (strpos() returns false), or if : is present at the beginning of the string (the 0th index, strpos() returns 0). Test against false exactly to be more precise.

src/applications/diffusion/query/lowlevel/DiffusionLowLevelParentsQuery.php
64–66

This has no effect, since $hashes[$key] necessarily already has the value $value as we enter the loop.

Fix a regex typo and add back in the trim() on the hash

src/applications/diffusion/conduit/DiffusionHistoryQueryConduitAPIMethod.php
173–174

As in some edge cases where the first commit of a repo wouldn't print out 0000 for the parents? I tested the first commit and it looks to print what we need always, and I'm not aware of another situation this could come up. I'll remove.

174

So it turns out that the gaps in the display show up if you don't trim() this. Previously the entire line was trimmed -- maybe the trim should be added during the explode $parts = explode(' ', trim($line));?

175

I was testing out different test cases in a php playground
http://sandbox.onlinephpfunctions.com/code/3105014119a68d99433766a79859554b3e30dab7

I noticed that with an input of qwer vs. qwer resulted in different values for $parts here, empty array vs. array with a single empty item so I did this as a precaution. But I think you're right that since we're passing an explicit template to include both that those edge cases don't come up

$ hg log --template "{node} {p1.node} {p2.node}\n" --rev 0
48af2fc3b1a5ba27efd24041bc091f133551d61b 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000
$ hg log --template "{node};{p1.node} {p2.node}\n" --rev 1
7a7d04be4a3ddea4f04fd3a84e921eec45f2bc04 48af2fc3b1a5ba27efd24041bc091f133551d61b 0000000000000000000000000000000000000000

(same exact output with {p1node} format

src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php
27–28

I tried this on a sample file and got the exact same output when --changeset wasn't used with --template, I'll move it

49–50

I waaaaanna say I've seen some annotations not appear when reviewing certain files in our repository which leads me to think it was likely this issue where debug output caused this to choke. It's probably less likely to be reported since it wasn't displaying an error to the user, and it seems a fair number of stars had to align in our case to hit this issue.

55

Oh, if it's the first character it returns 0 which is then coerced to false? Gotcha.

I did read the PHP documentation on this

Screen Shot 2021-07-08 at 9.45.15 PM.png (598×2 px, 124 KB)

Oh maybe not the big red bit tho. I'll try to read more of those.

src/applications/diffusion/query/lowlevel/DiffusionLowLevelParentsQuery.php
64–66

ugh.gif (276×498 px, 591 KB)

Got a little too focused on the refactor and not the context

More updates based on comments. Tested out history view and annotation view and things look good.

cspeckmim edited the test plan for this revision. (Show Details)

I setup mercurial to run version 4.4, created a new repository, added some commits, and verified all the above behavior still works properly.

This isn't quite true -- annotations don't appear to be showing up. I'll take a quick look

Hmm this is the error I'm getting when using mercurial 4.4

2021/07/08 22:35:31 [error] 9693#9693: *1 FastCGI sent in stderr: "

[2021-07-08 22:35:31] PHLOG: 'Using template: --debug --changeset' at [/usr/local/phacility/phabricator/src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php:18]

[2021-07-08 22:35:31] EXCEPTION: (RuntimeException) min(): Array must contain at least one element at [<arcanist>/src/error/PhutilErrorHandler.php:262]
arcanist(head=origin-master, ref.master=2d6452acb5f1, ref.origin-master=c94c5bbf35f0), phabricator(head=master, ref.master=14c516b7a109, custom=10)
  #0 <#2> PhutilErrorHandler::handleError(integer, string, string, integer, array)
             called at [<arcanist>/src/error/PhutilErrorHandler.php:262]
  #1 <#2> min(array)
             called at [<phabricator>/src/applications/diffusion/controller/DiffusionBlameController.php:201]
  #2 <#2> DiffusionBlameController::handleRequest(AphrontRequest)
             called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:284]
  #3 phlog(RuntimeException)
             called at [<phabricator>/src/aphront/handler/PhabricatorAjaxRequestExceptionHandler.php:27]
  #4 PhabricatorAjaxRequestExceptionHandler::handleRequestThrowable(AphrontRequest, RuntimeException)
             called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:751]
  #5 AphrontApplicationConfiguration::handleThrowable(RuntimeException)
             called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:296]
  #6 AphrontApplicationConfiguration::processRequest(AphrontRequest, PhutilDeferredLog, AphrontPHPHTTPSink, MultimeterControl)
             called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:204]
  #7 AphrontApplicationConfiguration::runHTTPRequest(AphrontPHPHTTPSink)
             called at [<phabricator>/webroot/index.php:35]

" while reading response header from upstream, client: 10.0.50.250, server: repo-testing1.mimsoftware.com, request: "POST /diffusio

I tried adding this at the end of DiffusionMercurialBlameQuery::resolveBlameFuture()

if (empty($result)) {
  return null;
}

Because it looks like the caller is expecting to handle null, but that didn't seem to change any behavior.

Oh that error happens when resolveBlameFuture() returns null because there was a problem invoking the command... looking a little more into what issue is happening

Buh there must be something going on in the environment that's causing issues. I logged out the command it's running and tried running it directly in the repo

[cspeck@repo-testing1 85]$ hg --version
Mercurial Distributed SCM (version 4.4.2)
(see https://mercurial-scm.org for more information)

Copyright (C) 2005-2017 Matt Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[cspeck@repo-testing1 85]$ hg annotate --debug --changeset --rev d696cbf7fb0edb99a01cd6a1656cb55c495a52a1 -- test.txt
starting pager for command 'annotate'
d696cbf7fb0edb99a01cd6a1656cb55c495a52a1: bloop!
0477d6101efb6bc4d01332bc99765791bac5f673: blah

But when that command is being run from phabricator I phlog() the stdout and get (with exit code 255)

[2021-07-08 22:54:01] PHLOG: 'Error with blame: hg annotate [-r REV] [-f] [-a] [-u] [-d] [-n] [-c] [-l] FILE...

show changeset information by line for each file

options ([+] can be repeated):

 -r --rev REV             annotate the specified revision
    --no-follow           don't follow copies and renames
 -a --text                treat all files as text
 -u --user                list the author (long with -v)
 -f --file                list the filename
 -d --date                list the date (short with -q)
 -n --number              list the revision number (default)
 -c --changeset           list the changeset
 -l --line-number         show line number at the first appearance
 -w --ignore-all-space    ignore white space when comparing lines
 -b --ignore-space-change ignore changes in the amount of white space
 -B --ignore-blank-lines  ignore changes whose lines are all blank
 -Z --ignore-space-at-eol ignore changes in whitespace at EOL
 -I --include PATTERN [+] include names mat...

Phabricator appears to have identified and using the right binary

Screen Shot 2021-07-08 at 11.08.53 PM.png (616×594 px, 49 KB)

(where /bin/hg -> /usr/local/phacility/mercurial-4.4.2/hg)

For figuring out the command it's running I'm effectively recreating the command-line -- is there an easy way to log the command that will run from the CommandFuture?

is there an easy way to log the command that will run from the CommandFuture?

For normal web requests this should appear in DarkConsole if you enable it (in the Services tab), or you can use bin/conduit call --trace ... if it's happening in Conduit.

Best guess is that it's maaaaaaybe because we pass HGPLAIN=1 in the environment (DiffusionMercurialCommandEngine) and your version of Mercurial has annotate set up as an extension? That seems like a big stretch, but wouldn't show up. Otherwise, you can maybe phlog() from DiffusionMercurialCommandEngine->newFormattedCommand().

I added some logs and then removed them but functionality nothing else should have changed (except the --template quoting I mention) and now when running hg 4.4 it no longer errors, but it the blame doesn't list changesets (but does put color next to the lines which I think is just a default if nothing was found?)

src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php
13

I *think* not having the template itself in quotes is causing a problem but this code was working, but then I saw it break just recently when switching back to mercurial 5.8. I think it's fine to update this

Best guess is that it's maaaaaaybe because we pass HGPLAIN=1 in the environment (DiffusionMercurialCommandEngine) and your version of Mercurial has annotate set up as an extension? That seems like a big stretch, but wouldn't show up.

I don't believe annotate is setup as an extension. This is our global .hgrc

/etc/mercurial/hgrc.d/managed_settings.rc
# Ansible managed

[extensions]
#largefiles =
[largefiles]
usercache = /opt/hgrepos/.global-largefiles-store
[trusted]
users = phab-phd

Supposedly this will list the active extensions (running while hg 4.4)

$ hg debugextensions
$ 

Here's something interesting

$ hg annotate --debug --changeset --rev 0 -- test.txt
starting pager for command 'annotate'
0477d6101efb6bc4d01332bc99765791bac5f673: blah
$ HGPLAIN=1 hg annotate --debug --changeset --rev 0 -- test.txt
0477d6101efb6bc4d01332bc99765791bac5f673: blah

apparently setting HGPLAIN caused some of that extraneous log output to disappear x_x

I think I got something messed up and I'm not seeing it. I'm making changes which no longer seem to be taking effect (I'm restarting the services)

I'll take another look tomorrow. Somehow I managed to get the code to emit output and found the issue with strpos() but then after fixing that it now seems busted again for both versions of mercurial. No exceptions are happening but it's not actually building up the annotation data properly (which is just the ordered list of the commit hashes I think?)

Is it possible the annotate results are cached somewhere and might be causing some of the inconsistency I’m seeing?

Ah, yes; DiffusionBlameQuery has a cache.

I've been incorporating ./bin/cache purge --all into my test process and things are making a lot more sense now. I'll hopefully have some changes up here shortly. There's one minor bug with annotations using mercurial 4.2 for file lines that contain : not being annotated but it works on 5.8.

Update the handling of template arguments to avoid using quotes around the template being used

There's one minor bug with annotations using mercurial 4.2 for file lines that contain : not being annotated but it works on 5.8.

This was because I had made modifications to the resolveBlameFuture() in my testing. I undid these changes and things are working again. I tested both mercurial 5.8 and 4.4, purging all caches in between and verifying that annotations and history appear correctly, including annotations of files with code lines including ':'.

To sum up what I think the issues I ran into last night were

  1. I was not purging caches when switching versions of mercurial or when making changes to DiffusionMercurialBlameQuery
  2. I had improperly inverted strpos(':', $line) instead of strpos($line, ':')
  3. When using --template I think I was not properly escaping quotes for how they are eventually invoked when launching the process

Both #2 and #3 are fixed and tested now

When using --template I think I was not properly escaping quotes for how they are eventually invoked when launching the process

Yeah, I think --template blah blah blah was getting passed as a single argument. The updated version with %Ls is the way I'd do it, too. Everything looks good to me, thanks!

This revision is now accepted and ready to land.Fri, Jul 9, 4:27 PM