Page MenuHomePhabricator

Updating the filtering of Mercurial debug output
ClosedPublic

Authored by cspeckmim on Jun 28 2021, 2:42 AM.
Tags
None
Referenced Files
F14496278: D21677.id51600.diff
Fri, Jan 3, 1:43 PM
F14492820: D21677.id51603.diff
Thu, Jan 2, 10:23 PM
Unknown Object (File)
Tue, Dec 31, 10:48 PM
Unknown Object (File)
Mon, Dec 30, 10:23 PM
Unknown Object (File)
Mon, Dec 30, 2:55 AM
Unknown Object (File)
Fri, Dec 27, 1:21 PM
Unknown Object (File)
Wed, Dec 25, 6:47 PM
Unknown Object (File)
Thu, Dec 19, 1:02 AM
Subscribers

Details

Summary

With newer versions of Mercurial come newer debug messages which need filtered out.

  1. In the scenario of Phabricator observing a hosted Mercurial repository which exists on a server in a multi-user environment it's possible that a repository computes branch cache at a tip revision which is not present. When this happens Mercurial will include in the debug output this information. This message indicates that the cache is going to be re-computed. See https://www.mercurial-scm.org/pipermail/mercurial/2014-June/047239.html.
  2. Likely in some version with added or improved support for pager the debug info seems to indicate when a pager is being invoked for a command. This seems to print out regularly despite piping the stdout.
  3. If the repository on Phabricator ever had the largefiles extension enabled then some additional details about "updated patterns" will print out.
Test Plan

I verified an observed repository's history could be browsed, specifically the history of files which previously resulted in "Undefined offset: 1".

Added a unit test to check the results of filterMercurialDebugOutput().

Diff Detail

Repository
rP Phabricator
Branch
debugout
Lint
Lint Passed
Unit
Tests Skipped
Build Status
Buildable 25407
Build 35088: Run Core Tests
Build 35087: arc lint + arc unit

Event Timeline

cspeckmim held this revision as a draft.

Forgot to save the change including the pager output

Added comment for the pager output

Comparing strings not arrays!

Fix the test output to not have real-life data

Also remove the 'no terminfo entry for' line to be filtered out

Adding a check in the event other log messages get added which might cause a stumble

cspeckmim edited the test plan for this revision. (Show Details)
cspeckmim edited the test plan for this revision. (Show Details)
This revision is now accepted and ready to land.Jul 6 2021, 7:09 PM
cspeckmim added inline comments.
src/applications/diffusion/protocol/DiffusionMercurialCommandEngine.php
68

I was going to update this to be the new URL
https://www.mercurial-scm.org/pipermail/mercurial-devel/2011-February/028541.html

However using that URL results in a line that is >80 characters long. What format should be used in a situation where it doesn't make sense to split onto multiple lines? Keep the protocol/domain on one line and the path on another (in this case path is <80 chars)?

100

Just in case, I'm going to make a quick update so remove the (served) part of this since it looks like it might print something different for hosted repositories (though for a hosted repo it's unlikely to ever get into this state to begin with).

For the invalid branch cache message remove the (served): bit which might differ on hosted vs. observed repositories.

This revision is now accepted and ready to land.Jul 6 2021, 7:28 PM

Keep the protocol/domain on one line and the path on another (in this case path is <80 chars)?

I've probably gone this route at least once. Perhaps more-ideal is to not put URLs (or, generally, giant blocks of explanatory text about context) into the raw source code, and leave a brief summary and a reference to a task instead. Then the details/URLs/etc could be updated or expanded upon without changing the code.


Slightly more broadly, we seem to have three remaining callsites in the code that use --debug:

src/applications/diffusion/query/lowlevel/DiffusionLowLevelParentsQuery.php:51
'log --debug --limit 1 --template={parents} --rev %s',

This can be replaced with hg log --limit 1 --template '{p1node} {p2node}' --rev ... or similar since I added these keywords in ~2012, although they were deprecated in 2018; see https://patchwork.mercurial-scm.org/patch/36338/. The new syntax is {p1.node} and {p2.node} in recent versions of Mercurial.

src/applications/diffusion/query/blame/DiffusionMercurialBlameQuery.php:13
'annotate --debug --changeset --rev %s -- %s',

This can be replaced with hg annotate --template '{lines % "{node}\n"}' --rev ... -- ... or similar. This callsite also doesn't call filterMercurialDebugOutput(), soooo...

src/applications/diffusion/conduit/DiffusionHistoryQueryConduitAPIMethod.php:153
'log --debug --template %s --limit %d --rev %s -- %C',
'{node};{parents}\\n',

This can be replaced with log --template '{node};{p1node} {p2node}\n' ...` or similar, per above.

With these three adjustments, filterMercurialDebugOutput() could be removed (I think?) and all of the associated garbage and tests could be deleted.

On navigating the version stuff, I think the pathway would be:

  • Figure out the first version of Mercurial that has {p1node} and {p2node}. I think I upstreamed those patches in ~2012ish, but could be off.
  • Make that the new minimum version of Mercurial for Phabricator. This happens in PhabricatorBinariesSetupCheck. The current minimum version is 1.9, with versions 2.1 and 2.2 forbidden because of buggy behavior.
  • Figure out the first version of Mercurial that has {p1.node} and {p2.node}.
  • Add a p1/p2 capability PhutilMercurialBinaryAnalyzer in Arcanist, then test for that capability to decide whether to use {p1node} or {p1.node}.

(DiffusionLowLevelMercurialPathsQuery has an example of using PhutilBinaryAnalyzer to do a Mercurial capability test.)

Ah thank you for those details for the better change. I was planning to play around with what would be needed to remove the --debug flag(s) but I hadn't yet gone into the details. I'll take a look at these changes. From quick testing on recent mercurial versions (5.8) it looks like the output does not include the local revision number, e.g instead of
693:8b39f63eb209dd2bdfd4bd3d0721a9e38d75a6d3 -1:0000000000000000000000000000000000000000
it prints out
8b39f63eb209dd2bdfd4bd3d0721a9e38d75a6d3 0000000000000000000000000000000000000000

I started a draft change in D21679. I also have changes for rARC but haven't yet created a diff for that (or yet figured out the exact versions to define)

According to their changelog the {p1node} style was added in 2.4 (2012) and was deprecated in 4.9 (2019) in favor of the new {p1.node} format
https://www.mercurial-scm.org/wiki/WhatsNew/Archive#Mercurial_2.4_.282012-11-01.29

However the use of --template for the hg annotate command does not appear to work until ~4.5 (2018). I say "appear to work" because although it does output what looks like expected results,

  1. I couldn't find this behavior listed in their changelog, and
  2. The use of --template/-T doesn't actually appear in the help output for annotate in version 4.5

It is both non-functional and non-documented in at least version 4.0 (2016) though I haven't bothered checking the major versions in between.