Page MenuHomePhabricator

Update the "local:commits" property after amending
Needs ReviewPublic

Authored by alexmv on Nov 18 2016, 3:04 AM.
Tags
None
Referenced Files
F14090044: D16897.diff
Sun, Nov 24, 3:16 PM
Unknown Object (File)
Fri, Nov 22, 10:33 PM
Unknown Object (File)
Wed, Nov 20, 10:10 PM
Unknown Object (File)
Wed, Nov 20, 9:26 AM
Unknown Object (File)
Mon, Nov 18, 6:20 PM
Unknown Object (File)
Oct 15 2024, 6:44 PM
Unknown Object (File)
Oct 13 2024, 2:45 AM
Unknown Object (File)
Oct 9 2024, 6:10 AM
Subscribers

Details

Reviewers
None
Group Reviewers
Blessed Reviewers
Summary

In most cases, the local commit object is amended after creating the
revision; this changes its commit-id to no longer match the
"local:commits" property that was uploaded originally.

Update the property after the local amend. This also moves the
"differential.getcommitmessage" step to only be called if it is
necessary, saving an API round-trip in some rare cases.

Test Plan

From the creation of this diff, and hence using this code:

Updating commit message...
>>> [115] <conduit> differential.getcommitmessage() <bytes = 145>
>>> [116] <http> https://secure.phabricator.com/api/differential.getcommitmessage
<<< [116] <http> 244,801 us
<<< [115] <conduit> 245,048 us
>>> [117] <exec> $ git commit --amend --allow-empty -F '/tmp/2ohlrfe9wd44cckg/27827-KGfRfo'
<<< [117] <exec> 10,387 us
>>> [118] <exec> $ git rev-parse 'HEAD'
<<< [118] <exec> 5,456 us
>>> [119] <exec> $ git merge-base 'fad85844314b151994769a461825c90f7400c145' 'fb0188dc55bf193410e7f2ff38534c9dcf1daa47'
<<< [119] <exec> 4,646 us
>>> [120] <exec> $ git log 'fb0188dc55bf193410e7f2ff38534c9dcf1daa47' --not 'fad85844314b151994769a461825c90f7400c145' --format='%H%x01%T%x01%P%x01%at%x01%an%x01%aE%x01%s%x01%s%n%n%b%x02' --
<<< [120] <exec> 4,459 us
>>> [121] <conduit> differential.setdiffproperty() <bytes = 1,673>
>>> [122] <http> https://secure.phabricator.com/api/differential.setdiffproperty

On the diff, "Commits" contains fb0188dc55bf, and that matches the local commit object:

$ git format-patch --stdout -1 | head -n4
From fb0188dc55bf193410e7f2ff38534c9dcf1daa47 Mon Sep 17 00:00:00 2001
From: Alex Vandiver <alexmv@dropbox.com>
Date: Wed, 9 Nov 2016 12:29:19 -0800
Subject: [PATCH] Update the "local:commits" property after amending

Diff Detail

Repository
rARC Arcanist
Branch
up-amend-local-commits (branched from master)
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 14566
Build 19004: Run Core Tests
Build 19003: arc lint + arc unit

Event Timeline

alexmv retitled this revision from to Update the "local:commits" property after amending.
alexmv updated this object.
alexmv edited the test plan for this revision. (Show Details)
alexmv edited edge metadata.

Why is this important? Specifically, why does it provide more value to the user than the extra VCS calls and an extra API roundtrip cost them?

This is a necessary, but not sufficient[1], step to be able to make the server better able to reason about stacked diffs. We use this as part of our CI integration, but it is currently used in the arc patch workflow as part of the message when the base commit is unavailable; it is used to suggest This diff is against commit D12345 rather than against commit 8badf00d.

The problem is that this data is currently _mostly_ reliable, after users add a second commit to the revision. This makes it actually reliable.

[1] It is insufficient because apparently updating the "local:commits" property does not automatically update the differential_revisionhash table, which retains the incorrect value. I've not investigated why yet; thoughts appreciated.

Since we record the tree hash and this amend never changes the tree hash, I think this should only even possibly be able to improve reasoning for a tiny subset of empty commits and revert commits? Empirically, T11518 discusses an existing extension which can reason about stacked commit state without this change, and I imagine they didn't fake that screenshot or anything.

Generally, I don't want to make arc worse for all users to make it better for a tiny subset of users, especially if those users are essentially only users who use a non-Harbormaster-based custom CI integration.

If this is legitimately important, we can implement it without a performance penalty by adding more state on the server side (T11518), delaying the property update until amends are complete, and having the property update push the revision into a "ready" state.

As implemented, this also creates issues where upstream CI integration (via Harbormaster) may trigger from differential.createrevision and race against this update, running before the local commits are updated. Although this isn't truly relevant because Harbormaster does not rely on local state, it makes it even less likely that any CI integration other than your non-Harbormaster integration would benefit.

(I don't think this is unambiguously a bug, either, because the hash legitimately reflects the commit we submitted.)

See also T11355.

I think this is better approached by first filing a task describing the problems you're facing at a high level. Then, we can make sure changes under the aegis of T11429 accommodate any use cases which we're comfortable supporting in the upstream. Custom CI integration based strictly on local state (e.g., not Harbormaster-based, and tied directly to local commit hashes -- versus just trying to reason better about relationships between local state and remote state) is probably not something we are excited to support, however.

That said, I believe putting the getcommitmessage call inside the shouldAmend() check is an unambiguous improvement with no downsides, so I'm happy to bring that upstream if you want to diff that separately.

T3875#182020 also has some discussion about broader sources of "commit equivalency", some of which are strictly external anyway and will require fully modularizing the lookup to support.