Page MenuHomePhabricator

Parse git renames with inconsistent quoting or custom prefixes
ClosedPublic

Authored by alexmv on Aug 17 2016, 12:05 AM.
Tags
None
Referenced Files
F12838462: D16405.id39453.diff
Thu, Mar 28, 6:14 PM
F12838365: D16405.id.diff
Thu, Mar 28, 6:09 PM
F12838049: D16405.id39451.diff
Thu, Mar 28, 5:48 PM
F12837744: D16405.id39451.diff
Thu, Mar 28, 5:26 PM
F12836050: D16405.id.diff
Thu, Mar 28, 3:33 PM
Unknown Object (File)
Mon, Mar 25, 6:10 PM
Unknown Object (File)
Mon, Mar 25, 5:54 PM
Unknown Object (File)
Wed, Mar 20, 6:13 PM
Subscribers
Tokens
"Piece of Eight" token, awarded by epriestley.

Details

Summary

The previous parser failed when only one of the old/new filenames was
quoted, as with a rename of a Unicode filename to a non-Unicode
filename, or vice versa. It also incorrectly parsed custom prefixes,
even going to far as to encode this logic in the tests: a diff of
"src/file dst/file" which is not a rename should not be recorded as
changing "src/file", but rather "file".

Switch to using the "rename from" / "rename to" as the source of truth
for old and current filenames when complex renames are done. This
matches the logic that git itself uses to parse patches; the contents
of the diff --git line are merely viewed as a simplest-case
prediction, with renames handled later should it not match.

The diff parser already had logic to parse "rename from" / "rename to"
lines and extract their information. As such, this diff consists
primarily of removing logic from the splitGitDiffPaths method, and
allowing it to quietly fail.

This resolves two ambiguity mentioned in comments and tests: in
renaming "old file old" to "file", as well as the renaming of
"old file with spaces" to "new file with spaces" when neither are
quoted.

Test Plan

arc unit. Many of the existing test cases no longer
applied to the splitGitDiffPaths method; they were moved into a new
test method which also supplies values for "rename from" and "rename
to" lines.

As noted in the summary, this alters one of the expected values of an
existing test. Specifically, the following diff is a file addition of
file with custom prefixes, and not the addition of "dst/file":

diff --git src/file dst/file
new file mode 100644
index 0000000..1269488
--- /dev/null
+++ dst/file
@@ -0,0 +1 @@
+data

Diff Detail

Repository
rARC Arcanist
Branch
up-different-quoting
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 13332
Build 17101: Run Core Tests
Build 17100: arc lint + arc unit

Event Timeline

alexmv retitled this revision from to Parse git renames with inconsistent quoting or custom prefixes.
alexmv updated this object.
alexmv edited the test plan for this revision. (Show Details)
epriestley added a reviewer: epriestley.
epriestley added inline comments.
src/parser/ArcanistDiffParser.php
292

By convention, prefer $filename !== null over isset($filename) to test for null.

1319

This is kind of a junk method name now, but ehh.

1330–1332

By convention, prefer the more explicit "{$var}" over "$var" when embedding variables in strings. This should be lint-enforced but T11081 is trickier to resolve than most lint stuff is.

src/parser/__tests__/ArcanistDiffParserTestCase.php
638

"thisi s" -> "this is"

664

Prefer pht() on any human-readable text, even exceptions / unit tests / stuff you're 100% sure is completely developer-facing forever. See here for rationale:

https://secure.phabricator.com/book/phabcontrib/article/internationalization/#exceptions-and-errors

This revision is now accepted and ready to land.Aug 17 2016, 12:27 AM
src/parser/ArcanistDiffParser.php
292

:nod: I was patterning off of the other issets in the area.

1319

Renamed.

1330–1332

Sure; fixed, and noted for future changes.

src/parser/__tests__/diff/custom-prefixes-edit.gitdiff
6–18

To reduce git grep false-positives, I've made this into bland content.

alexmv edited edge metadata.

Address review comments