Page MenuHomePhabricator

Parse git renames with inconsistent quoting or custom prefixes

Authored by alexmv on Aug 17 2016, 12:05 AM.



The previous parser failed when only one of the old/new filenames was
quoted, as with a rename of a Unicode filename to a non-Unicode
filename, or vice versa. It also incorrectly parsed custom prefixes,
even going to far as to encode this logic in the tests: a diff of
"src/file dst/file" which is not a rename should not be recorded as
changing "src/file", but rather "file".

Switch to using the "rename from" / "rename to" as the source of truth
for old and current filenames when complex renames are done. This
matches the logic that git itself uses to parse patches; the contents
of the diff --git line are merely viewed as a simplest-case
prediction, with renames handled later should it not match.

The diff parser already had logic to parse "rename from" / "rename to"
lines and extract their information. As such, this diff consists
primarily of removing logic from the splitGitDiffPaths method, and
allowing it to quietly fail.

This resolves two ambiguity mentioned in comments and tests: in
renaming "old file old" to "file", as well as the renaming of
"old file with spaces" to "new file with spaces" when neither are

Test Plan

arc unit. Many of the existing test cases no longer
applied to the splitGitDiffPaths method; they were moved into a new
test method which also supplies values for "rename from" and "rename
to" lines.

As noted in the summary, this alters one of the expected values of an
existing test. Specifically, the following diff is a file addition of
file with custom prefixes, and not the addition of "dst/file":

diff --git src/file dst/file
new file mode 100644
index 0000000..1269488
--- /dev/null
+++ dst/file
@@ -0,0 +1 @@

Diff Detail

rARC Arcanist
Lint Passed
Tests Passed
Build Status
Buildable 13332
Build 17101: Run Core Tests
Build 17100: arc lint + arc unit

Event Timeline

alexmv retitled this revision from to Parse git renames with inconsistent quoting or custom prefixes.
alexmv updated this object.
alexmv edited the test plan for this revision. (Show Details)
epriestley added a reviewer: epriestley.
epriestley added inline comments.

By convention, prefer $filename !== null over isset($filename) to test for null.


This is kind of a junk method name now, but ehh.


By convention, prefer the more explicit "{$var}" over "$var" when embedding variables in strings. This should be lint-enforced but T11081 is trickier to resolve than most lint stuff is.


"thisi s" -> "this is"


Prefer pht() on any human-readable text, even exceptions / unit tests / stuff you're 100% sure is completely developer-facing forever. See here for rationale:

This revision is now accepted and ready to land.Aug 17 2016, 12:27 AM

:nod: I was patterning off of the other issets in the area.




Sure; fixed, and noted for future changes.


To reduce git grep false-positives, I've made this into bland content.

alexmv edited edge metadata.

Address review comments