Page MenuHomePhabricator

Improve parsing of unusual "git diff --no-index --no-prefix" diffs
Open, WishlistPublic

Description

On some special cases ArcanistDiffParser fails to parse filenames out of the patch.

Data

For example try feed following patch to diff parser:

diff --git diff_staging/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.json diff_staging/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.new.json
index 9fd44d5..a3bdf2e 100644
--- diff_staging/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.json
+++ diff_staging/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.new.json
@@ -1,3 +1,3 @@
 {
-    "reason": "arstarst", 
+    "reason": "ienen", 
 }
\ No newline at end of file

JSON representation of the patch above:

"diff --git diff_staging\/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.json diff_staging\/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.new.json\r\nindex 9fd44d5..a3bdf2e 100644\r\n--- diff_staging\/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.json\r\n+++ diff_staging\/solo_score_timeout.10f178f2a16f4143aedb48f9dfaba3fe.1489777879750.new.json\r\n@@ -1,3 +1,3 @@\r\n {\r\n-    \"reason\": \"arstarst\", \r\n+    \"reason\": \"ienen\", \r\n }\r\n\\ No newline at end of file"

Reproduction steps

  1. Open https://secure.phabricator.com/conduit/method/differential.createrawdiff/ (ArcanistDiffParser is used in this API method)
  2. Fill JSON representation from above to the diff field.
  3. Submit the form.

Expected

Created a diff.

Received

An error: "#1048: Column 'filename' cannot be null"

Versions

This bug was introduced with rARCd0957c344156356123048be2219fb95a54c89a85 - every Phabricator instance with this version of Arcanist or newer has mentioned problem.
Reverting the file to previous version solves the problem.

Event Timeline

Pawka updated the task description. (Show Details)

@epriestley git diff --no-index --no-prefix -U2000 $old_file_path $new_file_path

Why is git diff --no-index being used instead of and diff, and why is --no-prefix being used?

--no-index - diff is created outside of repository by comparing directories.
--no-prefix - I'm not sure about this - some historical reasons and code in the service which generates diffs. But as I understand this was supported since D3744 and until rARCd0957c344156356123048be2219fb95a54c89a85.

epriestley triaged this task as Wishlist priority.Jul 3 2017, 10:47 AM

That change fixed a lot of bugs, so we aren't going to revert it: reverting it would break all the things it fixed and added test cases for. --no-index --no-prefix diffs have never been part of the test suite and it is just coincidence that it worked before that change.

We'll fix this, but since it never occurs on any normal flow and requires explicitly selecting multiple unusual flags (--no-index, --no-prefix) it's difficult to prioritize. Omitting the --no-prefix flag is likely a workaround until the parser is updated.

epriestley renamed this task from ArcanistDiffParser fails to parse filenames to Improve parsing of unusual "git diff --no-index --no-prefix" diffs.Jul 3 2017, 10:48 AM

I'm getting the same #1048: Column 'filename' cannot be null running just regular "arc diff". It used to work and then at some point after updates started failing.

P2077