Page MenuHomePhabricator

Fetch SVN property changes for Diffusion
Needs ReviewPublic

Authored by altendky on Mar 10 2014, 1:34 AM.
Tags
None
Referenced Files
F14483627: D8478.id24477.diff
Mon, Dec 30, 7:49 PM
Unknown Object (File)
Sun, Dec 15, 5:59 PM
Unknown Object (File)
Fri, Dec 13, 9:47 AM
Unknown Object (File)
Sun, Dec 8, 4:07 PM
Unknown Object (File)
Wed, Dec 4, 6:53 AM
Unknown Object (File)
Wed, Dec 4, 6:53 AM
Unknown Object (File)
Wed, Dec 4, 6:53 AM
Unknown Object (File)
Wed, Dec 4, 6:53 AM

Details

Summary

Fixes T4255. Previously only the file content was fetched and as such no properties were filled out in Diffusion. This fetches old and new properties and adds them to the Arcanist diff object for later display.

Test Plan

Being my first contribution I'm not quite sure where to put a test... perhaps you could point me in the right direction now that I've made it this far? Thanks.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

After completing this (my first contribution attempt) I looked around again with a little more comfort and familiarity and it seems this whole function may be a bit backwards. It fetches the entire source for both new and old and then creates a diff from that. I added fetching of new and old properties as well. I think there is code in ArcanistDiffParser.php that may actually parse a raw SVN diff which would make this a whole lot simpler and faster... but this at least seems to work for my cases and I need to work on work stuff tomorrow. :]

This could also become a foreach (array('old', 'new') as $age) loop. It is very repetitive as it is now.

src/applications/diffusion/conduit/ConduitAPI_diffusion_diffquery_Method.php
121 ↗(On Diff #20090)

Throughout I need to revert to local_variables_with_underscores formatting.

altendky edited edge metadata.

Correct variable names to underscore_format

Are the properties present in plain svn diff? We have code to parse raw diffs, the raw diffs just don't have all the information we want. If they have all the relevant property information, it would be much simpler to run one svn diff instead of all this.

This construction is unsafe:

$this->buildSVNFuture($new, 'propget '.$p);

If I set a property named, for example:

`rm -rf`

...it will execute arbitrary commands when the diff is viewed.

altendky edited edge metadata.

Remove .local copy

Sorry for the mess, I need to neaten up my Git skills.

I saw the other code for parsing raw diffs after I put this together so I'll go look into using it. It does seem to have some ability to deal with the svn diff output for properties (which is included). My SVN server is extra slow so I will likely benefit even more than most from reducing the number of futures. I should also have mentioned the security hazard immediately so I wouldn't forget it. Thanks for being sure to bring it back up for me.

When I find time to get back to this, should I update this diff or create a new one?

Get all SVN properties for a revision at once

(Actually) Get all SVN properties at once

While trying to figure out how to get the properties with one fell swoop (SO it was suggested that SVN bindings could be better. Is there a particular reason that the SVN CLI was used instead?

I think this may be ready for another review? The security hazard should be removed and the existing diff parsing is being leveraged. Thanks for your time.

Correct 'Only variables should be passed by reference' warning

Actually correct 'Only variables should be passed by reference' warning

Am I on the right track here? Is this of interest, or should I plan on maintaining it in a branch for myself? Thanks for your consideration.

aik099 added inline comments.
src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

@altendky, was there any reason to do it like this instead of solution you've explained in D8478#56149 ?

The current diff approach caused exception for me because particular file wasn't existing at 0 revision:

Command failed with error #1!
COMMAND svn --non-interactive --no-auth-cache --username 'xxxxx' --password 'xxxxx' diff -r 0:'805' --depth empty 'svn://repo.url.com/tools/cron.php@805' 

STDOUT 
(empty) 

STDERR 
svn: Unable to find repository location for 'svn://repo.url.com/tools/cron.php' in revision 0
src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

The point is that the file does not exist at rev 0 and therefore diff will provide the entire contents of the properties at the requested revision. Clearly, not all SVN are equal in this regard though. :[ I see your issue when I manually use SVN client 1.6.11 but my server is presently running 1.8.9 and works. What version are you using?

But, to your actual question... It's been awhile but I think that the way I have it here now provides standard diff formatted results that can be parsed by the pre-existing diff parser. The proplist method would require new parsing code. Not necessarily the end of the world to add more parsing code but it seemed good to avoid if possible. In this case I guess the cost is backwards compatibility with SVN clients. Sorry this is causing you trouble.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

Thanks for detailed explanation.

The point is that the file does not exist at rev 0 and therefore diff will provide the entire contents of the properties at the requested revision.

I was hoping for that to happen according to several StackOverflow discussions.

What version are you using?

My server and client are both 1.6.x. I guess this 0-revision trick (hopefully it's documented somewhere in SVN book) was introduced after 1.6.x releases.

It's been awhile but I think that the way I have it here now provides standard diff formatted results that can be parsed by the pre-existing diff parser.

I've got that impression as well, when output properties in proplist and diff commands differed radically.

Sorry this is causing you trouble.

No trouble at all. Actually every differential revision (that isn't merged to upstream) I wanted to merge to my Phabricator fork had issues with SVN 1.6 support in one way or another. So this was just a matter of detecting that an SVN version-specific command is used.

In this case I guess the cost is backwards compatibility with SVN clients.

Trick described at http://stackoverflow.com/questions/1632165/svn-how-to-get-the-first-revision-of-a-file would allow (with a separate SVN call, but I haven't tested it) get 1st revision file/folder was introduced in. I guess then I can put it instead of 0 in that diff command and all should be well.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

If the file/directory were added with properties then the diff would be incomplete then? Could still get diff information but not the full context that Phabricator usually operates with (IIRC). That said, the property display doesn't do line or word highlighting and Differential doesn't provide full-context on properties either.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

If the file/directory were added with properties then the diff would be incomplete then?

You mean if it was added without properties, but properties were added in later commits, then yes. Same would be in your current implementation as well.

Could still get diff information but not the full context that Phabricator usually operates with (IIRC).

Are you sure? Doesn't Phabricator support context expansion on SVN property diffs?

Can you attach various screenshots on how SVN property display would be rendered? E.g.:

  • adding/removing/changing single-line property
  • adding/removing/changing multi-line property
  • when file has 2+ properties, but only one of them was changed

I think that instead of getting initial set of SVN properties from 1st file revision we should be getting them from previous file revision (previous to one we're diffing with), but it's not always this revision minus 1 especially when only sub-path of SVN repository is imported in Phabricator (I have a method for doing so in https://github.com/aik099/phabricator/commit/5db17bc94b4d623ebfccc4756918494cb81bcdcf). In such case we get 100% context and needed property changes. In attempt to determine previous revision we should also account for case, when file/folder was added in this revision only, so the svn log won't return anything.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

Perhaps I can setup a demo later but for now I'll just comment. The present implementation diffs between zero and the old revision to get the entire property set of the old revision and then between zero and the new revision to get the entire property set of the new revision. This should get the full context of each revision, though it may be a bit misleading to use 'diff' to get everything rather than it's normal use for, well, just differences. Diffing instead between creation or last change of a file and it's new revision will only display what has been changed between them therefore not providing anything beyond the SVN provided context. So, the method here will get the full content of both old and new properties regardless (since diffing against a revision where the file does not exist) but diffing between existing revisions may not. If you create a file with properties and then change only one line of one property the diff method you propose may not get the rest of the context of the property.

In short, to get the diff you need only compare the old and new revision, but to get the full context you must get the entire property set of each revision being compared. I thought what I did here matched most closely with the existing handling of the file content (get the full content of each revision and provide it to the renderer). As of the Phabricator version I am running (a few months back), the renderer just shows old in red and new in green without any processing of what has changed within the text (line/word diffing).

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225
  • diffs between zero and the old revision to get the entire property set of the old revision
  • between zero and the new revision to get the entire property set of the new revision

Ah, I missed that one while reading the code. I thought it's diffing between old and new revision.

So then I just replace 0 with 1st revision in which file was created (see link to SO discussion I've mentioned before) and we're all set, right?

As of the Phabricator version I am running (a few months back), the renderer just shows old in red and new in green without any processing of what has changed within the text (line/word diffing).

That's a pity. I guess this needs to be looked/discussed separately.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

So then I just replace 0 with 1st revision in which file was created (see link to SO discussion I've mentioned before) and we're all set, right?

What happens when a file is added with properties and then they are later changed? The diff may only show the changes but not the content that was initially committed. Granted, most properties are small and standard SVN context might get everything, but it would be better not to bank on that.

I had meant to look at the rendering of line/word diffing of properties after this but it hasn't been accepted and I haven't otherwise gotten to it. At this point the Differential properties issues are probably higher priority for me personally anyways.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

What happens when a file is added with properties and then they are later changed? The diff may only show the changes but not the content that was initially committed. Granted, most properties are small and standard SVN context might get everything, but it would be better not to bank on that.

But you said yourself, that 0 would be replaced with 1st revision file was added in. So what I proposing shouldn't change anything?

I had meant to look at the rendering of line/word diffing of properties after this but it hasn't been accepted and I haven't otherwise gotten to it.

I guess same SVN properties diff engine is used on Differential Revisions as well, so you can test it there.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

But you said yourself, that 0 would be replaced with 1st revision file was added in. So what I proposing shouldn't change anything?

Looking back I'm not sure where I said that. Could you refresh my memory with a link/quote? My intent is to simply get the entire old property set and the entire new property set. I am only using the misleading diff-vs-rev0 approach because I was unable to find an SVN command to directly get the entire set of properties for a file/directory revision in one call (maybe also because of the handiness of the already-parseable diff format). Diffing against 0 (aka, an empty set of properties) will certainly not do the same thing as diffing against the first revision which may or may not have properties.

src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

Looking back I'm not sure where I said that. Could you refresh my memory with a link/quote?

Here you go: D8478#inline-41230 (The point is that the file does not exist at rev 0 and therefore diff will provide the entire contents of the properties at the requested revision.).

Ah, now I get it. If a property was in 1st revision and never changed since then, then diff won't show it. So we're back at square 1 :( . Any idea how to solve that 0-revision problem using other command?

Currently it looks like parsing XML output of the svn://some.repo.com/path/to/folder/file could be much easier, than expected (4 lines): 1 line to create XML object and 3 lines for foreach.

<?xml version="1.0"?>
<properties>
<target
   path="svn://some.repo.com/path/to/folder/file">
<property
   name="test:multi_line_new_no_eol">line 1
line 2
line 3</property>
<property
   name="svn:ignore">Thumbs.db
.htaccess
</property>
<property
   name="test:multi_line_new">line one
line two
line three
</property>
<property
   name="test:single_line_new">this is single line</property>
</target>
</properties>
src/applications/diffusion/conduit/DiffusionDiffQueryConduitAPIMethod.php
225

Sounds like that might be a workable/better approach. I focused on my method because it seemed closest to the existing technique of using the CLI and it's output. Be sure to copy me (or associate to the ticket) if you make another revision. :]

I'm done. Here is how it looks:

Case #1: 3 svn properties were added
Issues:

  • no No newline at end of file message for property which value doesn't end with new line
  • no way to see not changed properties (context expansion)
  • no clear separation (odd/even row coloring) of property values in diff

Phabricartor_Commit_3PropertiesAdded.png (372×870 px, 55 KB)

Case #2:: value of multi-line property changed
Issues:

  • no indication of what was added ("line four" was added)
  • no indication of what was removed ("line two" was removed)

Phabricartor_Commit_MultilinePropertyValueChanged.png (282×874 px, 45 KB)

Conclusions: it appears, that current way of displaying SVN properties was implemented based on svn properties which don't have multi-line values (e.g. "svn:eol-style" and such).

You can merge following change https://github.com/aik099/phabricator/commit/848a0a1c9574ac86399d3acc5b3b9afb1278fcb5 to your Phabricator fork to see if it works on your commits.

Another point of interest could be showing all SVN properties of a viewed folder/file in Diffusion.