Page MenuHomePhabricator

Commit message text formatting in remote git repository (SSH) is not preserved
Closed, ResolvedPublic

Assigned To
Authored By
GMTA
May 12 2014, 1:44 PM
Referenced Files
F154619: Screen_Shot_2014-05-14_at_5.27.23_AM.png
May 14 2014, 12:29 PM
F154613: Screen_Shot_2014-05-14_at_5.21.38_AM.png
May 14 2014, 12:29 PM
F154616: Screen_Shot_2014-05-14_at_5.23.03_AM.png
May 14 2014, 12:29 PM
F154621: Screen_Shot_2014-05-14_at_5.28.59_AM.png
May 14 2014, 12:29 PM
F154599: diffusion-show-commit.png
May 14 2014, 11:36 AM
F154597: git-show-commit.png
May 14 2014, 11:36 AM

Description

After I push a commit with a message containing newlines, I update the git repository in phabricator using bin/repository update [call sign here]. When I view the new commit using Diffusion, the newlines are not preserved causing the commit message to be squashed into one single line.

Event Timeline

GMTA renamed this task from Commit message text formatting in remote repository (SSH) is not preserved to Commit message text formatting in remote git repository (SSH) is not preserved.
GMTA raised the priority of this task from to Normal.
GMTA updated the task description. (Show Details)
GMTA added a project: Diffusion.
GMTA updated the task description. (Show Details)
GMTA added a subscriber: GMTA.

+@joshuaspence

T4813 essentially reports the opposite issue, although I think it's discussing text in READMEs specifically.

I guess I'll enable "preserve-linebreaks" here and disable it in READMEs and maybe everyone will be happy?

@joshuaspence, was T4813 specific to READMEs? We don't preserve currently linebreaks in remarkup in commit messages, from what I can see.

The likely outcome of such a change is that everyone is unhappy and I have to add configuration. T.T

I think READMEs which we interpret as remarkup should unambiguously not preserve linebreaks. This is consistent with GitHub and the current rule is silly. You can use "README.txt" to specifically select text rendering.

Commit messages are more ambiguous. I think there are two major possibilities:

  • Leave them rendering as remarkup, but preserve linebreaks. This will work fine for Phabricator (we don't linewrap commit messages) and fine for installs which do linewrap and want line wrapping (it will be preserved) but not for installs that linewrap but use remarkup and want it to unwrap in the web view, like T4813 maybe originally discusses.
  • Provide options, like: display commit messages in this repository as: plain text; text-mode remarkup; normal remarkup (default). Options are terrible but this is maybe not completely awful. We probably need/want a little dropdown to switch modes too, I guess.

D9049 addresses the "linebreaks are incorrectly respected in READMEs" issue.

My personal opinion is that manual text wrapping in any system because of an arbitrary character limit is just plain awful, but a lot of git users seem to apply it. I think a sane default should be text-mode remarkup (suggestion 1). If users would like to automatically unwrap manually wrapped texts, wouldn't that be served best with a non-default option?

I may not understand the issue here, but I can't replicate this issue. If I view a commit in Diffusion which has linebreaks in the commit message, then I see linebreaks in Diffusion as well. I'm going to reopen T4813 because it is not quite the same as this.

{F154584}

I think it has something to do with the double breaks being parsed while single breaks are ignored. Two views of the same commit in git and Diffusion:

git-show-commit.png (96×450 px, 4 KB)

diffusion-show-commit.png (55×1 px, 4 KB)

The line breaks in my case are all single line breaks. Maybe @epriestley can shed some light on what is the expected behaviour?

I'm merging T4813 back in here because I'm pretty convinced they're conflicting requests. You both seem to be seeing different behaviors which are the opposite of your desired behaviors. The behaviors I expect in both of those specific cases are the opposite of the behaviors those screenshots reveal. Let me look at this...

Okay, I think I know what's going on here. Here are some examples of what happens right now, illustrating why this isn't trivial:

First, this was my original test case earlier on, which does not preserve linebreaks (@joshuaspence's desired behavior):

Screen_Shot_2014-05-14_at_5.21.38_AM.png (172×467 px, 26 KB)

Here's a case similar to @joshuaspence's, which does preserve linebreaks. This is @GMTA's desired behavior.

Screen_Shot_2014-05-14_at_5.23.03_AM.png (173×564 px, 29 KB)

Here's an example similar to @GMTA's example, which does not preserve linebreaks again:

Screen_Shot_2014-05-14_at_5.27.23_AM.png (173×580 px, 30 KB)

But here's a similar example which does:

Screen_Shot_2014-05-14_at_5.28.59_AM.png (187×527 px, 28 KB)

Can you figure out the rule?

The answer is that Git didn't add the %B ("unwrapped subject and body") format specifier for git log until relatively recently, so we pull messages using %s%n%n%b, under the assumption that these are approximately the same. In fact, they are not at all in this case: git unwraps the beginning of the commit message.

>>> orbital ~/repos/POEMS $ git log HEAD^ -n1 --format='%B'
Grocery List:
- Apples
- Bananas
- Cherries

>>> orbital ~/repos/POEMS $ git log HEAD^ -n1 --format='%s'
Grocery List: - Apples - Bananas - Cherries

Without using %B (or doing something more aggressive/horrible with the parser), I don't think we can retrieve the wrapped message in older Git, so fixing that issue may be blocked on T3046.

If we fixed that issue somehow (e.g., wave a magic wand and cause everyone to update to a git with %B, then switch from %s%n%n%b to %B), the behavior would then be what @GMTA wants (that is, linebreaks preserved). (This is actually the expected behavior right now, I was misled by the "line1\nline2\nline3" test earlier.)

This directly conflicts with what @joshuaspence wants (that is, linebreaks ignored). In this specific case, it actually wouldn't matter: we handle lists properly regardless of linebreaks, as in the 4th example. However, I assume you'd still want this even if the %B issue was fixed, @GMTA (that is, you have other problematic commit messages which you'd prefer to preserve linebreaks on, and which are bad for reasons other than improper list formatting)?

If so, I think we have to add an option to resolve this, or I have to reject one of these behaviors. This is sort of moot until the %B thing is resolved somehow since the current behavior is just inconsistent and broken, but if we fix %B the behavior will be specifiable and consistent and these two requests will be at odds with one another.

However, I assume you'd still want this even if the %B issue was fixed, @GMTA (that is, you have other problematic commit messages which you'd prefer to preserve linebreaks on, and which are bad for reasons other than improper list formatting)?

Looking back in our git history, I think the problem is most prevalent with lists in commit messages without that extra newline at the beginning. Most of the other messages follow the git commit message "best practices" so they have a summary line at the beginning, followed by two newlines. If we keep to those guidelines, this issue is resolved for us.

The idealist in me however, would like those messages to be displayed exactly like the 'git show' output, but feel free to close this task!

With regards to git and %B, it definitely seems that %B is more desirable than %s%n%n%b. We could default to %B but fall-back to %s%n%n%b if the git version installed doesn't support %B (and probably raise a setup issue saying that "git version x.y.z is recommended").

I wouldn't say that my request is in conflict with this request. It seems that this ticket is with regardless to how Phabricator is parsing commit messages. My request is purely aesthetic with regards to how remarkup is rendered for display (and this extends beyond commit messages).

We could default to %B but fall-back to %s%n%n%b if the git version installed doesn't support %B (and probably raise a setup issue saying that "git version x.y.z is recommended").

Yeah, I think we should do this -- but don't currently have version detection, until T3046.

I wouldn't say that my request is in conflict with this request.

Except for the bug (which affects both requests) this is a rendering issue in both cases. When a commit message contains this raw text:

This is a short
freeform poem
about spaghetti.

...you would like it to render like this in the UI:

This is a short freeform poem about spaghetti.

...while the original request would like it to render like this in the UI:

This is a short
freeform poem
about spaghetti.

These are not compatible.

Are we at an impasse here? How can I contribute to this task?

Fixing the bug is blocked by VCS binary version detection (T3046).

Fixing that bug may be complicated because I think we may not be able to just run git --version to figure it out. In particular, the binary version on one host may inform decisions on other hosts (T2783), and be a per-repository artifact rather than a per-machine artifact. So this may really be blocked by T2783, T5833, etc.

You can work around this locally by replacing %s%n%n%b with %B in DiffusionLowLevelCommitQuery (assuming you only run recent versions of Git). You could fix this in the upstream without fixing T2783 first by finding an alternate way to get the raw commit message without using %B and without issuing a second command. I suspect one does not exist.

Hi I recently started experimenting with phabricator with my team and stumbled on this issue.

Text based markup format such as markdown, rst, and even tex often treat a single linebreak as basically a space, and only honor two or more consecutive linebreaks as a true paragraph break.

Is it possible for remarkup to adopt this strategy, as opposed to treats even singe linebreaks as a paragraph breaks?

People working under a traditional editor such as emacs and vim often manually wrap (or reflow) their paragraphs while editing a differential message and commit message.

People working with an editor that autowrap their lines without inserting linebreak characters, such as the HTML textarea and Word alikes, tends to insert two consecutive linebreaks anyway to separate paragraphs, like what I am doing now.

Moreover, the current behavior of Remarkup is inconsistent in how it treats simple text paragraphs and lists. Text paragraphs have their single linebreaks preserved, but lists actually concatenate line breaks. Example below:

Regular paragraphs

Manually line-wraped long text long text long 
text long text long text text long text long 
text long text long text text long text longtext

A new paragraph, separated by two linebreaks
Manually line-wraped long text long text long 
text long text long text text long text long 
text long text long text text long text longtext

gets turned into

Manually line-wraped long text long text long
text long text long text text long text long
text long text long text text long text longtext

A new paragraph, separated by two linebreaks
Manually line-wraped long text long text long
text long text long text text long text long
text long text long text text long text longtext


But a list with line breaks has the breaks joined

- Manually line-wraped long text long text long 
  text long text long text text long text long 
  text long text long text text long text longtext

  A new paragraph, separated by two linebreaks 
  Manually line-wraped long text long text long 
  text long text long text text long text long 
  text long text long text text long text longtext

- A new item
  • Manually line-wraped long text long text long text long text long text text long text long text long text long text text long text longtext

    A new paragraph, separated by two linebreaks Manually line-wraped long text long text long text long text long text text long text long text long text long text text long text longtext
  • A new item

This convention might help with the GIT commit log display problem.

I can't find any changelog in any version of Git which mentions the introduction of %B. The %B behavior was introduced in this commit in March, 2010:

https://github.com/git/git/commit/1367b12ad623e28546ba40c435015d94e7fbb248

The earliest release tag that commit is reachable from is 1.7.2-rc0. Git 1.7.2 was released in September, 2010, so this makes sense chronologically.

At time of writing, the code is present on the v1.7.2 tag and not present on the v1.7.1 tag, so I'm going to conclude this is likely just missing from the 1.7.2 changelog and condition this behavior on git >= 1.7.2.

To follow up on this:

Is it possible for remarkup to adopt this strategy, as opposed to treats even singe linebreaks as a paragraph breaks?

Yes, this is possible. In fact, Remarkup already adopts this strategy in some contexts.

The problem posed by this report is that different users disagree about which strategy we should use.

Some users feel a\nb should treat the newline as a space.
Some users feel a\nb should treat the newline as a newline.

See T5028#58832, above, for more discussion of this.

This is basically independent of the actual buggy behavior here (unfaithful extraction of raw commit messages from Git because %B does not exist prior to v1.7.2).

D21027 should change the extraction behavior to be as faithful as is realistically possible across all versions of Git. In versions of Git 1.7.2 or newer, extraction should now be completely faithful to the raw message.

Phabricator could separately have some sort of configuration option to control the display wrapping behavior, perhaps like this:

In this repository, display commit messages as [ raw text          V].
                                               [ unwrapped text     ]
                                               [ raw remarkup       ]
                                               [ unwrapped remarkup ]

But that's outside the scope of this task and I don't believe there are any outstanding customer requests for it, so I have no current plans to pursue it. If you're a customer and would like support for this, file a feature request against your support channel.

I do plan to revisit this UI in T9713, but it's very unlikely that my revisit will include the addition of display options (see T8227).