Page MenuHomePhabricator

Review comments lost or posted to wrong diff
Closed, ResolvedPublic

Description

For quick context, I am @yelirekim's PM at Uber ATC.

We have a recurring issue that I can't seem to reproduce, the ticket has been open on our install since February. Sporadically, we have users' diff comments either disappear completely, or appear on the wrong diff upon submission. The common thread here seems to be the users have multiple diffs open simultaneously in different tabs and somehow the comments seem to be getting mishandled.

Here is a description from one of our users:

I usually have multiple browser tabs all day long. Very likely multiple reviews open as well.
we do not see comments on the review we are posting to after we post. Before we post, the comments are all visible as unsubmitted. After we hit submit the page updates, all comments are gone however the action associated with the submit (for example request review) gets applied. So you see "requested review" with no comments.

It's kinda coming to a head where we have extremely angry people asking us why it's not fixed yet, and I understand why they are upset, because we generally have a high reverence for the review process here, and people put significant amounts of time into it, and then their comments are just "gone".

I did some scouring of the bug backlog and I believe T8852 is an instance of the same bug, and perhaps this user erroneously associated having the chat panel open as a part of the root cause. No one in our org uses Conpherence, but we are still seeing this happen.

I realize these aren't perfect "steps to reproduce" but this problem is fairly intermittent and yet when it does occur it can result in significant user frustration and loss of work. The high level steps seem to be

  1. Have several diffs open in different tabs simultaneously
  2. add comments and submit review feedback.

For the most part our users are using Chrome on Ubuntu, with a few using Chrome on Mac.

Event Timeline

When the Conpherence column is open, it activates a completely separate browsing mode (Quicksand) where all navigations occur over Ajax. This is expected to have weird bugs associated with it, but this mode is not active if the column is not open. If the users this is affecting have never opened the chat column, I suspect this is not the same as T8852.

It's vaguely possible that this is related to T5031. Differential uses an older draft system which is subject to some race conditions, but the effect of those conditions has historically been to fail to clear drafts after submission, not to remove content.

I haven't ever experienced this myself, and don't have any real ideas about what could be causing it. I primarily use Safari, but routinely have many dozens of tabs open. The form submission does not use any particular magic; I would generally expect any problems with it to affect every other application and every browser.

I'll make an effort to reproduce this but I suspect we're missing some key piece of information.

Here's how I attempted to reproduce this:


In Chrome, I opened 20 tabs with the same revision. I applied various state changes and comments to different tabs. Here are some examples:

Screen Shot 2016-06-03 at 11.50.26 AM.png (1×1 px, 183 KB)

Screen Shot 2016-06-03 at 11.50.28 AM.png (1×1 px, 183 KB)

I submitted these in random order while making more comments, reloading tabs, etc. Note that tabs do interact with one another through the draft system, but I was unable to produce any case where the textarea showed comments, I submitted, and the comments were not persisted on the same revision. Here's the outcome, showing a variety of state changes with comments preserved:

Screen Shot 2016-06-03 at 11.51.31 AM.png (815×1 px, 141 KB)


I opened 10 additional tabs with a different revision, then randomly swapped between the tabs belonging to each revision making comments, making state changes, reloading tabs, and submitting. I wasn't able to lose any data here, either:

Screen Shot 2016-06-03 at 11.57.02 AM.png (568×1 px, 109 KB)


I also wasn't able to reproduce this even with the chat column open. With the column open, I wrote comments and navigated between revisions, randomly navigating, going backward, editing the draft, or submitting. My comments were consistently preserved:

Screen Shot 2016-06-03 at 11.58.57 AM.png (1×1 px, 257 KB)

epriestley added subscribers: joshuaspence, sshannin.

Although I'm not sure this has the same root cause as T8852, I can't reproduce that either.

We can blindly move Differential to EditEngine, which may fix this indirectly, but this is a large amount of work.

Otherwise, I'm not sure how to move forward here without a reproduction case.

Thanks for the quick research/reply.

The lack of clear reproduction case is the reason we pushed back on our users for a few months. :) We didn't want to come to you with "not sure why, but this thing randomly happens.." However, unfortunately.. we're not sure why, but this thing randomly happens, ha. It has recurred enough that we are getting significant heat to find a solution.

Can you think of anything else that would help provide better detail as to exactly what's happening? I'm happy to work with users to collect detailed information on when/how this is occurring, but all we've gotten over the last few months is what I provided above ("I do a lot of reviews simultaneously, and sometimes this happens.")

I think that this has only ever happened on Chrome on Ubuntu, and to boot, we've seen client side Ubuntu bugs pop up in the past. (Brad do you remember where?)

Possibly could there be a browser bug which prevents inlines from being submitted?

Oh, are "comments" in this description inline comments, versus blocks of comment text?

Haha, okay. I may be headed down the wrong path -- T5031 is almost certainly not related here, because the draft system and EditEngine don't interact directly with inline comments. This is almost certainly also not Quicksand-related either if the inlines appear correctly originally (Quicksand could potentially mangle request in-flight).

I'll go on a more inline-focused fishing expedition and see if the stars align.

which ubuntu can we fire up? 16.04?

That's like... 2.00 revisions ago, by maths.

I emailed evan a "print page" pdf of the ticket off of our install.

:| yes. these are inlines.

Sorry for the lack of specificity. I don't do enough (any) code reviews myself to realize how important that distinction was in the UI and I just checked and none of the users specifically said inline in their support requests (but I'm certain that is what they meant.)

It might be good to round up any system "commonalities" from users who have experienced this. Perhaps a slowvote poll?

I have 14.10 on a VM to play with in the meantime.

Particularly useful, then, would be to find these inlines in the database and figure out what happened to them.

They are stored in the table differential_transaction_comment. You can identify them most easily if someone can remember some of the text:

SELECT * FROM differential_transaction_comment WHERE content LIKE '%some text someone remembers%';

If no one remembers any text, here are all comments a user has ever made:

SELECT * FROM differential_transaction_comment WHERE authorPHID = 'their user PHID';

The inline comments should still be present in the database: the only pathway which deletes the rows is bin/remove destroy. Deleting inline comments from the web UI only sets the isDeleted flag.

Of particular interest are the isDeleted, changesetID, revisionPHID and transactionPHID values of any affected comment.

There's no possible way that PDF is about inline comments, I think? How do inline comments appear on different reviews which presumably have different files? Or did they happen to affect the same files?

Did inline comments jump from "example.c, line 123" in one revision to "example.c, line 123" in another revision? That's like extreme ultra spooky ghost levels of black magic.

When the comments move, how are users losing work? Maybe this is really two different bugs, or the losing-work users don't realize they're moving? But there's no way they can be moving like that?

How do you click on a comment and end up on a different revision?

Another user description of this issue:

Example: Rob's comments from today around 4:05 pm on diff #24852 appear on that diff's cr, D7683, but a version of them also appears on D7438, which he didn't even have open in a tab at the time. This is always obvious because when you click on a comment it opens a different cr,

Obviously these DXXX refer to our install.

Edit: I'm guessing you saw this already from Mike's pdf. Nvm.

I was able to reproduce some flavor of this for inline comments with Quicksand enabled (i.e., the persistent Conpherence chat sidebar open). That's probably T8852, but I expect it to be specific to Quicksand / client-side JS state and impossible to hit without pressing the \ key to open the chat column. I should be able to fix this, but I can't repro it without opening the column.

I spoke with rkeelan/Rob. Here is what he had to say..

Did inline comments jump from "example.c, line 123" in one revision to "example.c, line 123" in another revision? That's like extreme ultra spooky ghost levels of black magic.

Yes, this issue definitely refers to inline comments. In your example, the comments might be on example.c line 123, but they are ending up in some random diff that might not even contain example.c. They are like, dead floating inline comments. (He is about to send me a link to a revision currently in this state and I'll provide screenshots, etc.)

When the comments move, how are users losing work? Maybe this is really two different bugs, or the losing-work users don't realize they're moving? But there's no way they can be moving like that?

Most likely you're correct. They aren't just "losing" the comments, they are getting attached to some random diff and they don't know where they went. Rob indicated that usually he will stumble across the comments on some other diff later that day.

How do you click on a comment and end up on a different revision?

I think it's literally that simple. He said that either there will be an arrow by the comment (screenshot to come) and clicking on the comment will open some other revision. Or it will be a link that doesn't do anything similar to an href=#.

Is he 100% sure he has never touched the \ key on his keyboard to bring up the chat sidebar?

Can you define 'never'? As in, is this the kind of thing that if you accidentally open conpherence once you're forever banished to the land of AJAX bug hell? Or do you mean, at no time during his review process? This has been going on intermittently since February so I am confident all users haven't done this every time.

But if it's a switch that gets flipped once permanently, that seems possible.

It's not permanent, but this sequence would cause the problem:

  • Open the sidebar by accident.
  • Click a link to a different revision.
  • Close the sidebar.
  • Do comment stuff. This page is tainted because it was navigated to with the sidebar open, even though the sidebar is no longer open.

Assuming D16031 is a common element, you can identify all comments with inconsistent state like this:

SELECT i.*, r.phid correctRevisionPHID
  FROM differential_transaction_comment i
  JOIN differential_changeset c ON i.changesetID = c.id
  JOIN differential_diff d ON c.diffID = d.id
  JOIN differential_revision r ON d.revisionID = r.id
  WHERE i.revisionPHID != r.phid\G

If that list appears to be exhaustive (contains all inline comments anyone has ever lost track of), D16031 probably prevents this, even if there is a second set of reproduction steps which don't involve Quicksand. After D16031 you'll immediately get an error before typing anything, so it might be slightly frustrating but no work should be lost.

You may be able to fix the comments by setting the revisionPHID column of differential_transaction_comment to the value indicated by correctRevisionPHID in the result set. Proceed with caution if you pursue this since a slip of the finger could move every inline to one revision or something like that.

If that list doesn't appear to be exhaustive, finding the lost comments would be helpful in identifying whatever else is going on here.

One of our users was definitely encountering the chat bar issue.

Another (who has been reporting this issue since ~Feb) was confident-but-not-certain that he has encountered it without toggling the chat window. They are going to monitor closely and let me know if they experience this again (..and it looks like you just fixed the chat window bug, so that should reduce false positives if there is some other unknown bug.)

epriestley claimed this task.

Presuming this is resolved now, let us know if you're still seeing issues.

I have not heard any issues from uses since we deployed this fix so I would assume it is resolved as well. Thank you!