Page MenuHomePhabricator

Avoid or easily handle 'duplicate' audits
Closed, ResolvedPublic

Description

Several times now, we've had a situation where audits are re-sent due to a rebase. Basically, what seems to happen is that someone rebases on top of an old commit -- by mistake -- and when they push, any audits that were rebased get re-opened, since the sha now differs.

Here's a report how one user got into this state. I'm not sure I totally understand it, but hopefully you guys will. :-)

I have a feature branch (local+remote) that I'm creating diffs from. I went to land one and got "Usage Exception: There are multiple revisions on feature branch 'myfeaturebranch' which are not present in 'parentbranch'" with a list of all of the diffs I had done off of parentbranch. My guess is that happened because at one point I renamed parentbranch from its original name. I then switched to parentbranch and did "git pull featurebranch." Then I did "git push origin parentbranch" but for some reason it said I was behind and needed to pull. I did "git pull origin parentbranch" and it pulled in every commit that I authored in parentbranch or had been rebased into parentbranch. I'm pretty sure that what happened was that it made new commits with me as the committer even for things that I didn't author. I should have stopped there but I did "git push origin featurebranch" which kicked off new duplicate audits. It also caused me to close someone elss's open revisions.

The ideal solution would be to not create the duplicate audits in this scenario, but that may be difficult. A 90% solution would be to maybe make it easy to identify audits where the author and committer differ, and automatically close them all? Something that was like "Find all audits where the author and committer differ and were created less than X minutes ago." would be a useful search, especially if combined with an 'accept all' button or some such. Then we'd just point people to that whenever we notice a mistaken rebase+push like this.

Event Timeline

csilvers updated the task description. (Show Details)
csilvers added a project: Audit.
csilvers added a subscriber: csilvers.
chad changed the visibility from "All Users" to "Public (No Login Required)".Jul 3 2015, 3:59 AM
eadler added a project: Restricted Project.Aug 7 2016, 8:02 PM
epriestley claimed this task.

My merge of T6557 here was questionable, but I think this is a case where the audits are basically correct and the remedy is better cleanup tools.

Since this was filed, bin/audit delete has gotten fancier, and I think it can pretty reasonably handle this situation now (we also haven't seen other reports of this particular mishap occurring that I can recall). For example, you could do something like this:

$ ./bin/audit delete --dry-run --repositories rLOCKS --min-commit-date 2016-11-01
     98007 hector           Audit Required   rLOCKSa63ed08bd2f6: Merge branch 'master' into feature
     98005 hector           Audit Required   rLOCKS67f50d8bb067: create master
     97799 hector           Audit Required   rLOCKSb8674c6bf0e8: asdf
     97541 O38              Audit Required   rLOCKS3fe3c3444fc8: Found another loud duck.
     97542 hector           Audit Required   rLOCKS3fe3c3444fc8: Found another loud duck.
     97537 O38              Audit Required   rLOCKSa909f2ecb79d: quux
     97540 hector           Audit Required   rLOCKSa909f2ecb79d: quux
     97536 O38              Audit Required   rLOCKS256c7371223d: Found a loud duck.
     97539 hector           Audit Required   rLOCKS256c7371223d: Found a loud duck.
     97535 O38              Audit Required   rLOCKS0e78cd01a724: Found a loud duck.
     97538 hector           Audit Required   rLOCKS0e78cd01a724: Found a loud duck.
     97533 O38              Audit Required   rLOCKSc410854ca761: Found a loud duck.
     97534 hector           Audit Required   rLOCKSc410854ca761: Found a loud duck.
     97530 O38              Audit Required   rLOCKSe0630d2d983f: Found a loud duck.
     97532 hector           Audit Required   rLOCKSe0630d2d983f: Found a loud duck.
     97531 hector           Audit Required   rLOCKS250045230378: wip
     97529 hector           Audit Required   rLOCKS5376755114b9: wip
     97528 hector           Audit Required   rLOCKS97356eff41da: WIP

Then either manually prune anything legitimate out of that list (if all the problematic audits are more or less in a contiguous block) or filter it against a list of commits where the author and committer differ from git log, and finally pipe the IDs back into bin/audit delete --ids.

The proposed heuristic (different author and committer) feels pretty shaky, since if you rebase some of your own commits they'll have the same author and committer. I would guess this is somewhat-common in mis-rebases from master? I'm conceptually open to adding more flags to bin/audit delete but it seems like we're at least in the ballpark of having a reasonable way to respond to this now. If we added more stuff here, I'd rather try to find more general constraints we could add (like --hashes or --branches?) than --probably-a-rebase-because-the-author-and-committer-differ or whatever.

Another attack on this which isn't hugely viable today but may be by the time anyone else reads this is using diffusion.commit.search to identify audits, manipulating them in JSON freely using whatever tools you prefer, then piping the IDs into bin/audit delete.

The proposed heuristic (different author and committer) feels pretty shaky, since if you rebase some of your own commits they'll have the same author and committer. I would guess this is somewhat-common in mis-rebases from master?

In our case, this problem situation happens when new employees are misconfigured to rebase instead of merge. They tend not to have existing audits at that point. (Or if so, very few.) Once they fix up their configs the problem doesn't typically happen again.

That said, I'm happy to do the comparing myself. Next time this occurs I'll see if I can write up a script using --dry-run and a bunch of fancy git-fu.