Page MenuHomePhabricator

Write a script to discover orphaned PHIDs
Closed, WontfixPublic

Description

Every now and then I come across some borked data on our Phabricator install. Usually, this is a reference to some PHID which no longer exists, although I also found that we had a large number of commits (phabricator_repository.repository_commit) which didn't belong to any Phabricator repository. Some examples are as follows:

Whilst most of the time, this sort of data is not harmful in any way, in many cases it bothers me to the extend that I write some custom PHP or SQL to find these things and then fix them (usually by deleting them). I imagine it would be difficult to catch everything, but I would see benefit in some kind of ./bin/cleanup script that could at least list orphaned PHIDs and maybe (in some cases) delete them as well.

Event Timeline

joshuaspence assigned this task to epriestley.
joshuaspence raised the priority of this task from to Wishlist.
joshuaspence updated the task description. (Show Details)
joshuaspence added a project: Phabricator.
joshuaspence added a subscriber: joshuaspence.
eadler added a project: Restricted Project.Jan 8 2016, 11:11 PM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

I think it's very unlikely that we'll ever write some kind of script to just fix this stuff. I suspect such a script is impossible to write (i.e., the "correct" value for most fields is ambiguous) and that maintaining it is very difficult.

Few installs encounter these issues, and those that do almost always encounter them after running bin/remove destroy and confirming their way through a huge red skull ASCII art warning them not to press forward (in fairness, the art did not exist at the time this task was originally filed).

This script could simply detect these PHIDs, but few users would be able to do anything useful with that information.

Of the examples:

  • T5339 probably doesn't happen at HEAD anymore (bin/remove destroy is more thorough today) and caused no actual problems, it seems.
  • T4750 (rewriting history) is explicitly not something we want to support in the upstream.
  • T4733 was a one-time issue that was later fixed by a migration (D14825 at the latest, maybe earlier?).
  • T4718 is explicitly not something we want to support in the upstream (arbitrary manual modification of table data, not clear exactly what it was fixing).
  • T4710 doesn't really hurt anything.
  • T4745 is a consequence of destroying a repository.

T8830 has some more discussion of bin/remove destroy. In general, this command is not bin/time rewind and never will be. We do not expect to ever provide a command which completely rewinds time on an object, and strongly discourage destroying objects like repositories. bin/remove destroy has gained more and more severe warnings to underscore this over time. I'm not sure what the original motivation for destroying repositories was in these cases.

When you do destroy a repository or user, we expect this to leave dangling edges in other applications (for example, activity by the user will show "Unknown User changed the title from X to Y"). I don't consider this harmful and never expect to provide tools to remove it (and it legitimately reflects history, and disabling the user instead of destroying them will leave it readable). It's possible other similar links may eventually be cleaned up in some cases, but I don't really have any plans to pursue this.

In the specific case where a destroyed object was part of a policy, we have no option but to lock the object down. You can bin/policy unlock it. I generally think you shouldn't be destroying objects which are actively involved in policies for other objects you care about, and that our behavior is the only reasonable one (other than, perhaps, querying every policy for every object and preventing the user from moving forward if the object is in use anywhere).

If anything, I'd be inclined to perhaps remove bin/remove destroy from the upstream so that it's explicitly clear that you're voiding the warranty when you destroy things.

I think the general tension is:

  • Users want bin/time rewind, to fully undo all consequences of mistakes, errors, or changes in decisions.
  • This is impossible to write in the general case, and unreasonably difficult and time consuming to write in the "best possible" case.
  • While looking for bin/time rewind, users find bin/remove destroy and run it, confirming the red skull and warnings but not really accepting them.
  • Users are disappointed that bin/remove destroy does not reach through time to undo a decision (like a repository configuration or user account) made a year ago.

That is, I think this is mostly a user expectation problem (users want an impossible behavior, and hope bin/remove destroy implements that impossible behavior, and never hit a hard wall in the workflow that forces them to reframe expectations), not a technical problem with the behavior bin/remove destroy actually implements or support tooling.

T8830 is a reasonable discussion of the expectation problem. The kinds of changes we might make to address it include:

  • Remove bin/remove destroy, to create a hard wall where you're forced to reframe your expectations (i.e., to invoke DestructionEngine you have to copy/paste some script off StackOverflow).
  • Implement bin/time rewind and document it as rewinding time. When run, it tells you the operation is impossible and suggests bin/remove destroy as a different operation with a very limited scope.