When Phabricator imports a repository for the first time (and in some other cases), the intended behavior is that import tasks are queued at a very low priority ("PRIORITY_IMPORT", currently the lowest priority available).
In some cases, import tasks are being queued at "PRIORITY_COMMIT" instead. This can interfere with the import of other active repositories, since the misqueued tasks then compete with ordinary commit tasks at the same priority.
- See PHI1874. An install encountered import queue delays, which were traced to tasks in the queue at the wrong priority.
- See PHI1953. An install encountered this issue while importing the Linux repository.
- See PHI1979. An install anticipates importing a large repository soon, and things would likely go more smoothly with this bug fixed.
- See also PHI1935, a tangential request for "pause a repository import".
---
These tasks are almost certainly being queued by the `setCloseFlagOnCommits(...)` pathway in `PhabricatorRepositoryRefEngine`. It's not entirely clear how this is being reached.
A natural mechanism would be to begin repository import with some refs marked as non-permanent, then later mark them as permanent before import completes. Commits reachable from those refs would be re-queued by `setCloseFlagOnCommits(...)`. However, there's no real reason to believe this occurred in PHI1874, PHI1953, or elsewhere.
Since a natural pathway exists anyway, a narrow fix to `setCloseFlagOnCommits(...)` is reasonable, but this may not be a complete fix. In particular, if a separate bug is causing publishable commits to be initially queued as unpublishable, that could result in repository imports which take ~2X longer to complete with no strong indicators that such a bug exists.
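The suspected mechanism and the narrow fix can be sketched as a toy model. This is a simplified Python illustration, not the actual PHP code: the `Repository` class, `import_task_priority()`, and `requeue_on_permanent()` are hypothetical names, and the numeric priority values are illustrative, with the assumption that a lower number is dequeued sooner.

```python
from dataclasses import dataclass, field

# Illustrative values only (assumption): lower number = dequeued sooner.
PRIORITY_COMMIT = 2500
PRIORITY_IMPORT = 4000


@dataclass
class Repository:
    """Hypothetical stand-in for a repository and its task queue."""
    is_importing: bool
    queue: list = field(default_factory=list)  # (priority, commit) pairs


def import_task_priority(repo):
    """Shared decision: importing repositories queue at the lowest priority."""
    return PRIORITY_IMPORT if repo.is_importing else PRIORITY_COMMIT


def requeue_on_permanent(repo, commits, use_shared_priority):
    """Re-queue commits that became reachable from a newly permanent ref.

    With use_shared_priority=False this models the suspected bug: the
    priority is hard-coded and ignores whether the repository is importing.
    """
    if use_shared_priority:
        priority = import_task_priority(repo)
    else:
        priority = PRIORITY_COMMIT
    for commit in commits:
        repo.queue.append((priority, commit))
    return priority


repo = Repository(is_importing=True)
buggy = requeue_on_permanent(repo, ["c1", "c2"], use_shared_priority=False)
fixed = requeue_on_permanent(repo, ["c1", "c2"], use_shared_priority=True)
```

In this model, the only change in the fixed path is that the re-queue step asks the same helper that initial discovery would use, so a repository that is still importing never has tasks promoted to commit priority.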
---
Possible actions:
- The code pathway in `setCloseFlagOnCommits(...)` should be unified with the pathway in `DiscoveryEngine->getImportTaskPriority()`.
- Is a bug where commits double-import reproducible?
- Does the `Change` step still need to occur for Git repositories?
- Make sure all commit tasks are inserted with `objectPHID` populated.
- Consider adding more source/context data to the message tasks (and possibly propagating it down the chain), since they now have multiple reachability pathways.
- Add a `containerPHID` to tasks and populate it with the repository PHID.
- Support bulk priority actions against tasks with particular container PHIDs.
- Update the daemon UI to break tasks down by object/container.
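As a rough illustration of the last three items, the following Python sketch models tasks that carry a container PHID, a bulk priority change scoped to one container, and a per-container breakdown of the queue. Everything here is hypothetical: the `Task` fields, the function names, and the PHID strings are placeholders rather than the real schema or API, and the priority values are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Hypothetical queued task; field names are assumptions."""
    task_id: int
    priority: int
    object_phid: Optional[str] = None     # e.g. the commit PHID
    container_phid: Optional[str] = None  # e.g. the repository PHID


def reprioritize_by_container(tasks, container_phid, new_priority):
    """Set the priority of every task in one container; return how many changed."""
    changed = 0
    for task in tasks:
        if task.container_phid == container_phid and task.priority != new_priority:
            task.priority = new_priority
            changed += 1
    return changed


def count_by_container(tasks):
    """Group queued tasks by container, as a daemon UI breakdown might."""
    counts = {}
    for task in tasks:
        key = task.container_phid or "(none)"
        counts[key] = counts.get(key, 0) + 1
    return counts


# Placeholder PHIDs and illustrative priorities.
tasks = [
    Task(1, 2500, "PHID-CMIT-a", "PHID-REPO-linux"),
    Task(2, 2500, "PHID-CMIT-b", "PHID-REPO-linux"),
    Task(3, 2500, "PHID-CMIT-c", "PHID-REPO-other"),
    Task(4, 2500),
]
changed = reprioritize_by_container(tasks, "PHID-REPO-linux", 4000)
breakdown = count_by_container(tasks)
```

With a container PHID on each task, an operator could demote all tasks for a single large import without touching other repositories, which is why populating that field is listed before the bulk action and the UI change.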