
User-initiated search reindex tasks can end up stuck behind import tasks in the daemon queue
Closed, Resolved · Public

Description

This is mostly for my benefit, but a Phacility instance managed to generate a repository with an incorrect (stale) short name index, so the indexed URI and the repository's actual URI differ.

Event Timeline

epriestley lowered the priority of this task from Normal to Wishlist. Mar 21 2017, 1:56 PM

The cause of this was mundane: the daemon queue had a large repository import in it, so the repository search update got stuck behind the import tasks. Manually forcing an index update with bin/search index Rx fixed the index.
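For reference, the manual fix looks something like this (Rx stands in for the repository's monogram; run bin/search help index for the exact flags available on your install):

    # Rebuild the index for one repository. Without --background, the
    # work happens inline in this process, sidestepping the congested
    # daemon queue.
    ./bin/search index Rx --force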

We should probably consider adjusting the priorities here. Index updates in response to real human edits could reasonably insert at PRIORITY_DEFAULT or PRIORITY_COMMIT (which could be renamed to something like PRIORITY_COMPLETE_SOMETHING_WE_STARTED, perhaps). I think the Editor reindex priority wasn't considered for adjustment after the changes in T11677.
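As a rough sketch of that adjustment (the 'documentPHID' data key is an assumption for illustration; scheduleTask() and the 'priority' option are existing PhabricatorWorker API), a human-initiated reindex might be queued like this:

    // Sketch only: queue a reindex caused by a real human edit at
    // PRIORITY_DEFAULT so it isn't stuck behind bulk import tasks.
    PhabricatorWorker::scheduleTask(
      'PhabricatorSearchWorker',
      array(
        'documentPHID' => $object->getPHID(),
      ),
      array(
        'priority' => PhabricatorWorker::PRIORITY_DEFAULT,
      ));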

epriestley renamed this task from An instance managed to generate an out-of-date URI index to User-initiated search reindex tasks can end up stuck behind import tasks in the daemon queue. Mar 21 2017, 3:28 PM
epriestley added projects: Search, Daemons, Phacility.

A related issue is exemplified by https://discourse.phabricator-community.org/t/importing-libphutil-repository-on-fresh-phabricator-triggers-an-error/2391/, which basically amounts to:

  • Search indexing tasks are low priority and may be queued up for a while.
  • By the time we run one, it may be indexing an object which no longer exists (perhaps a user ran bin/remove destroy on the object).
  • In basically every case, this causes no actual negative effects (the user already destroyed the object); it just generates an error in the log.

We generate the error since it could be helpful in debugging, but it rarely is, and it's effectively never actionable or a clue toward an actual fix.

Making this even more pointless: if a user reported "indexing isn't working", our first diagnostic step would be to run bin/search index ... anyway.

So I'm going to do this:

  • Failing to load an object for indexing on normal pathways is no longer an error: we assume the user already bin/remove destroy'd the object, or, if they didn't, that there's no real value in reporting a failed index rebuild unless they also happen to notice that the index itself isn't working when they expect it to.
  • bin/search index queues "Strict" tasks which do still report this error, so we can catch it when we're looking for it explicitly (a sketch of the split follows below).
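A minimal sketch of that split, assuming a strict flag is carried in the task data ($is_strict and the surrounding variable names are illustrative, not the shipped implementation):

    // Sketch only: the normal indexing pathway tolerates a missing
    // object, while tasks queued by `bin/search index` carry a strict
    // flag and keep reporting the failure.
    $object = id(new PhabricatorObjectQuery())
      ->setViewer(PhabricatorUser::getOmnipotentUser())
      ->withPHIDs(array($document_phid))
      ->executeOne();

    if (!$object) {
      if ($is_strict) {
        // Explicit rebuilds surface the failure so we can catch it
        // when we're looking for it.
        throw new PhabricatorWorkerPermanentFailureException(
          pht(
            'Unable to load object "%s" to rebuild search indexes.',
            $document_phid));
      }
      // Normal pathway: assume bin/remove destroy already removed the
      // object, and exit quietly.
      return;
    }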