
Manual search reindex fails on "unable to load object by phid"
Closed, Wontfix (Public)

Description

I was manually reindexing the search index using bin/search index --all, which resulted in the following error at 99.7% progress:

[2014-10-28 09:45:29] PHLOG: 'Unable to build document PHID-WIKI-amenci5bzyxy5k4ndwwu with indexer PhrictionSearchIndexer.' at [/home/tools/phabricator/phabricator/src/applications/search/index/PhabricatorSearchDocumentIndexer.php:64]
[2014-10-28 09:45:29] EXCEPTION: (Exception) Unable to load object by phid 'PHID-WIKI-amenci5bzyxy5k4ndwwu'! at [<phabricator>/src/applications/search/index/PhabricatorSearchDocumentIndexer.php:28]
#0 PhabricatorSearchDocumentIndexer::loadDocumentByPHID(string) called at [<phabricator>/src/applications/phriction/search/PhrictionSearchIndexer.php:11]
#1 PhrictionSearchIndexer::buildAbstractDocumentByPHID(string) called at [<phabricator>/src/applications/search/index/PhabricatorSearchDocumentIndexer.php:35]
#2 PhabricatorSearchDocumentIndexer::indexDocumentByPHID(string) called at [<phabricator>/src/applications/search/index/PhabricatorSearchIndexer.php:21]
#3 PhabricatorSearchIndexer::indexDocumentByPHID(string) called at [<phabricator>/src/applications/search/worker/PhabricatorSearchWorker.php:10]
#4 PhabricatorSearchWorker::doWork() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:110]
#5 PhabricatorWorker::scheduleTask(string, array, integer) called at [<phabricator>/src/applications/search/index/PhabricatorSearchIndexer.php:11]
#6 PhabricatorSearchIndexer::queueDocumentForIndexing(string) called at [<phabricator>/src/applications/search/management/PhabricatorSearchManagementIndexWorkflow.php:92]
#7 PhabricatorSearchManagementIndexWorkflow::execute(PhutilArgumentParser) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:394]
#8 PhutilArgumentParser::parseWorkflowsFull(array) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:290]
#9 PhutilArgumentParser::parseWorkflows(array) called at [<phabricator>/scripts/search/manage_search.php:21]

We've been running this Phabricator install for about 6 months now and have never changed anything by hand, e.g. in the database. The problem persists when running the search indexer a second time.

Event Timeline

GMTA raised the priority of this task to Needs Triage.
GMTA updated the task description.
GMTA added a subscriber: GMTA.

What makes you think that phid is valid?

Pertinent code snippet is here -- https://secure.phabricator.com/diffusion/P/browse/master/src/applications/search/index/PhabricatorSearchDocumentIndexer.php;d5b70e2c1cabed93d3c2c2d1de825adfee0c8106$22-31 -- this is a fairly vanilla "this phid doesn't exist" issue as far as I can tell...

I never said that that PHID is valid... Obviously it cannot be found. Somewhere, somehow, a stale reference seems to be left behind. I cannot find any documentation about how to fix an issue like this so that's why I think there's a bug somewhere in the wiki application?

Ah sorry, misread the report.

My guess is that the pertinent wiki document was deleted via ./bin/remove. In any case, it shouldn't be an error as you're seeing.

I am not sure whether this means there is a bug in the LiskIterator stuff... I'll have to defer to @epriestley on that; it's tricky code to me, sitting on top of the automagical Lisk, and after 15 minutes of looking just now I am no closer.

Thanks. I've just checked our shell history; no bin/remove was executed.

Huh, I didn't know that was possible. How does one check the definitive shell history of all users on a given host?

epriestley triaged this task as Wishlist priority. (Oct 29 2014, 4:39 PM)
  • This warning seems scary, but is super minor. We should probably tailor the output to make this more clear, or silence the warning entirely when running in batch (--all or --type) modes (showing it when indexing a single object probably still makes sense).
  • Usually, this is because some sub-object doesn't exist. In this case, when we load a wiki page we also load the content. Probably the content has gone missing somehow. We decline to load partial objects and fail with this error. Other cases are, e.g., a commit with a missing repository.
  • Running bin/remove destroy <phid> is usually a reasonable way to resolve the error. Ignoring it is also reasonable.
  • The root cause is probably an actual bug, but we don't have nearly enough information to figure out what happened by just seeing that something went missing (there's no way to reproduce the issue, and in most or all cases examining the database doesn't give us any more information). There are a handful of outstanding issues with Phriction, particularly around moves/renames and project integration, so whatever the root cause was will probably be fixed when we clean that stuff up.
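
The "decline to load partial objects" behavior described in the points above can be modeled roughly as follows. This is a minimal Python sketch, not Phabricator's actual PHP code: the two dictionaries, `load_document`, `build_search_document`, and all PHIDs in it are hypothetical names invented for illustration.

```python
# Hypothetical in-memory stand-ins for the wiki document and content storage.
DOCUMENTS = {
    "PHID-WIKI-aaaa": {"title": "Intact page", "content_phid": "PHID-PCTN-1111"},
    "PHID-WIKI-bbbb": {"title": "Broken page", "content_phid": "PHID-PCTN-2222"},
}
CONTENTS = {
    "PHID-PCTN-1111": "Hello, wiki.",
    # "PHID-PCTN-2222" is deliberately absent to simulate content that went missing.
}


def load_document(phid):
    """Return the full document, or None if any required part is missing."""
    row = DOCUMENTS.get(phid)
    if row is None:
        return None
    body = CONTENTS.get(row["content_phid"])
    if body is None:
        # Decline to return a partial object: the row exists, its content does not.
        return None
    return {"phid": phid, "title": row["title"], "body": body}


def build_search_document(phid):
    """Build an indexable record, failing loudly when the object cannot load."""
    document = load_document(phid)
    if document is None:
        raise LookupError(f"Unable to load object by phid '{phid}'!")
    return {"phid": phid, "text": f"{document['title']}\n{document['body']}"}
```

Under these assumptions, a document whose row still exists but whose content is gone fails in exactly the same way as a PHID that never existed, which would be consistent with an error that says only that the object could not be loaded.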

I think the actions here are probably:

  • Improve or remove this error message (less scary / more helpful / maybe just silence it since it's not very informative or useful).
    • Maybe remove from batch modes.
    • Maybe add context in single-object mode?
  • Keep an eye out when we next touch Phriction for some bug which could leave documents without content.
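
One possible shape for the "quiet in batch modes, contextual in single-object mode" idea from the list above is sketched below. It is a hedged Python illustration only: `index_documents`, `summarize`, and the `batch` flag are hypothetical and do not correspond to any existing Phabricator function or option.

```python
def index_documents(phids, index_one, batch=True):
    """Index each PHID; in batch mode, collect failures instead of aborting."""
    failed = []
    for phid in phids:
        try:
            index_one(phid)
        except Exception as error:
            if not batch:
                # Single-object mode: surface the error with added context.
                raise RuntimeError(f"Could not index {phid}: {error}") from error
            # Batch mode: remember the failure and keep going.
            failed.append(phid)
    return failed


def summarize(failed, total):
    """Produce one calm summary line rather than a stack trace per failure."""
    if not failed:
        return f"Indexed {total} objects."
    return (
        f"Indexed {total - len(failed)} of {total} objects; "
        f"skipped {len(failed)} that could not be loaded."
    )
```

With this arrangement a full reindex would end with a single summary line, while indexing one object by hand would still raise an error that names the PHID involved.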

In particular, giving a more detailed answer about why an object can't load seems complex and not useful. If this error said "The document is a PhrictionDocument, and its PhrictionContent object (PHID-PCTN-sdbasbda) does not exist.", it wouldn't prompt anyone to act any differently.

I can rewrite the error copy on this at some point.

Huh, I didn't know that was possible. How does one check the definitive shell history of all users on a given host?

We have single user access for our central phabricator install, whose sessions are explicitly logged in case of stuff like this :)

  • Running bin/remove destroy <phid> is usually a reasonable way to resolve the error. Ignoring it is also reasonable.

I've tried that just now:

tools@dev1:~/phabricator/phabricator/bin$ ./remove destroy PHID-WIKI-amenci5bzyxy5k4ndwwu
Usage Exception: No such object "PHID-WIKI-amenci5bzyxy5k4ndwwu" exists!

So I did a text search through all databases and found relations to this PHID in the following tables:

  • feed_storydata
  • feed_storyreference
  • phriction_document
  • search_document
  • search_documentfield
  • search_documentrelationship
  • worker_taskdata

Maybe this could help in finding the source of the bug. Thanks for the explanation, I'll just ignore it for now!
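
For anyone wanting to repeat the text search described above, the general technique is to enumerate every table and check each column for the PHID string. The sketch below shows it in Python against an in-memory SQLite database as a stand-in; a real Phabricator install stores its data in MySQL across several databases, so the catalog queries would differ. The table names are taken from the list above, but the column layouts, the sample title, and `find_references` itself are assumptions made up for the demonstration.

```python
import sqlite3


def find_references(connection, needle):
    """Return sorted names of tables in which any column contains `needle`."""
    cursor = connection.cursor()
    tables = [
        row[0]
        for row in cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    hits = []
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in cursor.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            query = (
                f'SELECT 1 FROM "{table}" '
                f'WHERE instr(CAST("{column}" AS TEXT), ?) > 0 LIMIT 1'
            )
            if cursor.execute(query, (needle,)).fetchone():
                hits.append(table)
                break  # One matching column is enough to report the table.
    return sorted(hits)


# Demonstration data with hypothetical column layouts.
connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE phriction_document (id INTEGER, phid TEXT);
    CREATE TABLE search_document (phid TEXT, title TEXT);
    CREATE TABLE unrelated (note TEXT);
    INSERT INTO phriction_document VALUES (1, 'PHID-WIKI-amenci5bzyxy5k4ndwwu');
    INSERT INTO search_document VALUES ('PHID-WIKI-amenci5bzyxy5k4ndwwu', 'Some page');
    INSERT INTO unrelated VALUES ('nothing to see');
    """
)
print(find_references(connection, "PHID-WIKI-amenci5bzyxy5k4ndwwu"))
```

A scan like this only shows where the string appears; it does not say which of those rows is the stale one.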

Maybe this pattern could help: this particular wiki document was moved ("This document was moved from elsewhere"). The second error I get on reindexing also refers to a wiki document that was moved.

epriestley claimed this task.

This hasn't seen action in two years and a lot of the surrounding code has changed since then. If it's still an issue, please file a new bug report with reproduction steps demonstrating how to break an object so it doesn't index properly.