
Fulltext indexing produces invalid JSON documents in Elasticsearch
Closed, Resolved (Public)


Indexing Phabricator objects with "./bin/search index" creates JSON documents in Elasticsearch that have trash appended to them.


./bin/search index --type PhabricatorUser

GET http://xxx:9200/phabricator/USER/PHID-USER-ciil6p5ve27rvlck2qsj


{"_index":"phabricator","_type":"USER","_id":"PHID-USER-ciil6p5ve27rvlck2qsj","_version":1,"found":true,"_source":{"title":"smith (Sam Smith)","url":"http:\/\/xxx:8066\/p\/smith\/","dateCreated":"1390566742","_timestamp":"1437979845","field":[{"type":"titl","corpus":"smith (Sam Smith)","aux":null}],"relationship":{"open":[{"phid":"PHID-USER-ciil6p5ve27rvlck2qsj","phidType":"USER","when":1452764234}]}}local:0}

Note the "local:0" at the end of the document. The trash is random: sometimes single letters, sometimes special characters. Almost every JSON document is corrupted this way, which breaks search via ElasticSearch.
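
One way to confirm this kind of corruption is to parse only the first JSON value in a document body and look at whatever is left over. The following is a minimal sketch in Python, not part of Phabricator. The helper name split_trailing_garbage and the shortened sample body are hypothetical, and it assumes you have captured the raw document body (the "_source" text) as a string.

```python
import json


def split_trailing_garbage(body: str):
    """Parse the first JSON value in body; return (document, leftover_text).

    Hypothetical helper for illustration only.
    """
    decoder = json.JSONDecoder()
    # raw_decode does not skip leading whitespace, so strip it first.
    stripped = body.lstrip()
    document, end = decoder.raw_decode(stripped)
    # Anything after the first complete JSON value is unexpected.
    return document, stripped[end:].strip()


# Shortened stand-in for the corrupted document body from the report.
sample = '{"title":"smith (Sam Smith)","dateCreated":"1390566742"}local:0'

doc, trailing = split_trailing_garbage(sample)
if trailing:
    print("trailing garbage:", repr(trailing))
else:
    print("document is clean")
```

In the full GET response shown above, the trash sits inside the outer object (after "_source" and before the final closing brace), so passing the whole response to json.loads would be expected to raise json.JSONDecodeError rather than return a document.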

This error seems to be quite new.


:~/phabricator$ uname -a
Linux xxx 4.2.0-23-generic #28-Ubuntu SMP Sun Dec 27 17:47:31 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux


2e7f2b735702f84cdc9a7fb2167dda40dc47390c (Sat, Jan 9)


6833ae5bd33e86b5dbc8ee75221f778fc458b89c (Sat, Jan 9)


f5120574826088cba45c5ed4c2c05be4cbacbc86 (Sat, Jan 9)

Thanks for your help.

Event Timeline

What version of ElasticSearch are you using?

(Also: why are you using ElasticSearch instead of the default search?)

I am using the elasticsearch:1.7.4 Docker image. We thought using ElasticSearch had advantages over MySQL fulltext. Is that wrong?

See some general discussion in T9893, although this doesn't seem to be an ElasticSearch 2.0 issue.

epriestley claimed this task.

Presuming this is either resolved by T9893/D17384 or no longer relevant. Follow up on T12450 or file a new task if you're still seeing issues.