Phabricator full text search not working with mysql/elasticsearch
Closed, Invalid | Public

Description

/maniphest/query

Cannot search for tasks containing Chinese words.

The following is some Chinese test text to search for:

中文测试,大河向东流,天上的星星不见了

Event Timeline

netroby raised the priority of this task to Needs Triage.
netroby updated the task description.
netroby added a subscriber: netroby.

A search for 中文 returns an empty result:

https://secure.phabricator.com/maniphest/query/lCRKNYZ3Fa6O/#R

We don't use ElasticSearch on this install, but you can follow the directions here if you need that capability: T5282#62676

Sorry, sir, but if we are not using the ElasticSearch engine, we should be using MySQL instead.

An SQL query should always work.

That isn't my understanding of fulltext search in CJK languages. Do you have information to the contrary?

A pure SQL search in MySQL always works:

select * from `table` where subject like '%中文%';

But I don't know why Phabricator search is not working.

Can you make it work?

We don't perform full table scans for information in search. There are likely millions of rows of information to sort through, which is why an index and fulltext search are used.

You can read about the limitations here:
http://dev.mysql.com/doc/refman/5.1/en/fulltext-restrictions.html
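
As a rough illustration of the difference described above, the following Python sketch contrasts a substring scan (what LIKE '%中文%' does) with a lookup in a word-based index. The tokenizer is a deliberate simplification and an assumption, not MySQL's actual fulltext parser, and the function names are hypothetical.

```python
import re

# Sample text taken from the task description.
DOC = "中文测试,大河向东流,天上的星星不见了"

def word_tokens(text):
    """Split on whitespace and common punctuation only (simplified word parser)."""
    return [t for t in re.split(r"[\s,.;:,。;:]+", text) if t]

# A word-based index stores each delimiter-separated run as a single token.
WORD_INDEX = set(word_tokens(DOC))

def like_scan(term):
    """Substring test over the raw text, analogous to LIKE '%term%'."""
    return term in DOC

def index_lookup(term):
    """Exact-token lookup, analogous to a word-based fulltext match."""
    return term in WORD_INDEX

print(like_scan("中文"))     # True: the substring scan finds it
print(index_lookup("中文"))  # False: the index only holds "中文测试" as one token
```

Because Chinese text has no spaces between words, a word-based index of this kind treats each run between punctuation marks as one token, so a shorter term inside the run is never found.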

The default search index adapter is MySQL, right?
I tried running ./bin/search index --all to build the index, but I still cannot search.

I also tried configuring ElasticSearch for Phabricator, but it still does not work as expected.
I searched for a few keywords, and the search returned all results instead of filtering them by keyword.

So I assume that Phabricator search is not working. Even your official phabricator.com cannot search.

We don't use ElasticSearch on this install.

I configured ElasticSearch correctly, and the index was created successfully.

[netroby@localhost ~]$ http GET http://10.0.18.87:9200/phabricator/_search?q=消息
HTTP/1.1 200 OK
Content-Length: 8247
Content-Type: application/json; charset=UTF-8

{
    "_shards": {
        "failed": 0, 
        "successful": 5, 
        "total": 5
    }, 
    "hits": {
        "hits": [
            {
                "_id": "PHID-CMIT-4i7aevgfonq65ifkhws5", 
                "_index": "phabricator", 
                "_score": 2.0082064, 
                "_source": {
                    "_timestamp": "1407719810", 
                    "dateCreated": "1407719810", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEVf3d1bfd489c55a687cf76f5635aa00d6cb9bf0a1 短消息查看过滤 系统短消息", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "短消息查看过滤 系统短消息\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-7ewxgzf6o4lm3t42qamr", 
                                "phidType": "USER", 
                                "when": "1407719810"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1407719810"
                            }
                        ]
                    }, 
                    "title": "rDYDEVf3d1bfd489c55a687cf76f5635aa00d6cb9bf0a1 短消息查看过滤 系统短消息", 
                    "url": "http://corebase.info/rDYDEVf3d1bfd489c55a687cf76f5635aa00d6cb9bf0a1"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-tvqrjjy6kxfpkniblrrb", 
                "_index": "phabricator", 
                "_score": 1.7807932, 
                "_source": {
                    "_timestamp": "1421216516", 
                    "dateCreated": "1421216516", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDEYIHOMEPHPSOURCEc5f60185c99efed76c6eda2534aef5c7ea2c8e46 完善群发消息", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "完善群发消息\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-ibyghww6rvsyrlud67vi", 
                                "phidType": "USER", 
                                "when": "1421216516"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-y4lsgogjsqhij2u5yiiq", 
                                "phidType": "REPO", 
                                "when": "1421216516"
                            }
                        ]
                    }, 
                    "title": "rDEYIHOMEPHPSOURCEc5f60185c99efed76c6eda2534aef5c7ea2c8e46 完善群发消息", 
                    "url": "http://corebase.info/rDEYIHOMEPHPSOURCEc5f60185c99efed76c6eda2534aef5c7ea2c8e46"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-wekbfil4iprmd3ijqjgd", 
                "_index": "phabricator", 
                "_score": 1.7521847, 
                "_source": {
                    "_timestamp": "1420451477", 
                    "dateCreated": "1420451477", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDEYIHOMEPHPSOURCEd35427f93f4b0f08717e189e2908122ec8a849a2 添加群消息发送", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "添加群消息发送\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-ibyghww6rvsyrlud67vi", 
                                "phidType": "USER", 
                                "when": "1420451477"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-y4lsgogjsqhij2u5yiiq", 
                                "phidType": "REPO", 
                                "when": "1420451477"
                            }
                        ]
                    }, 
                    "title": "rDEYIHOMEPHPSOURCEd35427f93f4b0f08717e189e2908122ec8a849a2 添加群消息发送", 
                    "url": "http://corebase.info/rDEYIHOMEPHPSOURCEd35427f93f4b0f08717e189e2908122ec8a849a2"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-nhfmgm3f2qukqib4liue", 
                "_index": "phabricator", 
                "_score": 1.7521847, 
                "_source": {
                    "_timestamp": "1407143275", 
                    "dateCreated": "1407143275", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEV438e3389318f30fb62a87871a14cef3da10f7225 短消息 数量纠正", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "短消息 数量纠正\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-7ewxgzf6o4lm3t42qamr", 
                                "phidType": "USER", 
                                "when": "1407143275"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1407143275"
                            }
                        ]
                    }, 
                    "title": "rDYDEV438e3389318f30fb62a87871a14cef3da10f7225 短消息 数量纠正", 
                    "url": "http://corebase.info/rDYDEV438e3389318f30fb62a87871a14cef3da10f7225"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-tmukqxslynmbstqmzlyh", 
                "_index": "phabricator", 
                "_score": 1.7521847, 
                "_source": {
                    "_timestamp": "1406276995", 
                    "dateCreated": "1406276995", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEV2111b27610950236a3500e610292fbd73746e014 回复短消息", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "回复短消息\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-7ewxgzf6o4lm3t42qamr", 
                                "phidType": "USER", 
                                "when": "1406276995"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1406276995"
                            }
                        ]
                    }, 
                    "title": "rDYDEV2111b27610950236a3500e610292fbd73746e014 回复短消息", 
                    "url": "http://corebase.info/rDYDEV2111b27610950236a3500e610292fbd73746e014"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-famd7pcfvpkulvyyk4va", 
                "_index": "phabricator", 
                "_score": 1.7309508, 
                "_source": {
                    "_timestamp": "1409105894", 
                    "dateCreated": "1409105894", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEV9fed535e6ef5f54f788ad0866a9df4d12ecd5f2d 屏蔽 短消息", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "屏蔽 短消息\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-7ewxgzf6o4lm3t42qamr", 
                                "phidType": "USER", 
                                "when": "1409105894"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1409105894"
                            }
                        ]
                    }, 
                    "title": "rDYDEV9fed535e6ef5f54f788ad0866a9df4d12ecd5f2d 屏蔽 短消息", 
                    "url": "http://corebase.info/rDYDEV9fed535e6ef5f54f788ad0866a9df4d12ecd5f2d"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-llundjq2s6rxl6c5sxoi", 
                "_index": "phabricator", 
                "_score": 1.7135541, 
                "_source": {
                    "_timestamp": "1341906692", 
                    "dateCreated": "1341906692", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEVeeda878707d7db8dbe2be5660075faee32dbb6ee 企业中心:消息中心模版修改,消息工具条位置移动", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "企业中心:消息中心模版修改,消息工具条位置移动\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-h3thoo2hlpbsbd5e7ce4", 
                                "phidType": "USER", 
                                "when": "1341906692"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1341906692"
                            }
                        ]
                    }, 
                    "title": "rDYDEVeeda878707d7db8dbe2be5660075faee32dbb6ee 企业中心:消息中心模版修改,消息工具条位置移动", 
                    "url": "http://corebase.info/rDYDEVeeda878707d7db8dbe2be5660075faee32dbb6ee"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-pveyhqhhcbpaossnlm7s", 
                "_index": "phabricator", 
                "_score": 1.4247687, 
                "_source": {
                    "_timestamp": "1417657224", 
                    "dateCreated": "1417657224", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDEYIHOMEPHPSOURCEb88a07b0e435541c0c7e8699bea3d0260a3c37ee 数据表添加:短消息记录", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "数据表添加:短消息记录\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "repo": [
                            {
                                "phid": "PHID-REPO-y4lsgogjsqhij2u5yiiq", 
                                "phidType": "REPO", 
                                "when": "1417657224"
                            }
                        ]
                    }, 
                    "title": "rDEYIHOMEPHPSOURCEb88a07b0e435541c0c7e8699bea3d0260a3c37ee 数据表添加:短消息记录", 
                    "url": "http://corebase.info/rDEYIHOMEPHPSOURCEb88a07b0e435541c0c7e8699bea3d0260a3c37ee"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-ftc76kyx6dwsl6pay2oc", 
                "_index": "phabricator", 
                "_score": 1.4247687, 
                "_source": {
                    "_timestamp": "1325671278", 
                    "dateCreated": "1325671278", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEVa5c928598d8d485e32f195fbeae27971d3a1d77e 新增模版,消息页面", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "新增模版,消息页面\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-h3thoo2hlpbsbd5e7ce4", 
                                "phidType": "USER", 
                                "when": "1325671278"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1325671278"
                            }
                        ]
                    }, 
                    "title": "rDYDEVa5c928598d8d485e32f195fbeae27971d3a1d77e 新增模版,消息页面", 
                    "url": "http://corebase.info/rDYDEVa5c928598d8d485e32f195fbeae27971d3a1d77e"
                }, 
                "_type": "CMIT"
            }, 
            {
                "_id": "PHID-CMIT-2mg2l27w4h6wkgxgfgts", 
                "_index": "phabricator", 
                "_score": 1.4247687, 
                "_source": {
                    "_timestamp": "1405058434", 
                    "dateCreated": "1405058434", 
                    "field": [
                        {
                            "aux": null, 
                            "corpus": "rDYDEV0db2041526e7dda16ae04cd02bd353b56bfafcdd wap  个人中心  短消息与提醒", 
                            "type": "titl"
                        }, 
                        {
                            "aux": null, 
                            "corpus": "wap  个人中心  短消息与提醒\n\n\n", 
                            "type": "body"
                        }
                    ], 
                    "relationship": {
                        "auth": [
                            {
                                "phid": "PHID-USER-7ewxgzf6o4lm3t42qamr", 
                                "phidType": "USER", 
                                "when": "1405058434"
                            }
                        ], 
                        "repo": [
                            {
                                "phid": "PHID-REPO-ppkc6mscbilkz6er7frl", 
                                "phidType": "REPO", 
                                "when": "1405058434"
                            }
                        ]
                    }, 
                    "title": "rDYDEV0db2041526e7dda16ae04cd02bd353b56bfafcdd wap  个人中心  短消息与提醒", 
                    "url": "http://corebase.info/rDYDEV0db2041526e7dda16ae04cd02bd353b56bfafcdd"
                }, 
                "_type": "CMIT"
            }
        ], 
        "max_score": 2.0082064, 
        "total": 356
    }, 
    "timed_out": false, 
    "took": 8
}

But Phabricator does not return the right results.
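
To compare what the backend returns with what Phabricator displays, a response like the one above can be summarized with a short script. This is only a sketch: the embedded JSON is a trimmed copy of the first hit from the output above, and the summarize helper is a hypothetical name, not part of Phabricator or ElasticSearch.

```python
import json

# Trimmed copy of the response above: one hit kept, fields reduced.
RESPONSE = """
{
  "hits": {
    "hits": [
      {
        "_id": "PHID-CMIT-4i7aevgfonq65ifkhws5",
        "_score": 2.0082064,
        "_source": {
          "title": "rDYDEVf3d1bfd489c55a687cf76f5635aa00d6cb9bf0a1 短消息查看过滤 系统短消息"
        },
        "_type": "CMIT"
      }
    ],
    "max_score": 2.0082064,
    "total": 356
  },
  "timed_out": false
}
"""

def summarize(raw, term):
    """Report the total hit count and whether every returned title contains the term."""
    data = json.loads(raw)
    hits = data["hits"]["hits"]
    return {
        "total": data["hits"]["total"],
        "returned": len(hits),
        "all_contain_term": all(term in h["_source"]["title"] for h in hits),
    }

print(summarize(RESPONSE, "消息"))
```

If the backend summary looks correct while the Phabricator search page does not, that narrows the problem to how Phabricator builds its query rather than to the index itself.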

netroby renamed this task from "Can not search task with chinese words" to "Phabricator full text search not working with mysql/elasticsearch". Apr 14 2015, 3:47 AM
netroby reopened this task as Open.

This is not a duplicate issue. Search works with neither MySQL nor ElasticSearch.

It is not about MySQL or ElasticSearch themselves. Phabricator cannot search through either the MySQL or the ElasticSearch backend.

chad claimed this task.

If you've found a new issue, we'd prefer that you file a new task. Please don't rename and reopen the original ticket; it makes organizing and finding conversations difficult.

If you are having issues with Elasticsearch, please check out the other known issues first. Maybe T6892 is related? If your issue isn't listed, please provide full steps to reproduce the issue (https://secure.phabricator.com/book/phabcontrib/article/bug_reports/).

The problem is not about ElasticSearch or MySQL. It's about Phabricator search (the backend is OK and returns the right data if you query it directly).

I can reproduce it on your official Phabricator install.
From this issue's body:

/maniphest/query

Cannot search for tasks containing Chinese words.

The following is some Chinese test text to search for:

中文测试,大河向东流,天上的星星不见了

A search for 中文 returns an empty result:

https://secure.phabricator.com/maniphest/query/lCRKNYZ3Fa6O/#R

secure.phabricator.com is not expected to return results for CJK languages. It is not configured to.

I feel like I'm not understanding your question, since I feel I keep answering it, but you just ask it again. The bug you filed was for this installation of Phabricator not returning expected results in Chinese. That is because this installation of Phabricator which is secure.phabricator.com uses a fulltext search (myisam, mysql) and those languages are not supported. We don't personally have need to support those searches on this installation of Phabricator.

If your question is about searching Chinese on your local installation then you can either configure ElasticSearch, or there are various workarounds with MySQL if you take the time to research and Google them.

If your question is about searching Chinese with ElasticSearch on your local installation and it's returning poor results, that may be an issue with ElasticSearch, the analyzer they use, how you've configured it, or maybe it is a Phabricator issue, but we'd need full steps to reproduce if that's the bug you're seeing.

@chad
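I configured my Phabricator instance's MySQL with the ngram fulltext parser, and I can now search CJK text with the MySQL fulltext engine.

I don't use ElasticSearch; I use MySQL's built-in ngram parser.

First, set the MySQL variable ngram_token_size=2. It is a read-only startup option, so set it in the server configuration file or on the mysqld command line and restart the server.

Then alter the Phabricator MySQL table as follows: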

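-- Assumes the default storage namespace, so the search database is phabricator_search.
use phabricator_search;
-- Drop the existing fulltext index, which was built with the default parser.
drop index corpus on search_documentfield;
-- Rebuild it with MySQL's built-in ngram parser (available since MySQL 5.7.6).
create fulltext index `corpus` on search_documentfield (corpus) with parser ngram;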
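
To show why a token size of 2 makes short Chinese terms searchable, here is a minimal Python sketch of bigram tokenization. It is a simplified model and an assumption, not MySQL's actual ngram parser: it splits on whitespace and a few punctuation marks and then emits every overlapping two-character token. The function names are hypothetical.

```python
import re

def ngram_tokens(text, n=2):
    """Emit overlapping n-character tokens within each punctuation-free run."""
    tokens = []
    for run in re.split(r"[\s,.;:,。;:]+", text):
        tokens.extend(run[i:i + n] for i in range(len(run) - n + 1))
    return tokens

# Sample text taken from the task description.
DOC = "中文测试,大河向东流"
NGRAM_INDEX = set(ngram_tokens(DOC))

def matches(term, n=2):
    """A term matches when all of its own n-grams are present in the index."""
    grams = ngram_tokens(term, n)
    return bool(grams) and all(g in NGRAM_INDEX for g in grams)

print(ngram_tokens("中文测试"))  # ['中文', '文测', '测试']
print(matches("中文"))           # True
print(matches("星星"))           # False: not in the indexed text
```

This sketch does not check that the n-grams of a longer term are adjacent in the text, so it can report matches that a real phrase search would reject.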

Hopefully Phabricator can give us an option to choose the fulltext parser during installation.