Follow-up SHOW INDEXES output from the case in PHI47 showed no evidence of key cardinality issues, so a cardinality problem seems less likely.
Sep 21 2017
Sep 20 2017
Sep 15 2017
Sep 13 2017
In T12974#231875, @saggid wrote: Very interesting update. Now we can search in projects code, as I see? Thanks for this supir dupir project guys)
Sep 12 2017
This is resolved by the Ferret engine, which can execute all parts of the query logic in MySQL.
This is resolved by the Ferret engine, using title:...:
This is resolved by the Ferret engine:
Sep 11 2017
kind of sad this isn't called pherret
The actual indexes are fairly large (3GB for ~500MB of task data, although lipsum tasks have an exceptionally large amount of text).
Sep 8 2017
All the search which was previously driven by InnoDB FULLTEXT is now driven by the Ferret engine on this install.
Sep 7 2017
Here's a closer look at what's probably happening:
I briefly hit a bizarre case where a Ferret engine query took 10 seconds to find a document in 163 projects. However, running ANALYZE TABLE on the ngrams table resolved this completely. I suspect the ngrams join may require some tweaking (and maybe a bin/storage analyze). Analyzing the Maniphest table actually seems to improve performance by ~50% too (???) although that's not a very scientific measurement.
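To make the "ngrams join" concrete, here is a minimal sketch of searching by joining against an ngram table. It is only an illustration under loud assumptions: it uses trigrams, SQLite in place of MySQL, and hypothetical table and function names (`doc`, `ngram`, `build_index`, `search`); none of this is Phabricator's actual schema or code. SQLite's `ANALYZE` stands in as a rough analogue of MySQL's ANALYZE TABLE, which refreshes the statistics the planner uses to order the joins.

```python
import sqlite3


def trigrams(text):
    """Return the set of lowercase 3-character substrings of text."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


def build_index(conn, docs):
    """Create hypothetical doc/ngram tables and index every document."""
    conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE ngram (gram TEXT, doc_id INTEGER)")
    conn.execute("CREATE INDEX ngram_gram ON ngram (gram, doc_id)")
    for doc_id, body in docs.items():
        conn.execute("INSERT INTO doc VALUES (?, ?)", (doc_id, body))
        conn.executemany(
            "INSERT INTO ngram VALUES (?, ?)",
            [(gram, doc_id) for gram in sorted(trigrams(body))],
        )
    # Refresh planner statistics (rough analogue of MySQL's ANALYZE TABLE).
    conn.execute("ANALYZE")


def search(conn, term, max_grams=8):
    """Find ids of documents containing term as a substring.

    One JOIN per trigram narrows the candidates; the final instr() check
    confirms the full substring. Terms shorter than 3 characters have no
    trigrams and return no results in this sketch.
    """
    # Cap the number of joins; the instr() check keeps results correct.
    grams = sorted(trigrams(term))[:max_grams]
    if not grams:
        return []
    joins = []
    params = []
    for i, gram in enumerate(grams):
        joins.append(
            f"JOIN ngram n{i} ON n{i}.doc_id = d.id AND n{i}.gram = ?"
        )
        params.append(gram)
    sql = (
        "SELECT d.id FROM doc d "
        + " ".join(joins)
        + " WHERE instr(lower(d.body), ?) > 0 ORDER BY d.id"
    )
    params.append(term.lower())
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, {
        1: "Ferret engine search",
        2: "InnoDB FULLTEXT",
        3: "ferret is a small animal",
    })
    print(search(conn, "ferret"))
```

Each additional trigram adds another join against the same table, which is why stale statistics on the ngram table can plausibly lead the planner to a poor join order.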
Sep 6 2017
Sep 5 2017
Unclear if building a search engine by just doing a lot of JOINs actually scales or not. Seems OK here.
Sep 1 2017
Aug 30 2017
A prototype of this is now available on this server (secure.phabricator.com). You can access it in Maniphest by using the Query (Prototype) field instead of the Contains Words field.
Whoops this is probably my bad :-/
Aug 29 2017
Aug 28 2017
Aug 17 2017
Aug 12 2017
Aug 4 2017
PHI27 is likely adjacent here.
Aug 2 2017
Jul 20 2017
Jul 19 2017
Jul 18 2017
For phabricator.wikimedia.org, we decided to go with Elasticsearch, partly because we already have a massive Elasticsearch cluster and a lot of institutional Elasticsearch knowledge and experience. My opinion is that we made the right choice, and I believe most of the folks who use our Phabricator on a daily basis share it. I've seen zero complaints about search since we made the switch, which is a huge improvement over what I saw with MySQL FTS. Conclusion: Elasticsearch seems to perform well and the results are generally better (obviously this is subjective, but like I said, no complaints from users).
Jun 28 2017
#dropbox has also encountered this since upgrading from MySQL 5.5 to MySQL 5.7, and thus moving to an InnoDB FTS backend.
Jun 12 2017
Some other general thoughts:
Jun 10 2017
I suspect some of those features have no actual use case outside very marginal nonsense like T12799, but they're "things computers should be able to do".
I want to get clear of T12798 first, but there are some other features I'd like to add to our search anyway, including:
@20after4's ears were burning
:(
Jun 3 2017
Slightest of issues with the repository indexing mentioned above:
luca:~/phabricator$ ./bin/search index --type repository --force
Usage Exception: Type "repository" matches multiple indexable objects. Use a more specific string. Matching object types are: PhabricatorRepository, PhabricatorRepositoryCommit.
luca:~/phabricator$
May 22 2017
FWIW I partially implemented this in Wikimedia's fork, and I did so in a reusable way. I'd like to eventually upstream it but I'm not sure that my approach is desired upstream. I'll give it another shot though.