Placeholder for now, but we've seen two of these so this is likely a real thing.
Description
Revisions and Commits
Status | Assigned | Task
---|---|---
Resolved | epriestley | T12974 Upgrading: "Ferret" Fulltext Engine
Resolved | epriestley | T12819 InnoDB FULLTEXT appears to fail catastrophically once it reaches a moderate size
Resolved | epriestley | T12928 Tokens in the form "v0.1" are not handled well by the MySQL FULLTEXT index
Event Timeline
Unclear if building a search engine by just doing a lot of JOINs actually scales or not. Seems OK here.
I generated and indexed 65,000 tasks locally with bin/lipsum. The lipsum corpus is very small (approximately 60 "words"), so this "should" be a more stressful test of the index than real data: almost every task has the same words and the same ngrams, which means the ngram index isn't very useful for distinguishing between tasks. The query "lorem" matches 28,000 tasks and resolves in ~5-10s for me locally. This drops to ~0.01s if I remove the ORDER BY constraints and just let MySQL return whatever part of the result set it wants to. This does a pile of I/O, but memory doesn't do anything crazy, which is consistent with spending the time doing a big grep over the primary index while processing the LIKE constraints, and that's the behavior I was hoping for in the worst case.
For uncommon words (e.g., any word not part of the lipsum corpus) the ngram index is effective and results are instantaneous. For sufficiently constrained queries (e.g., adding a priority and a tag so we hit a few hundred results instead of 28,000) the indexes seem to get used sensibly and results are also instantaneous.
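As an illustration of the strategy described above, here's a tiny Python model (this is a hypothetical sketch, not the Ferret engine's actual implementation, which does the equivalent with SQL JOINs against the ngram tables): break the query into character trigrams, intersect posting lists from the ngram index to get candidates, then verify each candidate with a substring check, the moral equivalent of the trailing LIKE constraint. Uncommon trigrams yield small candidate sets; a corpus where every document shares the same trigrams degrades toward checking everything.

```python
# Hypothetical sketch of trigram-index search. Names and data are invented
# for illustration; the real engine does this with JOINs in MySQL.

def ngrams(text, n=3):
    """Break a string into overlapping character n-grams."""
    text = " " + text.lower() + " "  # pad so word boundaries produce ngrams
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def build_index(documents):
    """Map each ngram to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for gram in ngrams(text):
            index.setdefault(gram, set()).add(doc_id)
    return index

def search(index, documents, query):
    """Intersect ngram posting lists, then verify candidates with a
    substring check (the moral equivalent of the LIKE constraint)."""
    candidates = None
    for gram in ngrams(query):
        postings = index.get(gram, set())
        candidates = postings if candidates is None else candidates & postings
    return sorted(d for d in (candidates or set())
                  if query.lower() in documents[d].lower())

docs = {1: "lorem ipsum dolor", 2: "lorem lorem amet", 3: "unrelated text"}
idx = build_index(docs)
print(search(idx, docs, "lorem"))  # -> [1, 2]
print(search(idx, docs, "dolor"))  # -> [1]
```

The pathological lipsum case falls out of this directly: when the query's trigrams appear in nearly every document, the intersection barely narrows anything and the substring-verification pass has to grind through most of the corpus.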
The actual indexes are fairly large (3GB for ~500MB of task data, although lipsum tasks have an exceptionally large amount of text).
None of this is perfect, but it all seems better than InnoDB fulltext in terms of having mostly reasonable behaviors and not completely exploding for no reason.
I briefly hit a bizarre case where a Ferret engine query took 10 seconds to find a document in 163 projects. However, running ANALYZE TABLE on the ngrams table resolved this completely. I suspect the ngrams join may require some tweaking (and maybe a bin/storage analyze). Analyzing the Maniphest table actually seems to improve performance by ~50% too (???) although that's not a very scientific measurement.
Here's a closer look at what's probably happening:
```
mysql> show indexes from phriction_document_fngrams;
+----------------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                      | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| phriction_document_fngrams |          0 | PRIMARY    |            1 | id          | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
| phriction_document_fngrams |          1 | key_ngram  |            1 | ngram       | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
| phriction_document_fngrams |          1 | key_ngram  |            2 | documentID  | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
| phriction_document_fngrams |          1 | key_object |            1 | documentID  | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
+----------------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)

mysql> analyze table phriction_document_fngrams;
+--------------------------------------------+---------+----------+----------+
| Table                                      | Op      | Msg_type | Msg_text |
+--------------------------------------------+---------+----------+----------+
| local_phriction.phriction_document_fngrams | analyze | status   | OK       |
+--------------------------------------------+---------+----------+----------+
1 row in set (0.01 sec)

mysql> show indexes from phriction_document_fngrams;
+----------------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                      | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| phriction_document_fngrams |          0 | PRIMARY    |            1 | id          | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
| phriction_document_fngrams |          1 | key_ngram  |            1 | ngram       | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
| phriction_document_fngrams |          1 | key_ngram  |            2 | documentID  | A         |        3560 | NULL     | NULL   |      | BTREE      |         |               |
| phriction_document_fngrams |          1 | key_object |            1 | documentID  | A         |          46 | NULL     | NULL   |      | BTREE      |         |               |
+----------------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
```
Note that the cardinality of the key_object index's documentID column drops from 3560 (inaccurately estimating a unique value for each row) to 46 (more accurately estimating a unique value for each document) after analysis.
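A rough sketch of why the stale statistic hurts: MySQL's optimizer estimates rows-per-key-value as roughly table rows divided by index cardinality, so the pre-ANALYZE numbers tell it that a documentID lookup returns about one row when it actually returns dozens, which skews join-order decisions. (The 3560-row figure comes from the output above; the per-key arithmetic is a simplification of what the optimizer actually does.)

```python
# Approximate per-key row estimate: table rows / index cardinality.
# Numbers are taken from the SHOW INDEXES output above.
table_rows = 3560

cardinality_before = 3560  # stale: one distinct documentID per row
cardinality_after = 46     # accurate: one distinct documentID per document

rows_per_key_before = table_rows / cardinality_before
rows_per_key_after = table_rows / cardinality_after

print(rows_per_key_before)        # -> 1.0
print(round(rows_per_key_after))  # -> 77
```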
All search on this install that was previously driven by InnoDB FULLTEXT is now driven by the Ferret engine.
> The actual indexes are fairly large (3GB for ~500MB of task data, although lipsum tasks have an exceptionally large amount of text).
D18580 improved this somewhat, but the tables are still large. On this install, 319MB of task data produces 1329MB of index data, i.e. the indexes are about 4x larger than the tasks.
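The "about 4x" figure is just the ratio of those two measurements:

```python
# Index-to-data size ratio from the numbers above.
task_data_mb = 319
index_data_mb = 1329

print(round(index_data_mb / task_data_mb, 1))  # -> 4.2
```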
This is generally fine -- trading away disk space, which is cheap, to get an index which doesn't explode all the time is broadly desirable (that is, any time we can trade disk space for basically any other benefit, we're probably eager to do so).
This data is also extremely compressible, and the backup size increases by a much smaller amount (3.6GB before Ferret engine indexing to 3.9GB afterwards on this install). Much of the growth is index structure rather than row data; MySQL constructs it implicitly, so it isn't present in the backup at all. The largest tables (like the changeset data) also aren't indexed and don't expand, so even at runtime this is far from a 4x increase in total size. Finally, we'll be able to drop the old InnoDB tables later, which will reclaim some space, although they are not exceptionally large.
However, this presents a possible deployment problem for the Phacility cluster, as some of the older shards may not have the free space to accommodate this increase in data size (see T12932). We're also already populating these indexes for new objects, and thus rocketing toward our doom in some sense.
I expect to complete T12932 this week since that's the lowest hanging change, but the actual deployment schedule for the Ferret engine may end up being complicated by this: I don't want to unprototype it if the cluster isn't prepared to upgrade into it. We have two well-defined pathways available to resolve this by increasing the available space (either moving instances to separate shards as in T12817, or upgrading the volume) but both involve a meaningful amount of downtime for instances on those shards so I'd prefer to find less disruptive solutions if they're reasonably available.