Upgrading: "Ferret" Fulltext Engine
Closed, Resolved (Public)

Assigned To
Authored By
epriestley
Sep 1 2017, 5:05 PM

Description

Starting with 2017 Week 37, a new fulltext search engine (the "Ferret" engine) has replaced the older MySQL FULLTEXT engines.

Operations Impact

  • After upgrading, search indexes must be rebuilt.
  • In some cases, Phabricator may require significantly more database disk space than before.
  • Some fulltext-related features have changed in a way that is not backwards compatible, and may need adjustment to upgrade to the newer features.
  • See below for more details and discussion.

New Features and Motivation

These are new features the engine adds:

  • (T6721) You can search a particular field, like task titles, with title:platypus.
  • You can search for a substring with the ~ operator, like ~latypu.
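To make the two new operators concrete, here is a minimal Python sketch that splits a query string into field, operator, and term. It only illustrates the syntax described above; it is not how the Ferret engine itself parses queries. The `Token` type and the `classify` and `parse` helpers are hypothetical names, and combining a field prefix with `~` in one token is an assumption.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    field: Optional[str]  # e.g. "title", or None when no field was given
    substring: bool       # True when the term was prefixed with "~"
    term: str


def classify(raw: str) -> Token:
    """Split one whitespace-delimited search token into its parts."""
    field = None
    # A "name:" prefix restricts the term to a single field.
    if ":" in raw:
        prefix, rest = raw.split(":", 1)
        if prefix.isalpha() and rest:
            field, raw = prefix, rest
    # A leading "~" requests a substring match instead of a term match.
    substring = raw.startswith("~")
    term = raw[1:] if substring else raw
    return Token(field, substring, term)


def parse(query: str) -> List[Token]:
    """Classify every token in a query (hypothetical helper)."""
    return [classify(part) for part in query.split()]
```

With this sketch, `parse("title:platypus ~latypu")` yields one field-restricted term and one substring term.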

These are the bugs and issues this engine aims to address:

  • (T12819) InnoDB FULLTEXT scalability issues. This was the largest single motivation for these changes. See this task for more details and technical discussion.
  • (T12928) No support for very short terms like "v0.1".
  • (T12443) Searches where more than 1,000 documents matched the query terms could return too few results.
  • Searches could fail to find documents if some terms were only in the title while others were only in the description.

Upgrading

This engine has replaced the older engines, and search indexes must be rebuilt. The UI should notify you about this. See T11932 for details and guidance on rebuilding the index.

Note that the Ferret engine requires significantly more disk space for indexes than the older fulltext engines did: in extreme cases, total database storage may expand by 400% when the index is rebuilt (for example, from 10GB to 50GB). For most installs with non-pathological data, a 25%-50% increase is probably a much better estimate. Make sure you have a comfortable amount of free space before upgrading. (Some of this space will be reclaimed in the future, but for now we are also retaining the older indexes in case we need to revert parts of this.)
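As a rough planning aid, the arithmetic above can be sketched in Python. The function names are hypothetical, and the growth percentages are only the estimates quoted in this task, not measurements of any particular install.

```python
def estimate_storage_gb(current_gb: float, growth_percent: float) -> float:
    """Projected total database size after the index is rebuilt."""
    return current_gb * (1 + growth_percent / 100.0)


def required_free_gb(current_gb: float, growth_percent: float) -> float:
    """Additional disk space the rebuild is expected to consume."""
    return estimate_storage_gb(current_gb, growth_percent) - current_gb


# Extreme case from the description: a 400% expansion of a 10GB database.
worst_case = estimate_storage_gb(10, 400)

# More typical range from the description: 25% to 50% growth.
typical_low = estimate_storage_gb(10, 25)
typical_high = estimate_storage_gb(10, 50)
```

For a 10GB database, this gives 50GB in the extreme case and between 12.5GB and 15GB in the more typical range.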

Compatibility Breaks

The new Ferret engine fulltext fields have replaced some older similar fields. The new fields are more powerful, so it didn't make sense to retain the old fields. Specifically:

  • The "fullText" parameter to maniphest.query is no longer supported. Use maniphest.search with the "query" constraint instead.
  • The "Contains Words" field in Maniphest has been replaced with the new "Query" field. Saved searches which used a "Contains Words" constraint may need to be updated.
  • The "Name Contains" field in Diffusion repository search has been replaced with the new "Query" field. Saved searches which used a "Name Contains" constraint may need to be updated. Note that the "Query" field searches descriptions (not just repository names), so you may need to use title:... to closely replicate the old field behavior.

In followup changes, some other "Name Contains" fields (or other fields with similar behavior) may also be replaced with "Query" fields.
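For API callers migrating away from the "fullText" parameter, the following Python sketch builds a maniphest.search request that uses the "query" constraint. The base URL and API token are placeholders, and `build_maniphest_search` is a hypothetical helper; the sketch only constructs the request and does not send it.

```python
from typing import Tuple
from urllib.parse import urlencode


def build_maniphest_search(base_url: str, api_token: str, query: str) -> Tuple[str, bytes]:
    """Return the URL and form-encoded body for a maniphest.search call."""
    url = base_url.rstrip("/") + "/api/maniphest.search"
    params = {
        "api.token": api_token,        # placeholder token, supply your own
        "constraints[query]": query,   # replaces the old "fullText" parameter
    }
    return url, urlencode(params).encode("ascii")


# Hypothetical install URL and token, for illustration only.
url, body = build_maniphest_search(
    "https://phabricator.example.com/", "api-hypothetical", "title:platypus"
)
```

The resulting pair can be sent as a POST, for example with `urllib.request.urlopen(url, data=body)`.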

Event Timeline

chad added a subscriber: chad.

kind of sad this isn't called pherret

epriestley renamed this task from Prototype: "Ferret" Fulltext Engine to Upgrading: "Ferret" Fulltext Engine. (Sep 12 2017, 4:14 PM)
epriestley updated the task description.
epriestley removed a subscriber: tomekj2ee.

Very interesting update. Now we can search in project code, as I see? Thanks for this super-duper project, guys :)

For clarity, this has not changed or affected how search works in codebases. The Ferret engine does not index source code.

Phabricator has supported codebase search within Git and Mercurial repositories for many years. T7472 discusses codebase search across multiple repositories and SVN support.

Running the reindex led to a couple of crash-looping tasks on my end:

@epriestley
After I upgraded to the newest version, which swaps the search engine to "Ferret", I reindexed several times, but I still cannot get any results. This is my version information:
  • phabricator: 33756bcf1d70ea5579dff1ab276bbe660d10494c (Tue, Oct 3) (branched from f9110b87abf337dd1e7714d755775e53cffd4db9 on origin)
  • arcanist: 0a7f403333fe9082b39bd007b9d5f9e765c8b9ce (Tue, Oct 3) (branched from c804c5026011f27614a7bbdb2bb32cab590d68ca on origin)
  • phutil: b400c6b04bb247a3e0f1941390bc450f36ac2ccd (Tue, Oct 3) (branched from 9f9c33797a3ebbf1c4dcaa474a0c4e0b32d5396a on origin)
  • diff: 2.8.1 at /usr/bin/diff
  • git: 1.7.1 at /usr/bin/git
  • hg: Not Available
  • pygmentize: 2.0.2 at /usr/bin/pygmentize
  • svn: 1.6.11 at /usr/bin/svn

epriestley claimed this task.

I think everything we know about has been resolved.