Combine the two different ngram-splitting algorithms into a single engine

Description

Summary:
Ref T13501. Depends on D21127. With the "prefix" behavior removed in D21127, we now have two virtually identical copies of the same code.

The newer one in Ferret is better: it slices UTF-8 correctly and is slightly more efficient on large inputs. Pull it out into a single shared engine and make all callers use it.
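
For readers unfamiliar with why UTF-8 slicing matters here, the following is a minimal illustrative sketch in Python, not the Phabricator code. The function names, the trigram size, and the space-padding of tokens are assumptions made for the example. It contrasts splitting on characters with splitting on raw bytes, which can cut a multi-byte character in half.

```python
def split_ngrams(text: str, n: int = 3) -> list[str]:
    """Return the distinct n-grams of each whitespace-separated token.

    Hypothetical helper: slices by code point, so a multi-byte UTF-8
    character is never split. Tokens are padded with one space on each
    side (an assumption for this sketch) so word boundaries show up
    in the n-grams.
    """
    seen: dict[str, None] = {}  # dict keeps first-seen order and dedupes
    for token in text.lower().split():
        padded = f" {token} "
        for i in range(len(padded) - n + 1):
            seen.setdefault(padded[i:i + n], None)
    return list(seen)


def split_ngrams_bytewise(text: str, n: int = 3) -> list[bytes]:
    """Naive byte-oriented splitter, shown only to illustrate the bug.

    Windows are taken over the UTF-8 bytes, so a window may begin or
    end in the middle of a multi-byte character.
    """
    data = text.lower().encode("utf-8")
    return [data[i:i + n] for i in range(len(data) - n + 1)]


if __name__ == "__main__":
    print(split_ngrams("naïve"))
    print(split_ngrams_bytewise("naïve"))
```

In the byte-wise variant, some windows over "naïve" are not valid UTF-8 on their own, so they cannot be matched reliably against correctly encoded query text. Slicing by character avoids this.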

Test Plan:

  • Grepped for all affected symbols.
  • Ran bin/search index --force ... to reindex various objects (tasks, files).
  • Searched for things in the UI.

Maniphest Tasks: T13501

Differential Revision: https://secure.phabricator.com/D21128