Combine the two different ngram-splitting algorithms into a single engine
Closed
Public

Authored by epriestley on Apr 16 2020, 4:38 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T13501. Depends on D21127. With the "prefix" behavior removed in D21127, we now have two virtually identical copies of the same code.

The newer copy, in Ferret, is better: it slices UTF-8 strings correctly (on character boundaries rather than byte boundaries) and is slightly more efficient on large inputs. Pull it out into a single engine and make all callers call into it.
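For readers unfamiliar with ngram splitting, here is a minimal Python sketch of the general technique. It is not the code from this revision: the function name `split_ngrams`, the trigram default, the lowercasing, and the space padding around each token are all assumptions made for illustration. The point it shows is that slicing by character keeps multi-byte UTF-8 sequences intact, which slicing by byte offset does not guarantee.

```python
def split_ngrams(text: str, n: int = 3) -> list[str]:
    """Return the unique n-grams of each whitespace-separated token.

    Hypothetical helper for illustration only; not Phabricator's API.
    """
    seen: dict[str, None] = {}  # dict preserves first-seen order
    for token in text.lower().split():
        # Assumption: pad each token with spaces so that word boundaries
        # also produce n-grams (e.g. " ca" and "at " for "cat").
        padded = f" {token} "
        # Python str is indexed by code point, so these slices can never
        # cut a multi-byte UTF-8 sequence in half.
        for i in range(len(padded) - n + 1):
            seen.setdefault(padded[i:i + n], None)
    return list(seen)
```

A byte-oriented splitter would instead slice the encoded string at fixed byte offsets, which can split a character such as "é" (two bytes in UTF-8) across two ngrams and yield invalid sequences.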

Test Plan
  • Grepped for all affected symbols.
  • Ran bin/search index --force ... to reindex various objects (tasks, files).
  • Searched for things in the UI.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This revision was not accepted when it landed; it landed in state Needs Review. (Apr 16 2020, 4:45 PM)
This revision was automatically updated to reflect the committed changes.