
Fix an issue with selecting the right stemmed ngrams with Ferret engine queries

Authored by epriestley on Sep 12 2017, 2:48 PM.



Ref T12819. In D18581, I corrected one bug (ngram selection for terms) but introduced a minor new bug. We now pass ' query ' (term corpus with boundary spaces) to the stemmer, but it bails out on this since English words don't start with spaces.

Trim these extra boundary spaces off before invoking the stemmer.

The practical effect of this is that searching for a non-stem variation of a word ("detection") now finds the stemmed variation ("detect") again. Prior to fixing this bug, searching for the stem ("detect") could find longer variations ("detection"), but not the other way around.
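As a rough illustration of the bug and the fix, here is a minimal Python sketch. It is not the Phabricator implementation: `toy_stem`, `trigrams`, and `stemmed_ngrams` are hypothetical names, the stemmer is a toy stand-in that only strips a few suffixes, and the use of 3-character ngrams is an assumption made for the example. It shows only how passing a term with boundary spaces makes a stemmer bail out, and how trimming first restores matching ngrams.

```python
import re


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer.

    Like the behavior described above, it gives up (returns the input
    unchanged) when the input is not a plain word, e.g. ' detection '.
    """
    if not re.fullmatch(r"[a-z]+", word):
        return word  # bail out: not a plain word
    for suffix in ("ion", "ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def trigrams(term_corpus: str) -> list[str]:
    """Split a term corpus into overlapping 3-character ngrams (assumed size)."""
    return [term_corpus[i:i + 3] for i in range(len(term_corpus) - 2)]


def stemmed_ngrams(raw_term: str, trim: bool) -> list[str]:
    """Build ngrams for the stemmed form of a query term.

    trim=False mimics the bug: the stemmer sees ' term ' with spaces.
    trim=True mimics the fix: boundary spaces are removed before stemming.
    """
    corpus = f" {raw_term.lower()} "  # term corpus with boundary spaces
    to_stem = corpus.strip() if trim else corpus
    stem = toy_stem(to_stem)
    # Re-add the boundary spaces so ngrams still mark word edges.
    return trigrams(f" {stem.strip()} ")


# Without trimming, "detection" is never stemmed, so its ngrams differ
# from those of "detect"; with trimming, both produce the same ngrams.
buggy_match = stemmed_ngrams("detection", trim=False) == stemmed_ngrams("detect", trim=False)
fixed_match = stemmed_ngrams("detection", trim=True) == stemmed_ngrams("detect", trim=True)
```

In this sketch `buggy_match` is false and `fixed_match` is true, which mirrors the behavior change described in the summary.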

Test Plan

Searched for "detection", found results matching "detect" after patch (and saw same results for "detect" and "detection").

Diff Detail

rP Phabricator
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

epriestley created this revision. Sep 12 2017, 2:48 PM
chad accepted this revision. Sep 12 2017, 5:59 PM
This revision is now accepted and ready to land. Sep 12 2017, 5:59 PM
This revision was automatically updated to reflect the committed changes.