Remove broken and unfixable "prefix" ngram behavior
ClosedPublic

Authored by epriestley on Apr 16 2020, 4:32 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T13501. The older ngram code has some "prefix" behavior that tries to handle cases where a user issues a very short (one or two character) query.

This code doesn't work, presumably never worked, and cannot be made to work (or, at least, I don't see a way, and am fairly sure one does not exist).

If the user searches for "xy", we can find trigrams in the form "xy*" using the index, but not in the form "*xy". The code makes a misguided effort to look for " xy", but this will only find "xy" in words that begin with "xy", like "xylophone".

For example, searching Files for "om" does not currently find "random.txt".
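The failure can be reproduced with a toy trigram index. This is a minimal sketch, not Phabricator's actual implementation: the tokenization (split on whitespace, pad each word with one leading space) and the names `trigrams`, `build_index`, `prefix_lookup`, and `substring_scan` are all assumptions made for illustration.

```python
from collections import defaultdict


def trigrams(text):
    """Return the trigrams of each whitespace-separated word.

    Assumption: each word is padded with one leading space, so the
    first trigram of "xylophone" is " xy".
    """
    grams = set()
    for word in text.lower().split():
        padded = " " + word
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def build_index(names):
    """Map each trigram to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for gram in trigrams(name):
            index[gram].add(name)
    return index


def prefix_lookup(index, query):
    """The flawed strategy for a two-character query: look up " " + query."""
    return set(index.get(" " + query.lower(), set()))


def substring_scan(names, query):
    """Ground truth: every name that contains the query anywhere."""
    return {name for name in names if query.lower() in name.lower()}


# Hypothetical file names, chosen to mirror the examples in the summary.
names = ["random.txt", "xylophone.mp3", "omega.png"]
index = build_index(names)

prefix_hits = prefix_lookup(index, "om")
scan_hits = substring_scan(names, "om")
```

Under these assumptions, `prefix_hits` contains only `omega.png`, while `scan_hits` also contains `random.txt`. The trigrams that cover the "om" in "random" are "dom" and "om.", and neither can be derived from the two-character query alone.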

Remove this behavior. Without engaging the trigram index, these queries fall back to an unindexed "LIKE" table scan, but that's about the best we can do.
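A rough sketch of what such a fallback scan looks like, using an in-memory SQLite database so the example is self-contained. The table name `files`, the column `name`, and the helper `like_scan` are hypothetical and are not taken from the Phabricator schema or query code.

```python
import sqlite3


def like_scan(conn, query):
    """Find names containing `query` with an unindexed LIKE scan.

    LIKE wildcards in the user's input are escaped so they match literally.
    """
    escaped = (
        query.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT name FROM files WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]


# Hypothetical data mirroring the example in the summary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT)")
conn.executemany(
    "INSERT INTO files (name) VALUES (?)",
    [("random.txt",), ("xylophone.mp3",), ("omega.png",)],
)

like_hits = like_scan(conn, "om")
```

Because the pattern starts with a wildcard, an ordinary index on the column cannot narrow the search, so every row is examined. That cost is the trade-off for returning correct results on very short queries.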

Test Plan

Searched for "om", hit "random.txt".

Diff Detail

Repository
rP Phabricator
Branch
search2
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 24130
Build 33226: Run Core Tests
Build 33225: arc lint + arc unit