Compile `_...` search tokens as substring searches
Closed, Resolved (Public)

Description

See PHI2017. An install reports that a user searching for __FILE__ didn't get the results they expected.

The "correct" search here is a substring search, ~"__FILE__", which works properly.

A search for __FILE__ is stemmed to be equivalent to a search for file ("probably not what the user wanted") because we'd like a search for file to find __FILE__ ("almost certainly what the user wanted"), and stemming is symmetric.

Fixing this in the stemmer seems difficult, since we'd need to break symmetry. For example, we could imagine an "indexing stemmer" and a "searching stemmer":

indexing-stemmer("__FILE__") -> "file", "__file__"
searching-stemmer("__FILE__") -> "__file__"
searching-stemmer("file") -> "file"

This is a great deal of complexity to add to support this fairly narrow use case, and I think the complexity may get much worse when trying to do quoted matches.
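To make the asymmetry concrete, here is a minimal Python sketch of the two hypothetical stemmers described above. The function names and the toy normalization (lowercase, strip surrounding underscores) are assumptions for illustration only; this is not the actual stemmer.

```python
def normalize(token: str) -> str:
    """Toy normalization (an assumption): lowercase and strip surrounding underscores."""
    return token.lower().strip("_")


def indexing_stemmer(token: str) -> set:
    """Terms written to the index: the normalized form plus the literal lowercase form."""
    return {normalize(token), token.lower()}


def searching_stemmer(token: str) -> str:
    """Term looked up at query time: only the literal lowercase form."""
    return token.lower()


def matches(document_token: str, query_token: str) -> bool:
    """A document token matches if the query's search term is among its indexed terms."""
    return searching_stemmer(query_token) in indexing_stemmer(document_token)
```

With this split, a search for file still finds a document containing __FILE__, but a search for __FILE__ no longer finds a document that only contains file. The cost is that indexing and searching no longer share one code path, which is the complexity noted above.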

A narrower fix is to always treat __X__ as a substring search if it doesn't appear as an argument to any other search function. This seems reasonable, since a user searching for __X__ almost certainly means ~"__X__".

Event Timeline

epriestley triaged this task as Wishlist priority. Mar 10 2021, 7:35 PM
epriestley created this task.

I think we can be slightly more general about this, and assume any token beginning with _ is a substring search. This covers __FILE__, __construct, etc. Users almost certainly intend these to be substring searches.
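The rule can be sketched roughly as follows, in Python for illustration. The `CompiledToken` type, the operator names, and the `compile_token` function are hypothetical and do not reflect the real query compiler; the sketch only shows that a bare token beginning with _ compiles as a substring search, while tokens already wrapped in another search function are left alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledToken:
    operator: str  # hypothetical operator names: "substring", "quoted", or "term"
    value: str


def compile_token(raw: str) -> CompiledToken:
    """Compile one raw query token into a (hypothetical) search operation."""
    # Explicit substring search, e.g. ~"__FILE__": keep the existing behavior.
    if raw.startswith("~"):
        return CompiledToken("substring", raw[1:].strip('"'))
    # Quoted token: already an argument to another search function, so leave it alone.
    if raw.startswith('"'):
        return CompiledToken("quoted", raw.strip('"'))
    # New rule: a bare token beginning with "_" is treated as a substring search.
    if raw.startswith("_"):
        return CompiledToken("substring", raw)
    # Everything else goes through the normal (stemmed) term path.
    return CompiledToken("term", raw.lower())
```

Under these assumptions, `compile_token("__FILE__")` and `compile_token('~"__FILE__"')` produce the same substring operation, while `compile_token("file")` still takes the ordinary term path.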

Seems like it works:

[Screenshot: Screen Shot 2021-03-10 at 11.59.08 AM.png]

epriestley renamed this task from Compile `__X__` search tokens as substring searches to Compile `_...` search tokens as substring searches. Mar 10 2021, 8:01 PM