Separate sever-side typeahead queries into "prefix" and "content" phases


Separate sever-side typeahead queries into "prefix" and "content" phases

Ref T8510. When users type "platypus" into a typeahead, they want "Platypus Playground" to be a higher-ranked match than "AAA Platypus", even though the latter is alphabetically first.

Specifically, the rule is: results which match the query as a prefix of the result text should rank above results which do not.

I believe we now always get this right on the client side. However, WMF has at least one case (described in T8510) where we do not get it right on the server side, and thus the user sees the wrong result.

The remaining issue is that if "platypus" matches more than 100 results, the result "Platypus Playground" may not appear in the result set at all, beacuse there are 100 copies of "AAA Platypus 1", "AAA Platypus 2", etc., first. So even though the client will apply the correct sort, it doesn't have the result the user wants and can't show it to them.

To fix this, split the server-side query into two phases:

  • In the first phase, the "prefix" phase, we find results that start with "platypus".
  • In the second phase, the "content" phase, we find results that contain "platypus" anywhere.

We skip the "prefix" phase if the user has not typed a query (for example, in the browse view).

Test Plan:
This is a lot of stuff, but the new ranking here puts projects which start with "w" at the top of the list. Lower down the list, you can see some projects which contain "w" but do not appear at the top (like "Serious Work").

Screen Shot 2016-11-10 at 8.35.27 AM.png (1×2 px, 797 KB)

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T8510

Differential Revision: https://secure.phabricator.com/D16838