
Ferret "Relevance" order does not always have all the columns it needs available
Closed, Resolved (Public)

Description

In this instance, any Differential query that has "Order By: Relevance" throws error #1054: Unknown column 'ft_doc.epochModified' in 'order clause'
e.g. https://secure.phabricator.com/differential/query/.orc_HwcEyo_/#R

(From https://discourse.phabricator-community.org/t/unhandled-exception-if-using-query-with-order-by-relevance/1169)

Also, what is "By Relevance"?

Event Timeline

avivey updated the task description.

When you use the "Query" field to activate Ferret fulltext search, a "Relevance" score is computed internally. Roughly, today: documents are more relevant if your terms appear in the title than if they appear only in other fields; and stemmed terms are more relevant if the exact term you searched for appears in the raw document than if the stemmed version appears in the normalized version of the document.

If you don't specify a fulltext "Query", relevance is the same as "Date Updated (Latest First)".

I think this is a couple of different issues:

  • When you don't specify a "Query" and order by "Relevance" in any application, we build a synthetic _ft_rank field but do not join ft_doc and do not build a synthetic epochModified/epochCreated field. This produces the Unknown column 'ft_doc.epochModified' in 'order clause' error. It's probably cleanest to join ft_doc if the order vector includes the fulltext date columns.
  • When you do specify a "Query" and order by "Relevance" in Differential, and have "Responsible Users" nonempty, we build a query of the form (SELECT ...) UNION DISTINCT (SELECT ...) ORDER BY ... and order the results by a column which was not selected. I think we can fix this by making the epochCreated and epochModified handling more similar to the _ft_rank handling: select them with _ft_* aliases, then remove them from the returned results.

The second issue causes the error #1054: Unknown column 'epochModified' in 'order clause' (note no ft_doc) which isn't exactly the same as the first issue.
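The two failure modes above can be reproduced in miniature. The following is only a sketch: it uses SQLite (via Python's sqlite3) as a stand-in for MySQL, so the error text differs from #1054, and the revision table, its columns, and the sample rows are hypothetical rather than Phabricator's real schema. Only ft_doc, epochModified, and the _ft_* alias idea come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema; not Phabricator's actual tables.
conn.executescript("""
    CREATE TABLE revision (id INTEGER PRIMARY KEY, authorPHID TEXT, reviewerPHID TEXT);
    CREATE TABLE ft_doc (objectID INTEGER, epochCreated INTEGER, epochModified INTEGER);
    INSERT INTO revision VALUES (1, 'alice', 'bob'), (2, 'bob', 'alice'), (3, 'carol', 'alice');
    INSERT INTO ft_doc VALUES (1, 100, 300), (2, 110, 200), (3, 120, 400);
""")

def run(sql):
    """Return the rows, or the database error message as a string."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as e:
        return "error: " + str(e)

JOIN = "FROM revision r JOIN ft_doc ON ft_doc.objectID = r.id"

# Issue 1: the order clause references ft_doc, but ft_doc was never joined.
missing_join = run("SELECT r.id FROM revision r ORDER BY ft_doc.epochModified DESC")

# Possible fix 1: join ft_doc whenever the order uses the fulltext date columns.
joined = run(f"SELECT r.id {JOIN} ORDER BY ft_doc.epochModified DESC")

# Issue 2: a UNION ordered by a column that neither branch selects.
union_bad = run(
    f"SELECT r.id {JOIN} WHERE r.authorPHID = 'alice' "
    f"UNION SELECT r.id {JOIN} WHERE r.reviewerPHID = 'alice' "
    "ORDER BY epochModified DESC"
)

# Possible fix 2: select the column under an _ft_* alias, order by the alias,
# then strip the synthetic column from the returned rows.
union_rows = run(
    f"SELECT r.id AS id, ft_doc.epochModified AS _ft_epochModified {JOIN} "
    "WHERE r.authorPHID = 'alice' "
    f"UNION SELECT r.id, ft_doc.epochModified {JOIN} "
    "WHERE r.reviewerPHID = 'alice' "
    "ORDER BY _ft_epochModified DESC"
)
union_ids = [row[0] for row in union_rows]

print(missing_join)
print(joined)
print(union_bad)
print(union_ids)
```

In this sketch both broken queries return an error string, while the joined and aliased variants return rows ordered by modification time.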

epriestley renamed this task from Exception in search in Differential to Ferret "Relevance" order does not always have all the columns it needs available. Feb 24 2018, 3:24 PM
epriestley triaged this task as Normal priority.
epriestley added a project: Search.

Actually, I'm not entirely right in merging that task -- T13163 isn't quite the same as the other two issues here. I think they're similar, but the query text is relevant in the case of T13163. Notably:

That's a little bizarre. My guess is that we're mishandling stopwords somehow, but this one is probably a little more tractable than the other two. It's also more realistic to hit and more fundamentally useful.

(I spent some time digging into the other two, but didn't arrive at a solution I was very happy with. Since they're not especially meaningful queries I just moved on for the moment.)

Trying to reproduce this locally just hits the ft_doc.epochModified issue. I'm not immediately sure why the behavior differs between my local install and secure, but that issue probably needs to be fixed first.

Also, what is "By Relevance"?

Broadly, "Relevance" means "fulltext query relevance" and sorts results which are better matches to your query terms above results which don't match as well. For example, if you search for "turtle", a task named "turtle" will be sorted above a task that has "turtle" in a comment but does not have "turtle" in the task name.

Note that the "Relevance" ordering isn't very meaningful when you don't specify a query, since no task has more or less relevance to an empty query than any other task. When multiple results have equal relevance, they are sorted by modified date and then by ID. In the case of "no query, sorted by relevance", all results effectively have the same relevance.
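The tie-breaking described above can be sketched as a compound sort key. This is an illustration only: the Result type, the rank values, and the choice of descending ID order are assumptions made for the example, not Ferret's actual implementation. The "latest first" direction for the modified date follows the earlier remark that relevance without a query behaves like "Date Updated (Latest First)".

```python
from dataclasses import dataclass

@dataclass
class Result:
    id: int
    rank: int            # hypothetical relevance score; higher is a better match
    epoch_modified: int  # last-modified timestamp

def relevance_order(results):
    # Sort by relevance first, then by modified date (latest first),
    # then by ID (descending here, which is an assumption).
    return sorted(
        results,
        key=lambda r: (r.rank, r.epoch_modified, r.id),
        reverse=True,
    )

# With a query: result 4 matches best, so it sorts first despite being oldest.
with_query = [Result(1, 0, 300), Result(2, 0, 500), Result(3, 0, 500), Result(4, 2, 100)]
ordered_ids = [r.id for r in relevance_order(with_query)]

# Without a query: every rank is equal, so only date and ID decide the order.
no_query = [Result(1, 0, 300), Result(2, 0, 500), Result(3, 0, 500)]
no_query_ids = [r.id for r in relevance_order(no_query)]

print(ordered_ids)
print(no_query_ids)
```

When every result has the same rank, the first key component never distinguishes two rows, which is why "no query, sorted by relevance" collapses to a date ordering.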

That's a little bizarre ("whatever" vs "whomever").

This appears to be an internal paging/filtering issue stemming from the WHERE/HAVING thing. I believe it is resolved by D20298.


D20296 fixes this issue:

  • In Maniphest (or other applications which support fulltext search), if you don't specify a "Query", and order by "Relevance", you get an exception. We now show you a result list, although the order isn't very meaningful (see above).

D20297 fixes this issue:

  • In Differential, if you do specify a "Query", and "Bucket: Bucket by Required Action", you get an exception. We now process this query correctly.

D20298 fixes this issue:

  • In Maniphest (or other applications which support fulltext search), if you specify a "Query" that matches more than 100 results, and order by "Relevance", and click "Next Page", you get an exception. We now process this query correctly.

I think those cover everything anyone has hit here.

epriestley claimed this task.