
Put some kind of stemmer on the MySQL search index
Closed, Resolved, Public

Description

Concretely, I just searched for "tokens" and missed a hit on "token", which is trivially resolvable with stemming. We can drop in Porter, or whatever the state of the art is, pretty easily.
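To make the miss concrete, here is a minimal sketch in Python (not the actual indexer code). It assumes the same stemmer is applied both when indexing and when querying. `naive_stem`, `index_terms`, and `matches` are hypothetical names, and `naive_stem` only strips a trailing plural "s"; it is a stand-in for Porter, not an implementation of it.

```python
import re


def naive_stem(word):
    """Hypothetical toy stemmer: lowercase, then strip one trailing 's'.

    This is NOT Porter; it only illustrates the idea of normalizing
    word variants to a shared stem.
    """
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def index_terms(text):
    """Return the set of stemmed terms that would be stored for a document."""
    return {naive_stem(w) for w in re.findall(r"[A-Za-z0-9]+", text)}


def matches(query, text):
    """True if every stemmed query term appears among the document's terms."""
    return index_terms(query) <= index_terms(text)


doc = "Add a token to the request"
print(matches("tokens", doc))
```

Without the stemming step, the query term "tokens" would be compared against the stored term "token" verbatim and would not match.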

D10955 is pursuing proper support for a real index under ElasticSearch, but I don't really want to deal with ElasticSearch on this install or in the Phacility cluster until much later: the cost of adding a new type of service tier is likely far higher than the cost of dropping a stemmer into the indexer.

Event Timeline

epriestley raised the priority of this task to Low.
epriestley updated the task description.
epriestley added a project: Search.
epriestley added subscribers: epriestley, chad.

The stemmer should probably make an effort to preserve exact phrase search. Likely, this means special handling of quoted terms when tokenizing and stemming the query.
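One way the quoted-term handling could look is sketched below in Python, under assumptions: `parse_query` and `stem` are hypothetical names, the placeholder stemmer only strips a trailing plural "s", and only double quotes delimit phrases. Quoted phrases are pulled out verbatim for exact matching, and only the remaining loose terms are stemmed.

```python
import re


def stem(word):
    """Hypothetical placeholder stemmer; a real one (e.g. Porter) would go here."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def parse_query(query, stemmer=stem):
    """Split a query into exact phrases (unstemmed) and stemmed loose terms."""
    # Quoted phrases are kept exactly as typed.
    phrases = re.findall(r'"([^"]+)"', query)
    # Remove the quoted spans, then tokenize and stem whatever is left.
    remainder = re.sub(r'"[^"]*"', " ", query)
    terms = [stemmer(w) for w in re.findall(r"[A-Za-z0-9]+", remainder)]
    return {"phrases": phrases, "terms": terms}


print(parse_query('"access tokens" expired sessions'))
```

In this sketch an unbalanced quote is simply treated as ordinary text, so its words are stemmed like any other loose term. Matching a preserved phrase exactly would also require the unstemmed text to remain available somewhere, which this example does not address.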

rfergu added a subscriber: rfergu. (Mar 31 2015, 2:42 AM)
ftdysa added a subscriber: ftdysa. (Aug 16 2016, 1:46 PM)
isfs added a subscriber: isfs. (Nov 2 2016, 2:08 AM)
epriestley moved this task from Backlog to v2 on the Search board. (Dec 8 2016, 6:59 PM)
epriestley closed this task as Resolved. (Dec 8 2016, 7:04 PM)
epriestley claimed this task.

Relatively little doom appears to have befallen us.