Allow extensions to define new document fields (like "title:") in Ferret search
Closed, Resolved. Public.

Description

See PHI1693. Previously, see T13509 for "field present" and "field absent" operators.

Currently, extensions cannot define new custom Ferret fields like "animal-noises:moo". They can get these fields into the Ferret index, and the content can be searched for, but they can't provide field functions and can't make animal-noises:~ match "any animal which makes a noise".

There are also a couple of adjacent upstream tasks (T13501, T13503) which likely have a lot of testing overlap.

Event Timeline

epriestley triaged this task as Normal priority. Apr 16 2020, 3:04 PM
epriestley created this task.

These field functions have somewhat-weird scopes/context.

I think it makes sense that searching for "animal-noises:moo" in global search should always work, even if you don't specifically restrict the result set to object types that actually have this field. The actual execution engine will get the right result, broadly (objects which do not have this field will not appear in the result set), although you could debate whether animal-noises:- ("field is absent on this object") should match objects which can never have a field value. However, the engines will actually issue queries to arrive at this empty result. It would be cleaner to skip engines which don't support the function (and I think animal-noises:- matching only objects which could possibly have a value is a more intuitive interpretation).
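To make the two readings of animal-noises:- concrete, here is a minimal sketch. Everything in it is hypothetical: the row layout, the `supports` flag, and the function name are illustrations, not Ferret internals. With `$strict` enabled, the operator only matches objects which could carry the field; without it, it also matches objects which can never have a value.

```php
<?php

// Hypothetical sketch: each row describes one indexed object.
//   'supports' => whether this object type can ever carry the field
//   'value'    => the stored field value, or null if none is stored
function sketch_match_absent(array $rows, $strict) {
  $matches = array();

  foreach ($rows as $name => $row) {
    // "field:-" never matches an object which has a value.
    if ($row['value'] !== null) {
      continue;
    }

    // Strict reading: ignore objects which can never have this field.
    if ($strict && !$row['supports']) {
      continue;
    }

    $matches[] = $name;
  }

  return $matches;
}

$rows = array(
  'cow-task' => array('supports' => true, 'value' => 'moo'),
  'rock-task' => array('supports' => true, 'value' => null),
  'wiki-page' => array('supports' => false, 'value' => null),
);

$strict_matches = sketch_match_absent($rows, true);
$loose_matches = sketch_match_absent($rows, false);
```

Under the strict reading only `rock-task` matches; under the loose reading `wiki-page` matches too, which is the behavior that seems less intuitive.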

This implies that fields might have a function like supportsObject($object) { return ($object instanceof Whatever); }, but this is easiest if each field has a class (TitleField, BodyField, etc).
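A minimal sketch of the class-per-field shape, assuming a hypothetical base class. None of these class or function names exist upstream; `ExampleTask` and `ExampleRevision` are stand-ins for real object types.

```php
<?php

// Hypothetical stand-ins for real object types.
final class ExampleTask {}
final class ExampleRevision {}

// Hypothetical base class: one subclass per Ferret field.
abstract class ExampleFerretField {
  abstract public function getFieldKey();
  abstract public function supportsObject($object);
}

final class ExampleTitleField extends ExampleFerretField {
  public function getFieldKey() {
    return 'title';
  }

  // In this sketch, every object type has a title.
  public function supportsObject($object) {
    return true;
  }
}

final class ExampleAnimalNoisesField extends ExampleFerretField {
  public function getFieldKey() {
    return 'animal-noises';
  }

  // In this sketch, only tasks make noises.
  public function supportsObject($object) {
    return ($object instanceof ExampleTask);
  }
}

// Return the keys of the fields which apply to a given object.
function example_supported_field_keys(array $fields, $object) {
  $keys = array();
  foreach ($fields as $field) {
    if ($field->supportsObject($object)) {
      $keys[] = $field->getFieldKey();
    }
  }
  return $keys;
}

$fields = array(
  new ExampleTitleField(),
  new ExampleAnimalNoisesField(),
);

$task_keys = example_supported_field_keys($fields, new ExampleTask());
$revision_keys = example_supported_field_keys($fields, new ExampleRevision());
```

Here a task supports both `title` and `animal-noises`, while a revision supports only `title`.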

I think each field should definitely not require a class, since the option should at least exist for configuration-defined custom fields to emit fulltext search functions some day. If you define "Animal Noises" in JSON somewhere, a pathway should exist toward generating the corresponding Ferret function at runtime.

Also, it seems like it should be permissible for two different applications to support the same field. That is, if you define "Animal Noises" in Differential and later decide it's such a good field you want to support it in Maniphest too, you should be able to define a mapping so that "animal-noises:moo" finds tasks and revisions which make that noise.
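One way this could look without a class per field is a plain mapping from field key to the object types which support it. This is only a sketch under assumptions: the map layout, the `Sketch*` classes, and both functions are hypothetical, and the map could equally be decoded from JSON configuration.

```php
<?php

// Hypothetical stand-ins for object types in different applications.
final class SketchTask {}
final class SketchRevision {}
final class SketchWikiPage {}

// Hypothetical configuration-defined field map.
function sketch_field_map() {
  return array(
    'animal-noises' => array(
      'name' => 'Animal Noises',
      'objects' => array('SketchTask', 'SketchRevision'),
    ),
  );
}

// Test whether an object supports a field, using only the map.
function sketch_field_supports_object(array $map, $field_key, $object) {
  if (!isset($map[$field_key])) {
    return false;
  }

  foreach ($map[$field_key]['objects'] as $class) {
    if ($object instanceof $class) {
      return true;
    }
  }

  return false;
}

// Keep only the template objects whose engines are worth querying.
function sketch_filter_templates(array $map, $field_key, array $templates) {
  $keep = array();
  foreach ($templates as $template) {
    if (sketch_field_supports_object($map, $field_key, $template)) {
      $keep[] = get_class($template);
    }
  }
  return $keep;
}

$map = sketch_field_map();
$templates = array(
  new SketchTask(),
  new SketchRevision(),
  new SketchWikiPage(),
);

$queried = sketch_filter_templates($map, 'animal-noises', $templates);
```

With this map, "animal-noises:moo" would query tasks and revisions and skip wiki pages entirely, rather than issuing a query which can only return an empty result.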

So where does the logic for figuring out which objects support which fields live?

One additional issue: in FerretEngine, we can only build a template object via $this->newSearchEngine()->newQuery()->newResultObject(), which is fairly sketchy, although PhabricatorSearchEngineExtension (which invokes the FerretEngine during search) has an $object in context, as does PhabricatorFerretFulltextStorageEngine::executeSearch().

I don't see much of a way out of this without support for defining fields through either subclassing or configuration.