Search (Project)
Active · Public

Recent Activity

Apr 17 2020

epriestley renamed T13501: Improve search index normalization of "é" and other characters with variants or multiple representations from Ngram search for "é" has slicing and collation issues with multibyte characters and multicharacter glyphs to Improve search index normalization of "é" and other characters with variants or multiple representations.
Apr 17 2020, 1:05 PM · Search
epriestley closed T13511: Allow extensions to define new document fields (like "title:") in Ferret search as Resolved.

This is now possible.

Apr 17 2020, 12:23 PM · Search
epriestley closed T13503: Index Paste documents in Ferret as Resolved.

This is now supported.

Apr 17 2020, 12:23 PM · Search, Paste
epriestley closed T13503: Index Paste documents in Ferret, a subtask of T13511: Allow extensions to define new document fields (like "title:") in Ferret search, as Resolved.
Apr 17 2020, 12:23 PM · Search

Apr 16 2020

epriestley added a revision to T13511: Allow extensions to define new document fields (like "title:") in Ferret search: D21131: Modularize Ferret fulltext functions.
Apr 16 2020, 8:39 PM · Search
epriestley added a comment to T13511: Allow extensions to define new document fields (like "title:") in Ferret search.

These field functions have somewhat-weird scopes/context.

Apr 16 2020, 6:08 PM · Search
epriestley added a revision to T13511: Allow extensions to define new document fields (like "title:") in Ferret search: D21130: Remove Ferret function aliases and overrides.
Apr 16 2020, 5:31 PM · Search
epriestley added a revision to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations: D21128: Combine the two different ngram-splitting algorithms into a single engine.
Apr 16 2020, 4:38 PM · Search
epriestley added a revision to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations: D21127: Remove broken and unfixable "prefix" ngram behavior.
Apr 16 2020, 4:32 PM · Search
epriestley added a revision to T13511: Allow extensions to define new document fields (like "title:") in Ferret search: D21126: Remove unused "getAllFunctionFields()" from Ferret.
Apr 16 2020, 3:05 PM · Search
epriestley added a parent task for T13503: Index Paste documents in Ferret: T13511: Allow extensions to define new document fields (like "title:") in Ferret search.
Apr 16 2020, 3:05 PM · Search, Paste
epriestley added a parent task for T13501: Improve search index normalization of "é" and other characters with variants or multiple representations: T13511: Allow extensions to define new document fields (like "title:") in Ferret search.
Apr 16 2020, 3:05 PM · Search
epriestley added subtasks for T13511: Allow extensions to define new document fields (like "title:") in Ferret search: T13509: Support "field present" and "field absent" operators in Ferret, T13503: Index Paste documents in Ferret, T13501: Improve search index normalization of "é" and other characters with variants or multiple representations.
Apr 16 2020, 3:05 PM · Search
epriestley added a parent task for T13509: Support "field present" and "field absent" operators in Ferret: T13511: Allow extensions to define new document fields (like "title:") in Ferret search.
Apr 16 2020, 3:05 PM · Search
epriestley triaged T13511: Allow extensions to define new document fields (like "title:") in Ferret search as Normal priority.
Apr 16 2020, 3:04 PM · Search
epriestley added a comment to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations.

Getting through the ngram index alone isn't good enough, because LIKE operators against utf8mb4_unicode_ci treat combining accents as separate characters:

Apr 16 2020, 2:59 PM · Search
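
The example from the comment above is not preserved in this feed. As a rough, language-neutral illustration of the underlying problem (not the MySQL behavior itself), the following Python sketch shows that "é" can be stored either as one precomposed code point or as "e" plus a combining accent, and that a plain code-point substring match fails across the two forms until both sides are normalized. The helper names `to_nfc` and `naive_contains` are hypothetical and exist only for this sketch.

```python
import unicodedata

# Two representations of the same visible character "é".
COMPOSED = "\u00e9"       # single precomposed code point (NFC form)
DECOMPOSED = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT (NFD form)


def to_nfc(text: str) -> str:
    """Normalize text to the composed (NFC) form."""
    return unicodedata.normalize("NFC", text)


def naive_contains(haystack: str, needle: str) -> bool:
    """Code-point substring match, with no normalization applied."""
    return needle in haystack


document = "un caf" + DECOMPOSED   # stored with a combining accent
query = "caf" + COMPOSED           # typed with a precomposed character

print(len(COMPOSED), len(DECOMPOSED))                   # 1 2
print(naive_contains(document, query))                  # False
print(naive_contains(to_nfc(document), to_nfc(query)))  # True
```

Normalizing both the indexed text and the query to one form is what makes the match succeed in this sketch; whether and how Ferret does this is not stated in the feed.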
epriestley added a comment to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations.

Normalizer requires intl which I'm hesitant to add a dependency on.

Apr 16 2020, 2:00 PM · Search

Apr 14 2020

epriestley closed T13509: Support "field present" and "field absent" operators in Ferret as Resolved.

This appears to be working properly now.

Apr 14 2020, 6:19 PM · Search
epriestley added a comment to T13509: Support "field present" and "field absent" operators in Ferret.

Query parsing of certain unusual or ambiguous inputs has changed slightly.

Apr 14 2020, 5:31 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21112: Document the "field present" and "field absent" operators in Ferret.
Apr 14 2020, 5:31 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21111: Make the Ferret query compiler keep functions sticky across non-initial quoted tokens.
Apr 14 2020, 5:23 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21110: Implement the "present" and "absent" operators in the Ferret execution engine.
Apr 14 2020, 5:22 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21109: Tighten query compiler rules around spaces inside and after operators.
Apr 14 2020, 5:18 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21108: Make Ferret query functions sticky only if their values are not quoted.
Apr 14 2020, 5:03 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21107: Add "absent" and "present" field operators to the Ferret query compiler.
Apr 14 2020, 4:56 PM · Search
epriestley added a revision to T13509: Support "field present" and "field absent" operators in Ferret: D21106: Tighten Ferret query parsing of empty tokens and empty functions.
Apr 14 2020, 4:52 PM · Search
epriestley triaged T13509: Support "field present" and "field absent" operators in Ferret as Low priority.
Apr 14 2020, 4:48 PM · Search

Mar 20 2020

epriestley triaged T13503: Index Paste documents in Ferret as Low priority.
Mar 20 2020, 7:12 PM · Search, Paste

Mar 9 2020

epriestley added a comment to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations.

We also have two separate pieces of ngram extraction code:

Mar 9 2020, 5:43 PM · Search
epriestley added a comment to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations.

For now, I'm going to change the ngram slicing to be character-oriented. This should never be worse than the current behavior, and moves us closer to effective normalization.

Mar 9 2020, 5:38 PM · Search
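
To make the difference between byte-oriented and character-oriented slicing concrete, here is a minimal Python sketch. It is not Phabricator's PHP implementation; the function names and the choice of trigrams (`n = 3`) are assumptions made for illustration. Slicing UTF-8 bytes can cut a multibyte character in half, while slicing by code point cannot.

```python
from typing import List


def byte_ngrams(text: str, n: int = 3) -> List[bytes]:
    """Slice the UTF-8 encoding of text into overlapping n-byte windows."""
    data = text.encode("utf-8")
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def char_ngrams(text: str, n: int = 3) -> List[str]:
    """Slice text into overlapping windows of n code points."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


word = "caf\u00e9"  # "café" with a precomposed é (two bytes in UTF-8)

# Byte windows that end or start in the middle of a multibyte character.
broken = [g for g in byte_ngrams(word) if not is_valid_utf8(g)]

print(byte_ngrams(word))  # [b'caf', b'af\xc3', b'f\xc3\xa9']
print(broken)             # [b'af\xc3']
print(char_ngrams(word))  # ['caf', 'afé']
```

Character-oriented slicing still treats a combining accent as its own code point, so it addresses the slicing problem but not normalization.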
epriestley added a comment to T13501: Improve search index normalization of "é" and other characters with variants or multiple representations.

This appears to be the Unicode normalization chart:

Mar 9 2020, 5:27 PM · Search
epriestley triaged T13501: Improve search index normalization of "é" and other characters with variants or multiple representations as Low priority.
Mar 9 2020, 5:16 PM · Search

Jan 14 2020

epriestley closed T13472: Porter stemmer library uses obsolete array access syntax which raises warning under PHP 7.4 as Resolved by committing rPdb6b4ca480ad: Update deprecated array access syntax in Porter stemmer.
Jan 14 2020, 8:11 PM · Search
epriestley added a revision to T13472: Porter stemmer library uses obsolete array access syntax which raises warning under PHP 7.4: D20941: Update deprecated array access syntax in Porter stemmer.
Jan 14 2020, 8:04 PM · Search
epriestley added a comment to T13472: Porter stemmer library uses obsolete array access syntax which raises warning under PHP 7.4.

I thought this was some kind of complicated mess with the regex on line 420, but it's actually an issue with this:

Jan 14 2020, 8:03 PM · Search
epriestley added a revision to T13472: Porter stemmer library uses obsolete array access syntax which raises warning under PHP 7.4: D20940: Move search query compiler / stemmer classes out of libphutil.
Jan 14 2020, 7:48 PM · Search
epriestley added a revision to T13472: Porter stemmer library uses obsolete array access syntax which raises warning under PHP 7.4: D20939: Move search query parser/compiler classes to Phabricator.
Jan 14 2020, 7:41 PM · Search

Jan 13 2020

epriestley triaged T13472: Porter stemmer library uses obsolete array access syntax which raises warning under PHP 7.4 as Wishlist priority.
Jan 13 2020, 4:49 PM · Search

Sep 9 2019

epriestley closed T13412: Searching for the install URI with no trailing slash fatals as Resolved by committing rPaaaea5759133: Fix fatal during redirection safety check for searching for Phabricator base….
Sep 9 2019, 7:45 PM · Search
epriestley added a revision to T13412: Searching for the install URI with no trailing slash fatals: D20794: Fix fatal during redirection safety check for searching for Phabricator base-uri with no trailing slash.
Sep 9 2019, 7:30 PM · Search
epriestley triaged T13412: Searching for the install URI with no trailing slash fatals as Low priority.
Sep 9 2019, 5:10 PM · Search

Jul 18 2019

epriestley closed T13345: Ferret does not match documents with no title as Resolved by committing rPcb4add311649: In Ferret, allow documents with no title to match query terms by using LEFT….
Jul 18 2019, 5:37 PM · Search
amckinley updated the task description for T13345: Ferret does not match documents with no title.
Jul 18 2019, 5:26 PM · Search
epriestley added a revision to T13345: Ferret does not match documents with no title: D20660: In Ferret, allow documents with no title to match query terms by using LEFT JOIN on the "title" ranking field.
Jul 18 2019, 5:22 PM · Search
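
The effect described in D20660 can be sketched with a toy schema. The tables `doc_body` and `doc_title` below are hypothetical and are not Ferret's real tables; SQLite is used only because it ships with Python. With an inner join on the title field, a document that has no title row drops out of the results even though its body matches. With a LEFT JOIN it stays in the results and simply contributes nothing to the title-based ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical toy schema: one row per (document, indexed term).
    CREATE TABLE doc_body  (doc_id INTEGER, term TEXT);
    CREATE TABLE doc_title (doc_id INTEGER, term TEXT);

    -- Document 1 has a title; document 2 has no title rows at all.
    INSERT INTO doc_body  VALUES (1, 'ferret'), (2, 'ferret');
    INSERT INTO doc_title VALUES (1, 'ferret');
    """
)

# Inner join: the untitled document is silently excluded.
inner_rows = conn.execute(
    """
    SELECT b.doc_id
    FROM doc_body b
    JOIN doc_title t ON t.doc_id = b.doc_id AND t.term = 'ferret'
    WHERE b.term = 'ferret'
    ORDER BY b.doc_id
    """
).fetchall()

# LEFT JOIN: every body match is kept; title_hit only affects ordering.
left_rows = conn.execute(
    """
    SELECT b.doc_id, (t.term IS NOT NULL) AS title_hit
    FROM doc_body b
    LEFT JOIN doc_title t ON t.doc_id = b.doc_id AND t.term = 'ferret'
    WHERE b.term = 'ferret'
    ORDER BY title_hit DESC, b.doc_id
    """
).fetchall()

print(inner_rows)  # [(1,)]
print(left_rows)   # [(1, 1), (2, 0)]
```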
epriestley triaged T13345: Ferret does not match documents with no title as Low priority.
Jul 18 2019, 5:16 PM · Search

Mar 25 2019

epriestley closed T13091: Ferret "Relevance" order does not always have all the columns it needs available as Resolved.
Mar 25 2019, 6:58 PM · Search

Mar 19 2019

epriestley added a comment to T13091: Ferret "Relevance" order does not always have all the columns it needs available.

Also, what is "By Relevance"?

Mar 19 2019, 6:34 PM · Search
epriestley added a revision to T13091: Ferret "Relevance" order does not always have all the columns it needs available: D20298: When paging by Ferret "rank", page using "HAVING rank > ...", not "WHERE rank > ...".
Mar 19 2019, 6:24 PM · Search

Mar 18 2019

epriestley added a revision to T13091: Ferret "Relevance" order does not always have all the columns it needs available: D20297: Select Ferret fulltext columns in results so fulltext queries work under UNION.
Mar 18 2019, 11:07 PM · Search
epriestley added a revision to T13091: Ferret "Relevance" order does not always have all the columns it needs available: D20296: Skip Ferret fulltext columns in "ORDER BY" if there's no fulltext query.
Mar 18 2019, 10:52 PM · Search