Ref PHI1292. Enable fulltext searches in Paste. Maybe this should only index a snippet instead of the entire content?
Also updates table names in PhabricatorPasteQuery.
Differential D20650
Add Ferret support to Paste
Authored by epriestley on Jul 15 2019, 7:42 PM.
Details
Ref PHI1292. Enable fulltext searches in Paste. Maybe this should only index a snippet instead of the entire content? Also updates table names in PhabricatorPasteQuery.
Created some pastes, indexed them, searched for them.
Diff Detail
Event Timeline
I'm a little uneasy about indexing the actual content, since I worry this will lead to a tragic event like "we learn that many installs routinely send 1GB logfiles consisting mostly of /dev/urandom output into Paste". Ferret seems to be performing better than MyISAM/InnoDB FULLTEXT did, but I'm not confident it will stand up to use cases like this (see also T7472).
Even if we try to adopt Ferret for this "codebase search" use case in the future, I'd probably like to separate the index and/or apply special rules around discarding documents that are too large or, uh, "too random" or whatever, and have this failure be non-silent ("Some documents matching all query parameters other than your fulltext query terms are really dumb and are not indexed by the fulltext engine, so they might or might not match your query. Click here for a list." or whatever, I guess, although this isn't great).
An especially cheap way to deal with this for now would be to just not index the content. This might be a bit confusing/surprising, but even a title-only search is better than nothing, so I think it's still a step forward.
Let me look at the null title thing, since that seems like it's probably a bug somewhere, and indexing the English-language untitled document string is probably not ideal.
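As a rough illustration of the "discard documents that are too large or too random, but don't fail silently" idea above, here is a minimal Python sketch. It is not how Ferret or Phabricator work: the thresholds `MAX_BYTES` and `MAX_ENTROPY_BITS`, the function names, and the byte-entropy heuristic for "too random" are all assumptions made up for this example. Skipped documents fall back to title-only indexing and are recorded with a reason, so a UI could list them.

```python
import math
from collections import Counter

# Hypothetical thresholds, chosen only for illustration.
MAX_BYTES = 1_000_000      # skip content larger than ~1MB
MAX_ENTROPY_BITS = 7.5     # uniformly random bytes approach 8.0 bits/byte


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 if empty)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def skip_reason(content: bytes):
    """Return why content should not be indexed, or None if it is fine."""
    if len(content) > MAX_BYTES:
        return "too large"
    if shannon_entropy(content) > MAX_ENTROPY_BITS:
        return "too random"
    return None


def partition_for_indexing(docs):
    """Split docs into text to index and a record of skipped content.

    docs maps a document ID to a (title, content_bytes) pair. Documents
    whose content is skipped are still indexed by title alone, and the
    reason is recorded so the omission can be shown to the user.
    """
    to_index = {}
    skipped = {}
    for doc_id, (title, content) in docs.items():
        reason = skip_reason(content)
        if reason is None:
            to_index[doc_id] = title + "\n" + content.decode("utf-8", "replace")
        else:
            to_index[doc_id] = title  # title-only fallback
            skipped[doc_id] = reason
    return to_index, skipped
```

One limitation of this heuristic: the entropy estimate is bounded by the input length, so very short random content never crosses the threshold, which is probably acceptable since short documents are cheap to index anyway.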
(Also, "Pastebin" is the name of a product/company and our database name really shouldn't be pastebin, it just is since it was a contributed patch a million years ago and database names are a pain to change. This is basically like having a database named phabricator_yelp or whatever, though.)
Let me also see exactly how much of a pain this is; I think we did it once and this patch definitely makes it harder.
Okay I know this review is like a year old, but... in my defense about Paste/Pastebin, Wikipedia agrees with me that pastebin is a generic term. And it's in Wiktionary which, of course, makes it true because Wiktionary is never wrong, ever. 😆
This didn't need much:
Tested by: