
Provide Search in Conpherence
Closed, ResolvedPublic

Description

Users should be able to search Conpherence:

  • It should be possible to search across all threads you have permission to see (e.g., "I know I remember someone talking about Redis, but don't remember which thread it was in -- let me go find that, since I think it was relevant to some task at hand").
  • It should be possible to search within a thread.

An index is in place, although it isn't proven yet. This is probably mostly blocked by figuring out how the UI works.

  • How do users get to the "search all threads" view? How are results shown?
  • How do users access search from within a thread?
    • How are results shown?
    • While results are shown, what does the rest of the UI do? Can the user still chat? What if a new message comes in?
  • How do users access search from within the chat column?
    • Can they?
    • How are results shown?
    • What does the rest of the UI do while results are shown?

The answers to these interaction questions are probably subtle and hard to implement. We should do a dirty implementation first, to avoid having to answer any of those questions:

  • Make a /conpherence/search/ page which just has a text input on it. No way to get there from the UI.
  • When you type stuff into the text input, it searches all threads.
  • Results are displayed on the page in whatever janky mess is easiest.
  • This will prove that the index works.

Then, we can:

  • Let the page scope search to one thread (thread ID in the URI or something).
  • Link to it from somewhere in the main Conpherence UI (shove it in a menu or something, no integration).
  • Work on making the results look nicer if they're hard to read or super garbagey or whatever.
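The two stages above (search everything first, then optionally scope to one thread) can be sketched roughly as follows. This is only an illustration in Python, not Phabricator code: the `Message` record, `search_messages`, and the sample data are hypothetical, matching is a plain case-insensitive substring test rather than a real fulltext index, and permission checks are left out entirely.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for one Conpherence message."""
    thread_id: int
    author: str
    text: str


def search_messages(
    messages: Iterable[Message],
    query: str,
    thread_id: Optional[int] = None,
) -> List[Message]:
    """Return messages containing `query`, optionally limited to one thread."""
    needle = query.casefold()
    hits = []
    for message in messages:
        # Stage two: scope the search to a single thread when an ID is given.
        if thread_id is not None and message.thread_id != thread_id:
            continue
        # Stage one: naive substring match standing in for the real index.
        if needle in message.text.casefold():
            hits.append(message)
    return hits


MESSAGES = [
    Message(1, "alice", "Should we cache this in Redis?"),
    Message(1, "bob", "Maybe, let me check the task."),
    Message(2, "carol", "Redis is already deployed for the queue."),
]

all_hits = search_messages(MESSAGES, "redis")
scoped_hits = search_messages(MESSAGES, "redis", thread_id=2)
```

Keeping the thread scope as an optional argument means the "all threads" page and the "one thread" page can share the same query path.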

Once the thing actually works and is useful, we can figure out how to merge it into the UI and how all the interactions with JS and events will work, since that stuff should be better defined by then.


Earlier

Some mocks; these are not specs.

conpherence_unbeta_search_v1.png (1×1 px, 445 KB)

conpherence_unbeta_searched_v1.png (1×1 px, 421 KB)

Event Timeline

chad triaged this task as Normal priority.
chad added a project: Conpherence.
chad added subscribers: chad, epriestley.

I think this would be done via the general search infrastructure? I'm not sure how permissions come into play there exactly, but since Conpherence is pretty standard I can't imagine it would be too tricky.

My understanding of the general search infrastructure is that what we'd use right now -- MySQL queries -- may suffer from a few robustness issues with respect to search terms and character sets. Ergo, for Phacility we'd likely need a specialized search tier to make this functionality less buggy.

How am I doing? :D

T3165#2 is pretty much on point, I think. We should be able to put this in the general index and do privacy filtering on the way out. We already support ElasticSearch and would probably want to use that instead of MyISAM fulltext for SaaS installs, although I don't think it's critical -- MyISAM fulltext is, like, "mostly OK" for latin text.

I think the open technical stuff here is:

  1. We might need to make individual messages the actual things that get indexed? I'm not sure if it's good enough to search for something and just get the thread back vs search for something and get individual messages back. My naive expectation is messages. This might involve a little bit of legwork (giving them handles and a Query, making them implement PolicyInterface).
  2. It might be worthwhile to create a non-default index for these and chatlogs. I think the expectation when you search with general search is that you won't get chatlog or private message hits? We can do this filtering on the way out instead (get everything and throw away chatlog/messages), but I worry a little bit that these object types may be vastly more numerous than "real" objects (e.g., we have 60k chatlog entries and only 3k tasks) and tend to overwhelm them and produce weird-performing result pages (which load 30 pages of results in order to get 50 actual visible results).

Overall I don't think this is super important. #1 above is straightforward; #2 above requires a little research/design (whatever we come up with has to translate to ElasticSearch too) but I don't think it's too complicated. We have a lot of leeway here in general because we're free to destroy all the existing indexes and just reindex them (and already have user-friendly scripts to do this).

My assumption here is we'd provide a basic per message thread search, and return only results in that thread. No hooks for a global search.

btrahan added a subscriber: btrahan.

I have no plans to work on this anytime soon. I've looked at it once or twice for a few hours and I think splitting out a separate index would be quite hard. (Perhaps it's easy with some elucidation.)

epriestley lowered the priority of this task from Normal to Wishlist. Sep 14 2014, 3:08 PM

I haven't seen user requests for this either, and have only wished we had it once or twice (and email search worked fine). Probably makes sense to hold until after T5364.

This feature is one of the reasons Slack is doing as well as it is, at least from what I've seen. The benefit specifically being when people are added to Conpherences, and don't have the benefit of history in their inbox, they can get up to speed or find conversation points as if they've been there all along.

But yeah, I haven't seen many people ask for it. I think you'd have to be a larger install that, like, hires people regularly, to have any regular benefit.

Oh, right, the policy filtering bit was the big issue. I do think that's a real concern. Specifically, the concern is:

  • You search for "task".
  • It matches 20,000 messages you don't have permission to see (messages are numerous and often restricted) and 5 tasks you do have permission to see.
  • We spend 9000 years policy checking all of them.
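To make the concern concrete, here is a toy model of "filter on the way out" in Python. Everything in it is an assumption for illustration: the `Hit` record, the `filter_visible` helper, and the worst-case ordering in which all restricted messages rank ahead of the tasks. It only counts policy checks; it does not model how Phabricator actually evaluates policies.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit; `visible` is the outcome of a policy check."""
    kind: str      # "message" or "task"
    visible: bool


def filter_visible(hits: Sequence[Hit], limit: int) -> Tuple[List[Hit], int]:
    """Return up to `limit` visible hits and how many policy checks it took."""
    visible: List[Hit] = []
    checks = 0
    for hit in hits:
        if len(visible) >= limit:
            break
        checks += 1  # each hit examined costs one policy check
        if hit.visible:
            visible.append(hit)
    return visible, checks


# Shared index, assumed worst case: 20,000 restricted messages rank first.
shared_index = [Hit("message", False)] * 20000 + [Hit("task", True)] * 5

# Dedicated message index: general search never sees message hits at all.
general_index = [hit for hit in shared_index if hit.kind != "message"]

shared_results, shared_checks = filter_visible(shared_index, limit=5)
general_results, general_checks = filter_visible(general_index, limit=5)
```

Under these assumptions the shared index spends 20,005 checks to surface the same five tasks that the separate general index finds in five, which is the argument for keeping messages out of the default index.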

I think we can just do a dedicated index for this pretty easily. I can throw up a diff since all the fulltext junk is a little magical and there aren't many examples.

I want to either merge Chatlog or remove it (T6875) so one-off'ing this seems fine.

chad updated the task description.

Peace, I'm out.

D11234 should give us a reasonable implementation. Features include:

  • Search all threads for text, or one or more specific threads.
  • Relatively cheap way to get context (previous/next messages) for rendering (seems useful?)
  • Not a hugely expensive mess that will destroy the main search index.

Limitations include:

  • No stemming until T6740.
  • Probably no CJK support (T2632).
  • Pretty one-off-ish, but not too terrible?
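As a rough illustration of the "context" feature listed above, the sketch below returns each matching message together with its neighbours. It is not taken from D11234, whose actual approach is not described here: the function names, the in-memory list of strings, and the substring matching are all hypothetical simplifications.

```python
from typing import List


def with_context(thread: List[str], index: int, before: int = 1, after: int = 1) -> List[str]:
    """Return the message at `index` plus up to `before`/`after` neighbours."""
    start = max(0, index - before)
    end = min(len(thread), index + after + 1)
    return thread[start:end]


def search_with_context(
    thread: List[str], query: str, before: int = 1, after: int = 1
) -> List[List[str]]:
    """Return one context window per message in `thread` that contains `query`."""
    needle = query.casefold()
    return [
        with_context(thread, i, before, after)
        for i, text in enumerate(thread)
        if needle in text.casefold()
    ]


THREAD = [
    "anyone around?",
    "yes, what's up",
    "the redis box is down",
    "restarting it now",
    "thanks",
]

windows = search_with_context(THREAD, "redis")
```

Because messages in a thread are ordered, the window is just a slice around the hit, so fetching context stays cheap relative to the search itself.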
chad changed the visibility from "All Users" to "Public (No Login Required)".
jason.chen moved this task from Future to v3 on the Conpherence board.

I think you ended up with a few items you were interested in, but maybe this still makes the cut for your plate?

exp10r3r moved this task from Future to v3 on the Conpherence board.
epriestley edited projects, added Conpherence (v4); removed Conpherence.

Seems like all the pieces are here except UI, @epriestley ?

At least search across all threads I can see works, so it should be all UI to add it per room. I'll take a stab at this next unless there is something other than UI blocking.

Should basically just be UI, yeah.