
Suggest/propose possible duplicates when creating a new task
Open, Low, Public

Tokens
"Like" token, awarded by Lorenzo. "Like" token, awarded by mattflaschen. "Mountain of Wealth" token, awarded by nemobis. "Orange Medal" token, awarded by Gryllida. "Like" token, awarded by PhoneixS. "The World Burns" token, awarded by rdpascua. "Like" token, awarded by nicereddy. "Dislike" token, awarded by rugabarbo. "Like" token, awarded by qgil.
Assigned To
None
Authored By
aklapper, Apr 18 2014

Description

When you create a new bug report in Bugzilla, you get a list of possible duplicates based on the terms that you've entered as the *title*, so we get fewer duplicates as people look through the proposals.
It might be even more useful for new users filing what is probably going to be an obvious duplicate.

Wikimedia tasks:

Event Timeline

aklapper created this task. Apr 18 2014, 9:45 AM
aklapper raised the priority of this task to Needs Triage.
aklapper updated the task description.
aklapper added a project: Maniphest.
aklapper added a subscriber: aklapper.

There's some discussion of this in T397. In short:

  • We had a feature like this at Facebook.
  • But, it seemed to be nearly useless.
  • It has seemed useless to me in every other system where I've filed a bug and been offered duplicates, too.

For example, T397 is a duplicate here, but uses different words for everything -- "consider" instead of "suggest/propose", "dedupe" instead of "duplicates", "bugs" instead of "task": basic title-based search would not have found it. My experience was that this was the norm.
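To make the vocabulary problem concrete, here is a minimal sketch of naive title matching by word overlap (Jaccard similarity). The first title is this task's; the second is a hypothetical title built only from the words quoted above, not T397's actual title.

```python
import re


def title_tokens(title: str) -> set:
    """Lowercase a title and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words that two titles have in common."""
    tokens_a, tokens_b = title_tokens(a), title_tokens(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


this_task = "Suggest/propose possible duplicates when creating a new task"
# Hypothetical title: it only reuses the words "consider", "dedupe" and "bugs".
hypothetical_dupe = "Consider ways to dedupe bugs"

print(jaccard(this_task, hypothetical_dupe))
```

With no shared words, the overlap is zero, so a purely lexical title search has nothing to rank on.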

Do you have any data about how effective this feature really is for your install?

I'm not totally opposed to pursuing this, but I worry that it doesn't actually work, or works so poorly that it's worse on the balance. I'd like to see data (or, at least, hear experiences) showing that it works well.

Particularly, the cost of a false positive (a user incorrectly believes they have found a duplicate) is higher than the cost of a false negative (a user incorrectly believes their bug is unique) since mechanically merging issues is easy but mechanically separating them is difficult. Generally, our strategy for dealing with this is currently "make merging cheap and easy".

qgil added a subscriber: qgil. Apr 22 2014, 6:38 AM

I can say that in Wikimedia's Bugzilla instance, with more than 60k reports, there have been several occasions on which I didn't create a bug because, among the 5 possible duplicates offered by Bugzilla, one was the one I was about to file. Sorry, I don't have more data than this, but I believe it is a relevant feature when you are a plain user filing a bug that you believe you have found before anybody else (and not a core developer who has gone through the majority of reports filed).

In fact, the feature is good enough for me not to search in advance anymore. I just start creating the report, trusting that Bugzilla will find the potential dupes. Sometimes this didn't work, and I created a duplicate report, but in my mind the algorithm was noticeably more effective than ineffective at catching dupes.

Note also that here you have a mixture of tasks and bugs. While tasks tend to have more creative and free-form writing, bugs usually refer to specific strings in the UI, and a relatively limited vocabulary of problems. I don't know the heuristics applied by Bugzilla, but perhaps it is worth looking at their code.

The usefulness of proposed duplicates really depends on the search algorithm used in the backend, and I don't know enough about either Bugzilla's or Phabricator's to judge whether Bugzilla's duplicate search is good or bad, as I lack comparisons.
Unfortunately, it's hard to gather data here that isn't just anecdotes.

Slightly off-topic: In my perfect world, which is 10 years away, I'd expect duplicate proposals to prefer existing tickets filed under the same project over tickets in other projects, to stem entered words (duplicat*), to cover spelling differences between English variants ("colour" vs "color"), and potentially to use an English-language thesaurus ("delete" and "remove"). This touches on Natural Language Processing (NLP), and while there are tons of research papers on this topic, I have yet to see a real implementation in any bug tracker. Maybe I'm just not aware of an implementation example.
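A rough sketch of how those four ideas could combine, purely as an illustration: the suffix list, the variant table, the sample tasks and the 0.5 same-project bonus are all made-up assumptions, and a real system would use proper linguistic resources rather than hand-written tables.

```python
import re

# Hypothetical, hand-written tables for illustration only.
SUFFIXES = ("ions", "ion", "ing", "es", "ed", "e", "s")
# Keys are *stems*; maps spelling variants and synonyms to one canonical stem.
VARIANTS = {"colour": "color", "remov": "delet"}


def normalize(word):
    """Crude stemming (strip one suffix), then map variants/synonyms."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    return VARIANTS.get(word, word)


def tokens(title):
    """Set of normalized word stems in a title."""
    return {normalize(w) for w in re.findall(r"[a-z]+", title.lower())}


def score(draft_title, draft_project, task):
    """Count shared stems; add a small bonus when the project matches."""
    overlap = len(tokens(draft_title) & tokens(task["title"]))
    bonus = 0.5 if overlap and task["project"] == draft_project else 0.0
    return overlap + bonus


def suggest(draft_title, draft_project, tasks, limit=5):
    """Return ids of the best-matching tasks, best first."""
    scored = [(score(draft_title, draft_project, t), t["id"]) for t in tasks]
    scored = [(s, task_id) for s, task_id in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [task_id for _, task_id in scored[:limit]]


tasks = [
    {"id": "T1", "title": "Remove colour picker from toolbar", "project": "Editor"},
    {"id": "T2", "title": "Deleting the color picker", "project": "Mobile"},
    {"id": "T3", "title": "Login page times out", "project": "Editor"},
]
print(suggest("Delete color pickers", "Editor", tasks))
```

Here both picker tasks match all three stems of the draft title, and the same-project bonus is what ranks T1 ahead of T2.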

qgil moved this task to Important on the Wikimedia board. Apr 30 2014, 8:08 PM
qgil added a subscriber: kouiskas. Apr 30 2014, 8:14 PM

Pasting here related feedback from @kouiskas, a former Phabricator user at DeviantArt, now working at the Wikimedia Foundation with Bugzilla:

(What Phabricator can't do?)

Suggesting tickets when people report an issue. This is actually one of the rare things that WMF's Bugzilla instance does better. Phabricator doesn't offer similar tickets when you type a new one, which I guess can result in more dupes.

chad triaged this task as Low priority. Jun 9 2014, 3:52 PM

I find the "possible duplicates" search very handy when using Wikimedia's bugzilla. My experience and usual workflow matches what @qgil describes.

For a "quick partial fix" solution, I'd tentatively suggest adding a link next to the Title field, which would open a new tab containing a search for the currently entered string. That would prevent me having to type the same string twice.

robertkraig added a subscriber: robertkraig. (Edited) Nov 22 2014, 10:52 PM

I second this. At my company we are running into lots of dupes. We used to use Trac, and our CEO uses crap like this to tell us that phab is not as good as Trac. I don't like having to argue this point. I think an autocomplete would be a great thing to have. But instead of showing results below the field like a search engine, show a container below the field with a list similar to the one that pops up in the "Edit Blocking Tasks" modal. This would probably add additional performance requirements on the server. But make it optional, and tell the administrator that they should consider setting up an Elasticsearch server on the machine to use this feature.

tgr added a subscriber: tgr. Nov 25 2014, 2:03 AM

For example, T397 is a duplicate here, but uses different words for everything -- "consider" instead of "suggest/propose", "dedupe" instead of "duplicates", "bugs" instead of "task": basic title-based search would not have found it. My experience was that this was the norm.

This depends on the domain. Tasks (especially feature-oriented ones) tend to be vaguely worded. Technically oriented tasks ("make RFC 1234 compliant", "add CORS support") often have a very specific keyword, and bugs often quote an error message, so search works much better with these.

Do you have any data about how effective this feature really is for your install?

Not sure about bugzilla (personally, I found it useful for very specific terms but frustrating for more common ones, as it does not rank up results from the same component) but the similar feature for StackOverflow is extremely helpful and well-liked by its users.

Also, the value of listing similar tickets is not limited to avoiding duplicates; it can be useful for new contributors who are learning the ropes, and want to see some examples for what they are about to do (e.g. what kind of information to report for a bug, whom to CC on it, which tracking bug to apply).

qgil added a comment. Nov 25 2014, 7:30 AM

data about how effective this feature really is for your install?

No scientific data, but for what it's worth, from all the feedback we got during the months of the pre-launch phase, we ended up listing the lack of this feature in the top position of "Missing features" in our announcement and documentation.

I'm not totally opposed to pursuing this, but I worry that it doesn't actually work, or works so poorly that it's worse on the balance. I'd like to see data (or, at least, hear experiences) showing that it works well.

I think what happens is that such a feature is more useful for new / casual users (or advanced users when submitting tasks for a project they are unfamiliar with) combined with popular problems / requests. See what happens:

Task A exists. These are the options when a duplicate is being submitted:

  1. Draft Task B is suggested as a potential duplicate. The user stops the submission and goes to the existing task.
  2. Draft Task B is not suggested as duplicate. Task B is submitted indeed, and indexed. The maintainers might find it or not, might mark it as a duplicate or not.
    1. Duplicate Task C is being drafted...

You see how the filter builds itself stronger in two areas:

  1. Specific tasks where unavoidable terms are combined, i.e. an error message, an API method, a UI string...
  2. Popular tasks that generate several reports, which end up covering a wider vocabulary after some undetected cases.

Particularly, the cost of a false positive (a user incorrectly believes they have found a duplicate) is higher than the cost of a false negative (a user incorrectly believes their bug is unique) since mechanically merging issues is easy but mechanically separating them is difficult. Generally, our strategy for dealing with this is currently "make merging cheap and easy".

This feature is not about merging existing tasks; it is about preventing duplicates from being submitted at all. When you have many occasional reporters who will not search for / will not find previously submitted tasks, the difference in cost is big.

I wonder whether this could be a GSoC / OPW - Facebook Open Academy type of project that we could help mentoring. Whatever algorithm Bugzilla is using, it is in its source code. There are probably other alternatives and maybe some open research available as well. The integration point is clear: user entering data during Maniphest task creation. A good scenario for an extension?

fabe added a subscriber: fabe. Nov 25 2014, 1:14 PM

We have the same problem with more casual submitters.
I could think of a fairly easy but powerful implementation of this when Elasticsearch is used as a backend.
The custom scoring and basic language processing stuff is really nice and would allow for things like listing bugs within the same project before others quite easily.
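As an illustration of that same-project preference, a query body along these lines could be sent to Elasticsearch. This is only a sketch: the field names `title` and `project`, the boost value and the helper name `build_duplicate_query` are assumptions, not Phabricator's actual index mapping or code.

```python
import json


def build_duplicate_query(draft_title, project, size=5):
    """Build a bool query body: the title must match, and hits from the same
    project score higher. Field names here are hypothetical."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match on the analyzed title field.
                "must": [{"match": {"title": draft_title}}],
                # Optional clause: only raises the score of same-project hits.
                "should": [
                    {"term": {"project": {"value": project, "boost": 2.0}}}
                ],
            }
        },
    }


body = build_duplicate_query("Suggest possible duplicates", "Maniphest")
print(json.dumps(body, indent=2))
```

Because the project clause sits under `should` rather than `must`, tasks from other projects are still returned, just ranked lower.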

However, Elasticsearch as a search backend for Phabricator is optional, and implementing this stuff only with MySQL would be much harder.
I'm not sure about Phabricator's policy on enabling features like this only with an Elasticsearch backend, but I'll hack something together, probably on the weekend, and will try to get some data by running the "merged / closed as duplicate" tasks from the Phabricator project through it to determine how many a search would have found and listed in the top 5 / 10.

@fabe, https://wikimedia.phabricator.org runs the Elasticsearch backend. If you develop this as an extension, we might be able to help at least with testing and feedback. fwiw we have a playground at https://phab-01.wmflabs.org .

fabe added a comment. Nov 28 2014, 1:58 PM

The attached revision gets all tasks marked as duplicates in the Maniphest DB and then tries to find the task each one was merged into in Elasticsearch, using just the title.
By default it will use the standard analyzed field in Elasticsearch as it is right now.
If you run installmapping and reindex the tasks, the results should get better because it will utilize English stopwords and stemming.
Let's see what this yields. If it still finds too few results, we can try to combine an ngrams / edge_ngrams analyzer into it.
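The measurement itself can be sketched independently of the search backend. The following is a minimal illustration, not the attached revision: `toy_search` is a stand-in word-overlap search over a made-up corpus, where a real run would query Elasticsearch instead, and the task ids and titles are invented.

```python
def hit_rate_at_k(merged_pairs, search, k):
    """Fraction of duplicates whose merge target shows up in the top-k
    results when searching with the duplicate's title alone."""
    if not merged_pairs:
        return 0.0
    hits = sum(
        1 for dup_title, target_id in merged_pairs
        if target_id in search(dup_title)[:k]
    )
    return hits / len(merged_pairs)


# Made-up corpus standing in for the indexed tasks.
CORPUS = {
    "T10": "Search results page is slow",
    "T11": "Cannot upload large files",
    "T12": "Typo on login page",
}


def toy_search(title):
    """Rank corpus ids by number of shared lowercase words, best first."""
    words = set(title.lower().split())
    scored = [
        (len(words & set(text.lower().split())), task_id)
        for task_id, text in CORPUS.items()
    ]
    scored = [(s, task_id) for s, task_id in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [task_id for _, task_id in scored]


# (duplicate title, id of the task it was merged into)
pairs = [
    ("Slow search results", "T10"),
    ("Upload fails for big files", "T11"),
    ("Sign-in screen misspelling", "T12"),
]
print(hit_rate_at_k(pairs, toy_search, k=5))
```

The third pair shares no words with its target, so it counts as a miss; cases like that are what stemming or n-gram analyzers would try to recover.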

qgil added a comment. Apr 11 2015, 7:18 PM

Would you be able to give a rough estimate for paid prioritization of this task? To simplify things, the implementation could basically clone the algorithm used by Bugzilla, since our users were quite happy with it.

I'm not totally opposed to pursuing this, but this is mostly a product question since I don't immediately see a great way to surface it in the UI. We can just dump it at the bottom of the "Create Task" screen under the preview, but I worry it will be hard to spot and use if you have to scroll all the way down the page. @chad, do you have thoughts on this in general?

We could make it modal after you create a task (e.g., require you to confirm that it's not a duplicate of similar tasks), which I think is what PHP's bug system (or PEAR's?) does, but I think this would be far worse on the balance for almost all installs (and doesn't seem to be a request in this case).

Does anyone have a screenshot of how Bugzilla handles it? I wasn't quickly able to find the UI screen by poking around a couple of random installs.

This is probably on the order of 2-3 hours of work if we can find a product treatment that we're happy with, it's just not clear to me that this is something which really makes sense in the upstream.

qgil added a comment. Apr 11 2015, 7:46 PM

I recommend that you try this feature in Bugzilla. As far as I remember, the whole suggestion of potential duplicates is based on the title alone. The point is to save you from writing a report that already exists; therefore, making the check after the user has done the work would be... less good. (Probably better in terms of actual detection of duplicates, but probably resulting in less happy users thinking that you could have told them before they wrote the task.)

Can't comment on how Wikimedia handles that, but I have experienced a couple of different approaches that may or may not work in most cases:

  1. Force the user to first actually SEARCH for an existing issue and confirm that the results shown match / do not match the user's query. This is actually common when dealing with hosting companies ;)
  2. Suggest first browsing recently filed tickets (maybe somebody has already filed a task about the problem you're looking for)
  3. Suggest existing tasks based on keywords actively entered in the task title/description

The 3rd one was done with the screen divided into two columns. In the left one you'd fill in your description of the issue, and in the right one the system suggested tickets. That one was actually very convenient; unfortunately, I do not have access to that system now to make a screenshot.

qgil added a comment. Apr 11 2015, 7:53 PM

Yes, we give these instructions to our users. However, about 1 and 2, there is only so much you can ask users to search in an instance approaching 100k tasks, with an OK-ish search engine like Phabricator's. Even our expert users with a huge institutional memory miss a duplicate from time to time.

As a Bugzilla user well aware of good practices, I used this suggestion mechanism to actually perform quick "searches", playing with a couple of potential titles / keyword variations to see what kinds of suggestions were thrown. It was quite good at finding clear dupes of bugs that someone had reported already.

About the UI, if I remember correctly, the suggestions would just appear underneath the title in some ajax-ish box. Just like the rest of Bugzilla, not the most beautiful solution, but good enough to get the job done.

chad added a comment. Apr 11 2015, 7:59 PM

I think this makes sense in Nuance, not Maniphest, from an upstream perspective.

Design-wise, I believe that suggestions should be placed very near and in the same screen "estate" (hard to find the right word right now). Basically: suggestions based on the title should appear near the title (possibly while entering the title). Suggestions based on keywords found in the task text should be visible near the area where you enter text. This would be hard for small screens, but the 2-column layout I described from my experience was really nice and saved me from filing a couple of tickets.

An even more general approximation for the design: suggestions should be seen while entering text, near the place where the text is entered :)

Does anyone have a screenshot of how Bugzilla handles it? I wasn't quickly able to find the UI screen by poking around a couple of random installs.

Here's the sequence for filing a Bugzilla bug (on their 4.0 test instance, https://landfill.bugzilla.org/):

  • File Screen
  • Start typing
  • It automatically starts searching
  • It provides a list of possible duplicates

qgil added a comment. Apr 11 2015, 8:08 PM
In T4828#106461, @chad wrote:

I think this makes sense in Nuance, not Maniphest, from an upstream perspective.

I can see the usefulness of this feature in Nuance, but... why would it be wrong in Maniphest? You could just implement it for both, having it disabled by default in Maniphest, if you think this reflects better the stock product.

@aklapper knows better the cadence of new duplicates we get in our Maniphest instance. In any case, our real problem is with Maniphest, and therefore our money would go to fix that problem.

chad added a comment. Apr 11 2015, 8:10 PM

I could see it in Maniphest in the meantime. I think from our side, it's not a feature most installs need, which then leaves us in this weird state of maybe it's a config item or only shows if you have more than X tasks? I think it is useful, overall.

chad added a comment. Apr 11 2015, 8:15 PM

@epriestley maybe this is just a CustomField then? People have to enable?

If we did T7805 first and got a generic ApplicationSearch endpoint out of it, I'd be open to writing this as an extension CustomField and then disavowing all knowledge of it. The results UI wouldn't be custom, but maybe that's fine. We might need to pay down some infrastructure debt to let installs put this immediately underneath the "Title" field, I think a couple of the fields are still hard-coded.

eadler added a subscriber: eadler. Apr 30 2015, 3:49 AM
Gryllida added a subscriber: Gryllida. (Edited) May 5 2015, 5:03 AM

cost of a false positive is higher than the cost of a false negative

Make the 'find dupes' button optional, but please add it. The search box is harder to find than the new task form -- I often knowingly file 80%-likely duplicates despite feeling (yes, it bites, but the lack of an easy-to-use search box bites more) that I'm wasting the time of whoever triages the bugs.

srijan added a subscriber: srijan. Jul 19 2015, 12:07 PM
MZMcBride updated the task description. Aug 14 2016, 11:58 AM
aklapper moved this task from Important to Details on the Wikimedia board. Oct 5 2016, 1:37 PM
tgr added a comment. Feb 15 2017, 9:27 AM

Wikimedia ran a vote to find which problems are the most annoying to developers (not limited to problems with Phabricator) and this one was the most voted by far, with over one third of the participants voting on it.