Ref T6656: Use Elasticsearch to find duplicates
Details
- Reviewers: None
- Group Reviewers: Blessed Reviewers
- Maniphest Tasks: T4828: Suggest/propose possible duplicates when creating a new task
First run ./bin/search find_duplicates to try it without any index changes.
Then run ./bin/search find_duplicates --installmapping to change the mapping.
After that, all tasks need to be reindexed (./bin/search index --type TASK). Finally, run
./bin/search find_duplicates --analyzed and compare the output with the initial result.
The analysis language is hardcoded to English for now.
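The steps above can be sketched as a small script. This is only an illustration: the find_duplicates subcommand and its --installmapping and --analyzed flags exist only with this diff applied, and the names run, compare_duplicates, and DRY_RUN are hypothetical helpers introduced here. By default the script only prints the commands in order.

```shell
#!/bin/sh
# Sketch of the comparison workflow from the summary above.
# Assumption: this diff is applied, so ./bin/search knows find_duplicates.
# DRY_RUN, run, and compare_duplicates are hypothetical names for this sketch.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

compare_duplicates() {
  # 1. Baseline: query without any index changes.
  run ./bin/search find_duplicates
  # 2. Install the changed mapping.
  run ./bin/search find_duplicates --installmapping
  # 3. Reindex all tasks so the new mapping takes effect.
  run ./bin/search index --type TASK
  # 4. Query again using the analyzed field; compare with step 1.
  run ./bin/search find_duplicates --analyzed
}
```

Calling compare_duplicates after sourcing the script lists the four commands. With DRY_RUN=0 it executes them, and the output of the first and last steps can then be compared by hand.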
Diff Detail
- Repository: rP Phabricator
- Branch: deduplicate
- Lint: Lint Warnings
- Unit: Tests Passed
- Build Status: Buildable 3147, Build 3153: [Placeholder Plan] Wait for 30 Seconds
Event Timeline
This should probably be rebased on top of the work in D10955: Properly create Elasticsearch index.
This diff is not really meant to be merged. (And I'm not sure whether I can even create a diff on top of another diff that has not landed yet.)
But you're right that if we want the numbers to be correct, you should apply the patch from D10955, then this one on top,
and then only run: ./bin/search find_duplicates
I'll just remove the mapping stuff from this diff.