
try to find duplicates using an analyzed elasticsearch field
Abandoned, Public

Authored by fabe on Nov 28 2014, 1:03 PM.
Tags
None

Details

Summary

Ref T6656. Use Elasticsearch to find duplicate tasks.

Test Plan

Run ./bin/search find_duplicates to get a baseline result without any index changes.
Then run ./bin/search find_duplicates --installmapping to change the mapping.
After that, reindex all tasks (./bin/search index --type TASK), run
./bin/search find_duplicates --analyzed, and compare the output with the baseline result.
The analyzer language is hardcoded to English for now.
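
The diff itself is not reproduced on this page, so the following is only a rough sketch of the idea behind the test plan, not the code from this revision. It builds two request bodies: a mapping fragment that adds a sub-field analyzed with Elasticsearch's built-in english analyzer, and a match query that targets either the plain field or the analyzed sub-field. The field name title, the sub-field name analyzed, and both function names are assumptions made for illustration. The field type is a parameter because it is string in Elasticsearch 1.x and text in 5.x and later.

```python
import json


def analyzed_mapping(field="title", text_type="text", language="english"):
    """Return a mapping fragment with a language-analyzed sub-field.

    All names here are hypothetical; `language` must be the name of an
    analyzer known to the cluster (e.g. the built-in "english").
    """
    return {
        "properties": {
            field: {
                "type": text_type,
                # Multi-field: the same source text indexed a second time,
                # this time stemmed and stop-word filtered.
                "fields": {
                    "analyzed": {"type": text_type, "analyzer": language},
                },
            }
        }
    }


def duplicate_query(title, field="title", analyzed=False, size=10):
    """Return a search body that looks for documents with a similar title."""
    target = field + ".analyzed" if analyzed else field
    return {"size": size, "query": {"match": {target: title}}}


if __name__ == "__main__":
    # Print the bodies so they can be inspected or sent with any HTTP client.
    print(json.dumps(analyzed_mapping(), indent=2))
    print(json.dumps(duplicate_query("Login page crashes", analyzed=True), indent=2))
```

Comparing the hits of duplicate_query(..., analyzed=False) with those of duplicate_query(..., analyzed=True) for the same title mirrors the before/after comparison in the test plan. As the plan notes, existing documents have to be reindexed before the analyzed sub-field returns anything.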

Event Timeline

fabe retitled this revision from to try to find duplicates using an analyzed elasticsearch field.
fabe updated this object.
fabe edited the test plan for this revision.

This diff is not really meant to be merged. (And I'm not sure whether I can even create a diff on top of another diff that has not landed yet.)
But you're right that if we want the numbers to be correct, you should apply the patch from D10955, then this one on top,
and then only run: ./bin/search find_duplicates
I'll just remove the mapping stuff from this diff.

The Elasticsearch mapping is part of HEAD now, so the rest of this diff is largely obsolete.