Improve search stemmer performance for large inputs
ClosedPublic

Authored by epriestley on Sep 26 2017, 2:18 AM.
Tags
None
Referenced Files
F14910921: D18648.id44767.diff
Tue, Feb 11, 3:10 PM
F14910391: D18648.diff
Tue, Feb 11, 10:35 AM
Subscribers
None

Details

Summary

Ref T12974. See PHI87. As in D18647, we can improve the performance of some UTF8 operations here.

Instead of calling phutil_utf8_strtolower() on each token separately, call it once on the entire input up front. The resulting tokens are the same, but the lowercasing pass runs once rather than once per token.
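
To illustrate the shape of the change, here is a minimal sketch in Python rather than libphutil's PHP. It is not the code from this diff: `tokenize`, `normalize_per_token`, and `normalize_up_front` are hypothetical names, the whitespace tokenizer is an assumption, and Python's `str.lower()` stands in for phutil_utf8_strtolower().

```python
import re


def tokenize(text):
    # Hypothetical tokenizer (assumption): split on runs of whitespace
    # and drop any empty strings left at the edges.
    return [token for token in re.split(r"\s+", text) if token]


def normalize_per_token(text):
    # "Before" shape: tokenize first, then lowercase each token separately,
    # which costs one lowercasing call per token.
    return [token.lower() for token in tokenize(text)]


def normalize_up_front(text):
    # "After" shape: lowercase the entire input once, then tokenize.
    return tokenize(text.lower())


sample = "Improve Search STEMMER Performance for Large Inputs"

# Both orderings produce the same tokens for this sample.
assert normalize_per_token(sample) == normalize_up_front(sample)
```

The two orderings agree here because lowercasing does not alter the whitespace the tokenizer splits on. How much time the up-front call saves depends on the per-call overhead of the lowercasing function, which this sketch does not measure.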

Test Plan

Diff Detail

Repository
rPHU libphutil
Branch
stemmer
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 18545
Build 24984: Run Core Tests
Build 24983: arc lint + arc unit

Event Timeline

This revision is now accepted and ready to land. Sep 27 2017, 5:15 PM
This revision was automatically updated to reflect the committed changes.