
Make Ferret indexing more robust (UTF8, exception handling)
Closed, Public

Authored by epriestley on Aug 28 2017, 10:41 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T12819. Two minor improvements from live data:

  • Tokenize in a UTF8-aware way.
  • When one document fails to index, kill the transaction explicitly (rather than leaving it hanging) so we don't cause other failures later.
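Both changes can be illustrated with a minimal Python sketch. This is not the Phabricator PHP code from this revision: the `document` and `ngram` tables, the `ngrams` and `index_documents` helpers, and the use of SQLite are all assumptions made for illustration. The sketch assumes an n-gram style index, splits text by Unicode code point rather than by byte, and rolls back explicitly when one document fails so the next document starts from a clean connection.

```python
import sqlite3


def ngrams(text, n=3):
    """Split text into n-grams of code points (not bytes).

    Python's str is indexed by code point, so a multi-byte UTF-8
    character is never cut in half. Note this is code-point aware,
    not grapheme-cluster aware.
    """
    chars = list(text.lower())
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def create_schema(conn):
    # Hypothetical tables, for illustration only.
    conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE ngram (doc_id INTEGER, gram TEXT, "
        "PRIMARY KEY (doc_id, gram))"
    )


def index_documents(conn, documents):
    """Index each (doc_id, body) pair in its own transaction.

    If a document fails, the open transaction is rolled back
    explicitly instead of being left hanging, so later documents
    are not affected. Returns the ids that failed.
    """
    failed = []
    for doc_id, body in documents:
        try:
            conn.execute("BEGIN")
            conn.execute("INSERT INTO document (id) VALUES (?)", (doc_id,))
            # Raises if body is not a string (simulated bad document).
            for gram in set(ngrams(body)):
                conn.execute(
                    "INSERT INTO ngram (doc_id, gram) VALUES (?, ?)",
                    (doc_id, gram),
                )
            conn.execute("COMMIT")
        except Exception:
            # Kill the transaction explicitly rather than leaving it open.
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            failed.append(doc_id)
    return failed
```

Opening the connection with `sqlite3.connect(":memory:", isolation_level=None)` puts it in autocommit mode, so the explicit `BEGIN`, `COMMIT`, and `ROLLBACK` statements above are the only transaction control in play. A byte-wise tokenizer would instead slice the encoded string and could emit fragments that are not valid UTF-8.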

Test Plan

Created some UTF8 documents locally, indexed them, got clean results.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This revision is now accepted and ready to land.
Aug 28 2017, 10:49 PM
This revision was automatically updated to reflect the committed changes.