Page MenuHomePhabricator

Don't apply offset to elasticsearch query
AbandonedPublic

Authored by 20after4 on Apr 4 2017, 5:15 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, May 26, 3:02 PM
Unknown Object (File)
Wed, May 25, 6:54 AM
Unknown Object (File)
Apr 10 2017, 7:48 AM
Unknown Object (File)
Apr 10 2017, 7:47 AM
Unknown Object (File)
Apr 10 2017, 7:47 AM
Unknown Object (File)
Apr 8 2017, 11:40 PM
Unknown Object (File)
Apr 4 2017, 9:05 PM
Unknown Object (File)
Apr 4 2017, 8:38 PM
Subscribers

Details

Summary

Currently we skip a lot of results when displaying offset pages
in global search results (at least under elasticsearch)...

This isn't an elegant solution as it causes us to always fetch
previous pages from elasticsearch whenever we display subsequent
pages of results. This does resolve the bug though. Fixes T8285

The same bug may exist in MySQL? I haven't tested that yet.

Test Plan

ran test search with large result set, viewed page 2
(offset=100) and saw that results were no longer skipped.

set offset to 99, saw the same result in position 1 as I saw at
the end of page 1 (with no offset).

Diff Detail

Repository
rP Phabricator
Branch
T8285 (branched from master)
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 16287
Build 21650: Run Core Tests
Build 21649: arc lint + arc unit

Event Timeline

track upstream branch so arcanist can push to staging.

This is probably a bug with the wrong offset/limit being passed to this layer, or the results being interpreted incorrectly? It seems like this behavior should be considered correct, and we should fix whatever else is broken.

With the HEAD behavior we return 100 results for offset = 700, limit = 100 but with the proposed behavior we return 800, which is a technical argument for fixing this somewhere above the engine in the stack.

It's also confusing to engine implementors since anyone reasonably writing a new engine would expect offset + limit to work like the HEAD behavior does. If we can't make them work like that, we should make the caller always pass 0 as the offset and (offset + limit) as the limit so the values at least make sense at this level, even if they aren't as efficient as they could be.

The problem is that we do pagination at a higher layer and the cursor skips 100 results when the offset is set to 100. So if we also apply an offset to elastic, then we skip 100 results in two places, thus skipping to the 200th result.

With the HEAD behavior we return 100 results for offset = 700, limit = 100 but with the proposed behavior we return 800, which is a technical argument for fixing this somewhere above the engine in the stack.

I totally agree, this needs to be fixed elsewhere, but it looks like a major pain due to the way the application does paging.

Just to see what would happen, I tried returning 100 dummy results + the real results. That didn't seem to satisfy the pager, so perhaps I'm wrong about what's actually happening. I spent quite a bit of time tracing through PhabricatorPolicyAwareQuery and still didn't ascertain exactly what is going on, however, it seems like it's skipping the results in PhabricatorPolicyAwareQuery::execute() right around line 211

I think D17667 will resolve this properly without changes, let me know if you're still seeing issues after picking up that change?

This revision now requires changes to proceed.Apr 12 2017, 9:59 PM

Yeah I think we are good now.