Page MenuHomePhabricator

Remove history query from DiffusionRepositoryController as it is unused
ClosedPublic

Authored by cspeckmim on Sep 4 2021, 5:55 AM.
Tags
None
Referenced Files
F14077692: D21717.id51744.diff
Fri, Nov 22, 12:55 AM
F14076983: D21717.diff
Thu, Nov 21, 8:50 PM
Unknown Object (File)
Wed, Nov 20, 6:20 AM
Unknown Object (File)
Wed, Nov 20, 6:20 AM
Unknown Object (File)
Wed, Nov 20, 6:18 AM
Unknown Object (File)
Wed, Nov 20, 6:18 AM
Unknown Object (File)
Wed, Nov 20, 6:12 AM
Unknown Object (File)
Wed, Nov 20, 5:53 AM
Subscribers

Details

Summary

The history query for the repository page isn't actually used to display any content. It looks like it was previously used to display the last user which modified a file however this looks to be removed in D21404. This removes the history query from happening as well as updates DiffusionBrowseTableView to remove the parameters for passing this information in, resulting in also updating DiffusionBrowseController to no longer need to put this information together.

Refs T13666

Test Plan
  1. I removed commits from a repository on the local state.
  2. I navigated to the repository's landing page and saw that the landing page attempted to render content and only failed to load the browse files section.
  3. I navigated to the history tab and verified that it showed an exception about failing to query commit information.
  4. I restored the repository working state to function properly.
  5. I navigated to a repository's landing page and verified it loaded properly, including showing the last modified date for each file.
  6. I navigated to the Code, Branches, Tags, and History tabs to verify each tab page loaded properly.
  7. I verified on the Code tab that the last modified date for each file displayed properly.

Diff Detail

Repository
rP Phabricator
Branch
nohistory
Lint
Lint Passed
Unit
Tests Skipped
Build Status
Buildable 25539
Build 35315: Run Core Tests
Build 35314: arc lint + arc unit

Event Timeline

This construction:

foreach (new FutureIterator($futures) as $future) {
  // ...
}

...resolves futures in parallel. These constructions, or any other similar construction, resolve the futures in series:

foreach ($futures as $future) {
  $future->resolve();
}
$future_1->resolve();
$future_2->resolve();
// ... etc ...

So this change potentially imposes a significant performance penalty on the page, theoretically making it up to twice as slow.

When implemented correctly, a Future should not throw until it is resolved (at least, for any kind of "result of trying to resolve the future" error), so the Iterator loop should never throw. Some older Futures were less careful about this, but I believe major futures were pulled in line after T11968. From cursory inspection, I think the behavior of ConduitClientFuture is correct, and I can't immediately get the iterator loop to throw an exception (see T13666#256232 for my test case and observed behavior).

Upshot:

  • I can't figure out how to make the iterator loop throw an exception, and the intent is that "removing it will allow the existing error handling to kick in if a future fails to resolve to a valid value" is untrue.
    • If it can throw an exception, that's a bug and I think we should fix it instead of removing the loop.
    • If it can't throw an exception, removing it just makes the page slower (perhaps significantly).

Everything else here looks great (removing the unused future and handles, etc), but since I can't figure out a reproduction case that lines up with the description in T13666 I'm not sure if the underlying issue is me not reading properly, conflation, some other bug, etc. Can you, e.g., give me a patch which makes the iterator loop throw or maybe show me a screenshot of the failure state?

(A possible issue is that my local install is "clustered", while your test install may not be, but I don't think this should impact behavior unless I'm forgetting about something.)

I'm also happy to accept a version of this change that does everything except remove the loop if you want to simplify concerns a bit and deal with whatever's going on there separately.

This revision now requires changes to proceed.Sep 4 2021, 7:16 PM

When implemented correctly, a Future should not throw until it is resolved (at least, for any kind of "result of trying to resolve the future" error), so the Iterator loop should never throw. Some older Futures were less careful about this, but I believe major futures were pulled in line after T11968. From cursory inspection, I think the behavior of ConduitClientFuture is correct, and I can't immediately get the iterator loop to throw an exception (see T13666#256232 for my test case and observed behavior).

Does that mean the ConduitClientFuture doesn't begin execution until resolve() is called or some other internal mechanism triggered by iterating a FutureIterator? I will definitely add this back in.

I think I jumped to conclusion that the exception was coming from the iterator loop as it looked like resolving of the futures elsewhere all happened in try/catch blocks which look to appropriately handle the separate content, and the error definitely seemed to be coming from the history query and not one of the others (I'll update with better repro steps).

(A possible issue is that my local install is "clustered", while your test install may not be, but I don't think this should impact behavior unless I'm forgetting about something.)

It does look like PhabricatorRepository::newConduitFuture() might return a different type of future depending on where the repository is hosted, but I've only looked at that briefly right now.

Does that mean the ConduitClientFuture doesn't begin execution until resolve() is called or some other internal mechanism triggered by iterating a FutureIterator?

Right: creating a future doesn't start it (it doesn't make any network connections, launch any subprocesses, etc). Technically, you can start a future explicitly by calling $future->start(), but there's little/no reason to ever do so.

Even if we started futures on creation, the process is ultimately still single-threaded and we don't get any kind of I/O handling "for free" in PHP, so execution must pass into most Futures regularly to make meaningful progress on them: for example, if you just start() an HTTP future and don't return to it, it will probably get about as far as opening a network connection but not actually send any bytes over the wire, so it isn't really executing fully in parallel. Subprocess futures usually don't need as much upkeep as network futures, but they can still stall if the stderr buffer on the child process fills up and the parent process hasn't returned execution into the Future to read the buffer.

It does look like PhabricatorRepository::newConduitFuture() might return a different type of future depending on where the repository is hosted, but I've only looked at that briefly right now.

Yeah, pretty sure this is the right analysis and the root of the issue.

Restore parallel future resolution

Does that mean the ConduitClientFuture doesn't begin execution until resolve() is called or some other internal mechanism triggered by iterating a FutureIterator?

Right: creating a future doesn't start it (it doesn't make any network connections, launch any subprocesses, etc). Technically, you can start a future explicitly by calling $future->start(), but there's little/no reason to ever do so.

Even if we started futures on creation, the process is ultimately still single-threaded and we don't get any kind of I/O handling "for free" in PHP, so execution must pass into most Futures regularly to make meaningful progress on them: for example, if you just start() an HTTP future and don't return to it, it will probably get about as far as opening a network connection but not actually send any bytes over the wire, so it isn't really executing fully in parallel. Subprocess futures usually don't need as much upkeep as network futures, but they can still stall if the stderr buffer on the child process fills up and the parent process hasn't returned execution into the Future to read the buffer.

It seems obvious in hindsight especially considering that PHP is single-threaded. I made the assumption that they were started somehow, I think being used to handling futures in a Java codebase where APIs which create/return futures usually always schedule them on an executor as well.

This looks great to me now, as long as I didn't miss anything with the Future stuff and it does what it's supposed to on top of D21720 + D21721.

I'm working on testing out the issue before/after this change

cspeckmim retitled this revision from Update DiffusionRepositoryController to use appropriate error handling, remove history query to Remove history query from DiffusionRepositoryController as it is unused.Sep 4 2021, 10:41 PM
cspeckmim edited the summary of this revision. (Show Details)
cspeckmim edited the test plan for this revision. (Show Details)

Rebased onto changes (no merge needed tho)

This revision is now accepted and ready to land.Sep 4 2021, 10:42 PM

Try again but with arcanist on master

Sigh maybe update arcanist too. I ran arc lint before arc diff and it said no lints...

I tested this out and it works properly, in that there's no noticeable difference between the behavior with this change and without but including the new future updates