Page MenuHomePhabricator

ServiceProfiler integration should be a configurable part of Future, not hard-coded into subclasses (HTTPSFuture, ExecFuture)
Closed, ResolvedPublic

Description

Currently, concrete Futures (ExecFuture, HTTPSFuture, QueryFuture, etc) generally hard-code their own integration with ServiceProfiler.

It would probably be cleaner to lift this up to Future and make it more configurable by callers. In particular, we have (or, had) some code in the S3 file storage engine which added service profiler calls to S3, but this is really a wrapped HTTP future, and the S3 profiling wasn't future aware. It would be better to just be able to mark the underlying future as "HTTP + S3" or similar, rather than doubling up calls in a sort of manual, not-entirely-correct way.

Event Timeline

epriestley triaged this task as Wishlist priority.Jul 30 2018, 6:52 PM
epriestley created this task.

We actually just ran into a situation where this would be extremely useful. We're adding a rudimentary start/end time tracing listener so we can keep tabs on our service calls. That listener uploads the data to an endpoint using an HTTPSFuture, but we want to configure it to not show up in the default echoListener (or to not be profiled at all).

epriestley closed this task as Resolved.Feb 27 2020, 3:06 PM
epriestley claimed this task.

I'm going to mark this as resolved, since:

  • The S3 use case is more or less moot, since S3 was turned into a real Future after I got tired of dealing with the dependency.
  • D21038 does the actual thing, at least more or less.

I think "keep stuff out of the trace" is a reasonable use case, but I'd like to revisit it if it's still relevant after T13490 resolves, since that's likely to change how this this kind of instrumentation (which I'm assuming is like "how often do users run arc diff, and how fast is it?") works. I currently imagine integrating Arcanist workflows into ServiceProfiler by default, so the fix to this issue might be something like:

  • Define a formal "report" phase for ServiceProfiler, which is the time we spend inside endServiceCall().
  • When we enter the report phase, set a flag to ignore reentrant calls back into ServiceProfiler (so they don't appear in the trace, etc).
  • When we exit the report phase, clear the flag. Report the "report" cost in the final echo trace.

So the default trace would go from looking like this:

>>> [26] (+736) <http> https://secure.phabricator.com/api/differential.query
>>> [27] (+382) <http> https://tracker.mycompany.com/

...to something like this:

>>> [26] (+736 / 382) <http> https://secure.phabricator.com/api/differential.query

Where the "/ 382" means "we spent 382ms reporting/processing the event" in ServiceProfiler. That would still provide a hint that your reporting is a bottleneck if you see a big number there, but wouldn't gum up the trace, and would allow extensions against ServiceProfiler to freely do things which require service invocations without reporting themselves.