
Profile GC costs in XHProf
Open, Low, Public

Description

See https://news.ycombinator.com/item?id=8686934. XHProf does not currently report time spent in the PHP garbage collector, but enabling it to report this information would potentially be useful. I'm not sure how involved this is.

Event Timeline

epriestley raised the priority of this task from to Low.
epriestley updated the task description.
epriestley added a project: XHProf.
epriestley added a subscriber: epriestley.

I poked at this very briefly and didn't immediately see a way to hook the GC from an extension, at least in PHP 5.4.

You've probably been down the same path I have, but I want to share my findings in case someone can find a way to use them:

The documentation mentions adding -DGC_BENCH=1 to CFLAGS to enable printing of GC statistics. This seems obviously useful, but since the statistics are behind a compile-time flag, they aren't usable in the general case.

However, it might be easy enough to hook into the GC functions? I don't have enough experience with this particular part of the internals, but someone else might find something.

From the current HEAD of php-src, these three locations handle the GC statistics part:
zend_gc.h
zend_gc.c
zend.c

Hope this helps.

What I'm ideally after is gc_check_possible_root (or some similar call) to be a global function pointer which we could rebind like XHProf currently rebinds zend_execute and similar -- basically do this dance:

https://secure.phabricator.com/diffusion/XPRF/browse/master/extension/xhprof.c;91e0be91da03ec4aa2b63a991f70c1c10151cc71$1837-1861

However, all the GC stuff is directly linked (not indirected through a function pointer), so I don't think I can make the extension hook it. That said, I know my way around PHP extensions and C linking a little, but I'm far from an expert in this arena.
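
For anyone unfamiliar with the rebinding dance described above, here is a minimal standalone sketch of the pattern. It does not use real Zend symbols: `engine_entry`, `engine_default`, and the `profiler_*` functions are hypothetical stand-ins for an engine entry point that is exposed as a global function pointer. The point is that the trick only works when callers go through the pointer, which is exactly what the GC functions do not do.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for an engine entry point exposed as a global
 * function pointer (the way zend_execute is). Not a real Zend symbol. */
typedef long (*engine_fn)(long arg);

static long engine_default(long arg) { return arg * 2; }

/* The "engine" always calls through this pointer, so it can be rebound. */
engine_fn engine_entry = engine_default;

/* Profiler state: the saved original pointer plus simple accumulators. */
static engine_fn saved_entry = NULL;
static unsigned long hook_calls = 0;
static double hook_seconds = 0.0;

/* Wrapper: time the original call and return its result unchanged. */
static long profiled_entry(long arg) {
    clock_t start = clock();
    long result = saved_entry(arg);
    hook_seconds += (double)(clock() - start) / CLOCKS_PER_SEC;
    hook_calls++;
    return result;
}

/* Save the current pointer and install the wrapper (idempotent). */
void profiler_enable(void) {
    if (saved_entry == NULL) {
        saved_entry = engine_entry;
        engine_entry = profiled_entry;
    }
}

/* Restore the original pointer (idempotent). */
void profiler_disable(void) {
    if (saved_entry != NULL) {
        engine_entry = saved_entry;
        saved_entry = NULL;
    }
}

unsigned long profiler_calls(void) { return hook_calls; }
double profiler_seconds(void) { return hook_seconds; }
```

A function that is called directly by name gives an extension nothing to swap out at runtime, so this approach would need a change in the engine itself to expose such a pointer.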

One thing we might be able to do is identify opcodes which may execute the GC and count that time under some "GC + other things which should all be cheap" umbrella. I'm not sure if that's practical or how good the results would be, though. But it might give us a reasonable ceiling which could be good enough for practical purposes.
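
A rough sketch of that umbrella accounting, under loud assumptions: the opcode names below are hypothetical (not real Zend opcode numbers), and the choice of which ones "may run the GC" is made up for illustration. Given per-opcode timings, it sums the time of flagged opcodes into a ceiling bucket and everything else into a second bucket.

```c
#include <stddef.h>

/* Hypothetical opcode set; these are NOT real Zend opcode numbers. */
enum op { OP_ADD, OP_ASSIGN, OP_UNSET, OP_RETURN, OP_COUNT };

/* Assumption for illustration only: opcodes that can drop a refcount
 * are treated as ones that may reach the collector. */
static const int may_trigger_gc[OP_COUNT] = {
    [OP_ASSIGN] = 1, [OP_UNSET] = 1, [OP_RETURN] = 1
};

/* One timed opcode execution. */
struct op_sample { enum op opcode; double seconds; };

/* gc_ceiling: upper bound on GC time ("GC + other cheap things").
 * other: time in opcodes that cannot have run the GC. */
struct umbrella { double gc_ceiling; double other; };

struct umbrella attribute_time(const struct op_sample *samples, size_t n) {
    struct umbrella u = {0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        unsigned idx = (unsigned)samples[i].opcode;
        if (idx < OP_COUNT && may_trigger_gc[idx])
            u.gc_ceiling += samples[i].seconds;
        else
            u.other += samples[i].seconds;
    }
    return u;
}
```

The result is only a ceiling: time in the flagged opcodes includes their ordinary work, so the bound is useful only if that work really is cheap.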

I do see that gc_runs and collected are available (via GC_G(...)) unconditionally? It's not much, but it's more than is available now.
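
If those two counters are readable, one cheap thing a profiler could do is snapshot them around a profiled region and report the difference. The sketch below is standalone and does not include any Zend headers: `struct gc_counters` is a hypothetical stand-in for the values read via GC_G(gc_runs) and GC_G(collected), and the 32-bit field width is an assumption.

```c
#include <stdint.h>

/* Stand-in for the values read via GC_G(gc_runs) and GC_G(collected).
 * In a real extension these would be read from the engine's GC globals;
 * the 32-bit width here is an assumption. */
struct gc_counters { uint32_t gc_runs; uint32_t collected; };

/* GC activity between two snapshots. */
struct gc_delta { uint32_t runs; uint32_t collected; };

/* Unsigned subtraction keeps the delta correct even if a counter
 * wrapped around once between the two snapshots. */
struct gc_delta gc_delta_since(struct gc_counters before,
                               struct gc_counters after) {
    struct gc_delta d;
    d.runs = after.gc_runs - before.gc_runs;
    d.collected = after.collected - before.collected;
    return d;
}
```

This gives counts rather than time, so it would show that the collector ran inside a given function call but not how long it took.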

Qafoolabs mentions a "proposed timing API" in a pull request. I can't find any record of such a proposal, but it might be worthwhile to keep that in mind.