diff --git a/src/applications/multimeter/application/PhabricatorMultimeterApplication.php b/src/applications/multimeter/application/PhabricatorMultimeterApplication.php --- a/src/applications/multimeter/application/PhabricatorMultimeterApplication.php +++ b/src/applications/multimeter/application/PhabricatorMultimeterApplication.php @@ -43,4 +43,13 @@ ); } + public function getHelpDocumentationArticles(PhabricatorUser $viewer) { + return array( + array( + 'name' => pht('Multimeter User Guide'), + 'href' => PhabricatorEnv::getDoclink('Multimeter User Guide'), + ), + ); + } + } diff --git a/src/docs/book/user.book b/src/docs/book/user.book --- a/src/docs/book/user.book +++ b/src/docs/book/user.book @@ -28,6 +28,9 @@ }, "userguide": { "name": "Application User Guides" + }, + "fieldmanual": { + "name": "Field Manuals" } } } diff --git a/src/docs/contributor/darkconsole.diviner b/src/docs/contributor/darkconsole.diviner deleted file mode 100644 --- a/src/docs/contributor/darkconsole.diviner +++ /dev/null @@ -1,60 +0,0 @@ -@title Using DarkConsole -@group developer - -Enabling and using the built-in debugging console. - -= Overview = - -DarkConsole is a debugging console built into Phabricator which exposes -configuration, performance and error information. It can help you detect, -understand and resolve bugs and performance problems in Phabricator -applications. - -DarkConsole was originally implemented as part of the Facebook Lite site; its -name is a bit of play on that (and a reference to the dark color palette its -design uses). - -= Warning = - -Because DarkConsole exposes some configuration and debugging information, it is -disabled by default (and **you should not enable it in production**). It has -some simple safeguards to prevent leaking credential information, but enabling -it in production may compromise the integrity of an install. 
- -= Enabling DarkConsole = - -You enable DarkConsole in your configuration, by setting `darkconsole.enabled` -to `true`, and then turning it on in `Settings` -> `Developer Settings`. Once -DarkConsole is enabled, you can show or hide it by pressing ##`## on your -keyboard. - -Since the setting is not available to logged-out users, you can also set -`darkconsole.always-on` if you need to access DarkConsole on logged-out pages. - -DarkConsole has a number of tabs, each of which is powered by a "plugin". You -can use them to access different debugging and performance features. - -= Plugin: Error Log = - -The "Error Log" plugin shows errors that occurred while generating the page, -similar to the httpd `error.log`. You can send information to the error log -explicitly with the @{function@libphutil:phlog} function. - -If errors occurred, a red dot will appear on the plugin tab. - -= Plugin: Request = - -The "Request" plugin shows information about the HTTP request the server -received, and the server itself. - -= Plugin: Services = - -The "Services" plugin lists calls a page made to external services, like -MySQL and the command line. - -= Plugin: XHProf = - -The "XHProf" plugin gives you access to the XHProf profiler. To use it, you need -to install the corresponding PHP plugin -- see instructions in the -@{article:Installation Guide}. Once it is installed, you can use XHProf to -profile the runtime performance of a page. diff --git a/src/docs/contributor/installing_xhprof.diviner b/src/docs/contributor/installing_xhprof.diviner deleted file mode 100644 --- a/src/docs/contributor/installing_xhprof.diviner +++ /dev/null @@ -1,54 +0,0 @@ -@title Installing XHProf -@group developer - -Describes how to install XHProf, a PHP profiling tool. - -Overview -======== - -You can install XHProf to activate the XHProf tab in DarkConsole and the -`--xprofile` flag from the CLI. 
This will allow you to generate performance -profiles of pages and scripts, which can be tremendously valuable in identifying -and fixing slow code. - -Installing XHProf -================= - -XHProf is a PHP profiling tool. You don't need to install it unless you are -developing Phabricator and making performance changes. - -You can install xhprof with: - - $ pecl install xhprof - -If you have a PEAR version prior to 1.9.3, you may run into a `phpize` failure. -If so, you can download the source and build it with: - - $ cd extension/ - $ phpize - $ ./configure - $ make - $ sudo make install - -You may also need to add `extension=xhprof.so` to your php.ini. - -See for more information. - -Using XHProf: Web -================= - -To profile a web page, activate DarkConsole and navigate to the XHProf tab. -Use the **Profile Page** button to generate a profile. - -Using XHProf: CLI -================= - -From the command line, use the `--xprofile ` flag to generate a -profile of any script. - -Next Steps -========== - -Continue by: - - - enabling DarkConsole with @{article:Using DarkConsole}. diff --git a/src/docs/user/field/darkconsole.diviner b/src/docs/user/field/darkconsole.diviner new file mode 100644 --- /dev/null +++ b/src/docs/user/field/darkconsole.diviner @@ -0,0 +1,162 @@ +@title Using DarkConsole +@group fieldmanual + +Enabling and using the built-in debugging and performance console. + +Overview +======== + +DarkConsole is a debugging console built into Phabricator which exposes +configuration, performance and error information. It can help you detect, +understand and resolve bugs and performance problems in Phabricator +applications. + + +Security Warning +================ + +WARNING: Because DarkConsole exposes some configuration and debugging +information, it is disabled by default and you should be cautious about +enabling it in production. + +Particularly, DarkConsole may expose some information about your session +details or other private material. 
It has some crude safeguards against this, +but does not completely sanitize output. + +This is mostly a risk if you take screenshots or copy/paste output and share +it with others. + + +Enabling DarkConsole +==================== + +You enable DarkConsole in your configuration, by setting `darkconsole.enabled` +to `true`, and then turning it on in {nav Settings > Developer Settings}. + +Once DarkConsole is enabled, you can show or hide it by pressing ##`## on your +keyboard. + +Since the setting is not available to logged-out users, you can also set +`darkconsole.always-on` if you need to access DarkConsole on logged-out pages. + +DarkConsole has a number of tabs, each of which is powered by a "plugin". You +can use them to access different debugging and performance features. + + +Plugin: Error Log +================= + +The "Error Log" plugin shows errors that occurred while generating the page, +similar to the httpd `error.log`. You can send information to the error log +explicitly with the @{function@libphutil:phlog} function. + +If errors occurred, a red dot will appear on the plugin tab. + + +Plugin: Request +=============== + +The "Request" plugin shows information about the HTTP request the server +received, and the server itself. + + +Plugin: Services +================ + +The "Services" plugin lists calls a page made to external services, like +MySQL and subprocesses. + +The Services tab can help you understand and debug issues related to page +behavior: for example, you can use it to see exactly what queries or commands a +page is running. In some cases, you can re-run those queries or commands +yourself to examine their output and look for problems. + +This tab can also be particularly useful in understanding page performance, +because many performance problems are caused by inefficient queries (queries +with bad query plans or which take too long) or repeated queries (queries which +could be better structured or benefit from caching). 
+ +When analyzing performance problems, the major things to look for are: + +**Summary**: In the summary table at the top of the tab, are any categories +of events dominating the performance cost? For normal pages, the costs should +be roughly along these lines: + +| Event Type | Approximate Cost | +|---|---| +| Connect | 1%-10% | +| Query | 10%-40% | +| Cache | 1% | +| Event | 1% | +| Conduit | 0%-80% | +| Exec | 0%-80% | +| All Services | 10%-75% | +| Entire Page | 100ms - 1000ms | + +These ranges are rough, but should usually be what you expect from a page +summary. If any of these numbers are way off (for example, "Event" is taking +50% of runtime), that points toward a possible problem in that section of the +code, and can guide you to examine the related service calls more carefully. + +**Duration**: In the Duration column, look for service calls that take a long +time. Sometimes these calls are just what the page is doing, but sometimes they +may indicate a problem. + +Some questions that may help you understand this column are: are there a small +number of calls which account for a majority of the total page generation time? +Do these calls seem fundamental to the behavior of the page, or is it not clear +why they need to be made? Do some of them seem like they could be cached? + +If there are queries which look slow, using the "Analyze Query Plans" button +may help reveal poor query plans. + +Generally, this column can help pinpoint these kinds of problems: + + - Queries or other service calls which are huge and inefficient. + - Work the page is doing which it could cache instead. + - Problems with network services. + - Missing keys or poor query plans. + +**Repeated Calls**: In the "Details" column, look for service calls that are +being made over and over again. Sometimes this is normal, but usually it +indicates a call that can be batched or cached. + +Some things to look for are: are similar calls being made over and over again? 
+Do calls mostly make sense given what the page is doing? Could any calls be +cached? Could multiple small calls be collected into one larger call? Are any +of the service calls clearly goofy nonsense that shouldn't be happening? + +Generally, this column can help pinpoint these kinds of problems: + + - Unbatched queries which should be batched (see + @{article:Performance: N+1 Query Problem}). + - Opportunities to improve performance with caching. + - General goofiness in how service calls are working. + +If the services tab looks fine, and particularly if a page is slow but the +"All Services" cost is small, that may indicate a problem in PHP. The best +tool to understand problems in PHP is XHProf. + + +Plugin: XHProf +============== + +The "XHProf" plugin gives you access to the XHProf profiler. To use it, you need +to install the corresponding PHP extension. + +Once it is installed, you can use XHProf to profile the runtime performance of +a page. This will show you a detailed breakdown of where PHP spent time. This +can help find slow or inefficient application code, and is the most powerful +general-purpose performance tool available. + +For instructions on installing and using XHProf, see @{article:Using XHProf}. + + +Next Steps +========== + +Continue by: + + - installing XHProf with @{article:Using XHProf}; or + - understanding and reporting performance issues with + @{article:Troubleshooting Performance Problems}. diff --git a/src/docs/user/field/performance.diviner b/src/docs/user/field/performance.diviner new file mode 100644 --- /dev/null +++ b/src/docs/user/field/performance.diviner @@ -0,0 +1,179 @@ +@title Troubleshooting Performance Problems +@group fieldmanual + +Guide to troubleshooting slow pages and hangs. + +Overview +======== + +This document describes how to isolate, examine, understand and resolve or +report performance issues like slow pages and hangs. 
+ +This document covers the general process for handling performance problems, +and outlines the major tools available for understanding them: + + - **Multimeter** helps you understand sources of load and broad resource + utilization. This is a coarse, high-level tool. + - **DarkConsole** helps you dig into a specific slow page and understand + service calls. This is a general, mid-level tool. + - **XHProf** gives you detailed application performance profiles. This + is a fine-grained, low-level tool. + +Performance and the Upstream +============================ + +Performance issues and hangs will often require upstream involvement to fully +resolve. The intent is for Phabricator to perform well in all reasonable cases, +and not to require tuning for different workloads (as long as those workloads are +generally reasonable). Poor performance with a reasonable workload is likely a +bug, not a configuration problem. + +However, some pages are slow because Phabricator legitimately needs to do a lot +of work to generate them. For example, if you write a 100MB wiki document, +Phabricator will need substantial time to process it, it will take a long time +to download over the network, and your browser will probably not be able to +render it especially quickly. + +We may be able to improve performance in some cases, but Phabricator is not +magic and cannot wish away real complexity. The best solution to these problems +is usually to find another way to solve your problem: for example, maybe the +100MB document can be split into several smaller documents. + +Here are some examples of performance problems under reasonable workloads that +the upstream can help resolve: + + - {icon check, color=green} Commenting on a file and mentioning that same + file results in a hang. + - {icon check, color=green} Creating a new user takes many seconds. + - {icon check, color=green} Loading Feed hangs on 32-bit systems. 
+ +The upstream will be less able to help resolve problems with unusual workloads that have high +inherent complexity, like these: + + - {icon times, color=red} A 100MB wiki page takes a long time to render. + - {icon times, color=red} A Turing-complete simulation of Conway's Game of + Life implemented in 958,000 Herald rules executes slowly. + - {icon times, color=red} Uploading an 8GB file takes several minutes. + +Generally, the path forward will be: + + - Follow the instructions in this document to gain the best understanding of + the issue (and of how to reproduce it) that you can. + - In particular, is it being caused by an unusual workload (like a 100MB + wiki page)? If so, consider other ways to solve the problem. + - File a report with the upstream by following the instructions in + @{article:Contributing Bug Reports}. + +The remaining sections in this document walk through these steps. + + +Understanding Performance Problems +================================== + +To isolate, examine, and understand performance problems, follow these steps: + +**General Slowness**: If you are experiencing generally poor performance, use +Multimeter to understand resource usage and look for load-based causes. See +@{article:Multimeter User Guide}. If that isn't fruitful, treat this like a +reproducible performance problem on an arbitrary page. + +**Hangs**: If you are experiencing hangs (pages which never return, or which +time out with a fatal after some number of seconds), they are almost always +the result of bugs in the upstream. Report them by following these +instructions: + + - Set `debug.time-limit` to a value like `5`. + - Reproduce the hang. The page should exit after 5 seconds with a more useful + stack trace. + - File a report with the reproduction instructions and the stack trace in + the upstream. See @{article:Contributing Bug Reports} for detailed + instructions. + - Clear `debug.time-limit` again to take your install out of debug mode. 
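As a rough sketch of the first and last steps above, the limit could be set and cleared from the command line. This assumes your install manages configuration with the `bin/config` tool; `PHABRICATOR_ROOT` and the two function names are hypothetical and introduced here only for illustration.

```shell
# Sketch only: PHABRICATOR_ROOT is a hypothetical variable naming the
# directory of your Phabricator checkout; adjust it for your install.
PHABRICATOR_ROOT="${PHABRICATOR_ROOT:-/path/to/phabricator}"

# Set a short time limit (in seconds, default 5) so a hanging page exits
# with a stack trace instead of running until the normal timeout.
enable_time_limit() {
  "$PHABRICATOR_ROOT/bin/config" set debug.time-limit "${1:-5}"
}

# Remove the limit again once you have captured the trace.
clear_time_limit() {
  "$PHABRICATOR_ROOT/bin/config" delete debug.time-limit
}
```

Call `enable_time_limit` before reproducing the hang and `clear_time_limit` afterwards, so the install does not stay in debug mode longer than necessary.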
+ +If the reproduction instructions include "Create a 100MB wiki page", +the upstream may be less sympathetic to your cause than if reproducing the +issue does not require an unusual, complex workload. + +In some cases, the hang may really just be a very large amount of processing time. +If you're very excited about 100MB wiki pages and don't mind waiting many +minutes for them to render, you may be able to adjust `max_execution_time` in +your PHP configuration to allow the process enough time to complete, or adjust +settings in your webserver config to let it wait longer for results. + +**DarkConsole**: If you have a reproducible performance problem (for example, +loading a specific page is very slow), you can enable DarkConsole (a built-in +debugging console) to examine page performance in detail. + +The two most useful tabs in DarkConsole are the "Services" tab and the +"XHProf" tab. + +The "Services" tab allows you to examine service calls (network calls, +subprocesses, events, etc.) and find slow queries, slow services, inefficient +query plans, and unnecessary calls. Broadly, you're looking for slow or +repeated service calls, or calls which don't make sense given what the page +should be doing. + +After installing XHProf (see @{article:Using XHProf}) you'll gain access to the +"XHProf" tab, which is a full tracing profiler. You can use the "Profile Page" +button to generate a complete trace of where a page is spending time. When +reading a profile, you're looking for the overall use of time, and for anything +which sticks out as taking unreasonably long or not making sense. + +See @{article:Using DarkConsole} for complete instructions on configuring +and using DarkConsole. + +**AJAX Requests**: To debug Ajax requests, activate DarkConsole and then turn +on the profiler or query analyzer on the main request by clicking the +appropriate button. 
The setting will cascade to Ajax requests made by the page +and they'll show up in the console with full query analysis or profiling +information. + +**Command-Line Hangs**: If you have a script or daemon hanging, you can send +it `SIGHUP` to have it dump a stack trace to `sys_get_temp_dir()` (usually +`/tmp`). + +Do this with: + +``` +$ kill -HUP +``` + +You can use this command to figure out where the system's temporary directory +is: + +``` +$ php -r 'echo sys_get_temp_dir()."\n";' +``` + +On most systems, this is `/tmp`. The trace should appear in that directory with +a name like `phabricator_backtrace_`. Examining this trace may provide +a key to understanding the problem. + +**Command-Line Performance**: If you have general performance issues with +command-line scripts, you can add `--trace` to see a service call log. This is +similar to the "Services" tab in DarkConsole. This may help identify issues. + +After installing XHProf, you can also add `--xprofile ` to emit a +detailed performance profile. You can `arc upload` these files and then view +them in XHProf from the web UI. + +Next Steps +========== + +If you've done all you can to isolate and understand the problem you're +experiencing, report it to the upstream. Include as much relevant data as +you can, such as: + + - reproduction instructions; + - traces from `debug.time-limit` for hangs; + - screenshots of service call logs from DarkConsole (review these carefully, + as they can sometimes contain sensitive information); + - traces from CLI scripts with `--trace`; + - traces from sending HUP to processes; and + - XHProf profile files from `--xprofile` or "Download .xhprof Profile" in + the web UI. + +After collecting this information: + + - follow the instructions in @{article:Contributing Bug Reports} to file + a report in the upstream. 
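When collecting traces from hung command-line processes as described earlier in this document, it can help to list the newest trace files first. The following is a minimal sketch, not part of Phabricator: the function name `latest_backtraces` is hypothetical, and it assumes trace file names begin with `phabricator_backtrace_`.

```shell
# List backtrace files in a directory, newest first.
# Usage: latest_backtraces [directory]
# The directory defaults to /tmp; pass the output of
# `php -r 'echo sys_get_temp_dir();'` if your system uses another location.
latest_backtraces() {
  dir="${1:-/tmp}"
  # ls -t sorts by modification time, newest first. Errors are silenced
  # so the function prints nothing when no trace files exist.
  ls -t "$dir"/phabricator_backtrace_* 2>/dev/null
}
```

After sending `SIGHUP` to the process, the first path printed is the most recently written trace, which is usually the one to attach to your report.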
diff --git a/src/docs/user/field/xhprof.diviner b/src/docs/user/field/xhprof.diviner new file mode 100644 --- /dev/null +++ b/src/docs/user/field/xhprof.diviner @@ -0,0 +1,122 @@ +@title Using XHProf +@group fieldmanual + +Describes how to install and use XHProf, a PHP profiling tool. + +Overview +======== + +XHProf is a profiling tool which will let you understand application +performance in Phabricator. + +After you install XHProf, you can use it from the web UI and the CLI to +generate detailed performance profiles. It is the most powerful tool available +for understanding application performance and identifying and fixing slow code. + +Installing XHProf +================= + +You are likely to have the most luck building XHProf from source: + + $ git clone https://github.com/phacility/xhprof.git + +From any source distribution of the extension, build and install it like this: + + $ cd xhprof/ + $ cd extension/ + $ phpize + $ ./configure + $ make + $ sudo make install + +You may also need to add `extension=xhprof.so` to your php.ini. + +You can also try using PECL to install it, but this may not work well with +recent versions of PHP: + + $ pecl install xhprof + +Once you've installed it, `php -i` should report it as installed (you may +see a different version number, which is fine): + + $ php -i | grep xhprof + ... + xhprof => 0.9.2 + ... + + +Using XHProf: Web UI +==================== + +To profile a web page, activate DarkConsole and navigate to the XHProf tab. +Use the **Profile Page** button to generate a profile. + +For instructions on activating DarkConsole, see @{article:Using DarkConsole}. + + +Using XHProf: CLI +================= + +From the command line, use the `--xprofile ` flag to generate a +profile of any script. + +You can then upload this file to Phabricator (using `arc upload` may be easiest) +and view it in the web UI. 
+ + +Analyzing Profiles +================== + +Understanding profiles is as much art as science, so be warned that you may not +make much headway. Even if you aren't able to conclusively read a profile +yourself, you can attach profiles when submitting bug reports to the upstream +and we can look at them. This may yield new insight. + +When looking at profiles, the "Wall Time (Inclusive)" column is usually the +most important. This shows the total amount of time spent in a function or +method and all of its children. Usually, to improve the performance of a page, +we're trying to find something that's slow and make it not slow: this column +can help identify which things are slowest. + +The "Wall Time (Exclusive)" column shows time spent in a function or method, +excluding time spent in its children. This can give you a hint about whether the +call itself is slow or it's just making calls to other things that are slow. + +You can also get a sense of this by clicking a call to see its children, and +seeing if the bulk of runtime is spent in a child call. This tends to indicate +that you're looking at a problem which is deeper in the stack, and you need +to go down further to identify and understand it. + +Conversely, if the "Wall Time (Exclusive)" column is large, or the children +of a call are all cheap, there's probably something expensive happening in the +call itself. + +The "Count" column can also sometimes tip you off that something is amiss, if +a method which shouldn't be called very often is being called a lot. + +Some general things to look for -- these aren't smoking guns, but are unusual +and can lead to finding a performance issue: + + - Is a low-level utility method like `phutil_utf8ize()` or `array_merge()` + taking more than a few percent of the page runtime? + - Do any methods (especially high-level methods) have >10,00 calls? + - Are we spending more than 100ms doing anything which isn't loading data + or rendering data? 
+ - Does anything look suspiciously expensive or out of place? + - Is the profile for the slow page very different from the profile for a + fast page? + +Some performance problems are obvious and will jump out of a profile; others +may require a more nuanced understanding of the codebase to sniff out which +parts are suspicious. If you aren't able to make progress with a profile, +report the issue upstream and attach the profile to your report. + + +Next Steps +========== + +Continue by: + + - enabling DarkConsole with @{article:Using DarkConsole}; or + - understanding and reporting performance problems with + @{article:Troubleshooting Performance Problems}. diff --git a/src/docs/user/userguide/multimeter.diviner b/src/docs/user/userguide/multimeter.diviner new file mode 100644 --- /dev/null +++ b/src/docs/user/userguide/multimeter.diviner @@ -0,0 +1,99 @@ +@title Multimeter User Guide +@group userguide + +Using Multimeter, a sampling profiler. + +Overview +======== + +IMPORTANT: This document describes a prototype application. + +Multimeter is a sampling profiler that can give you coarse information about +Phabricator resource usage. In particular, it can help quickly identify sources +of load, like bots or scripts which are making a very large number of requests. + +Configuring and Using Multimeter +================================ + +To access Multimeter, go to {nav Applications > Multimeter}. + +By default, Multimeter samples 0.1% of pages. This should be a reasonable rate +for most installs, but you can increase or decrease the rate by adjusting +`debug.sample-rate`. Increasing the rate (by setting the value to a lower +number, like 100, to sample 1% of pages) will increase the granularity of the +data, at a small performance cost. + +Using Multimeter +================ + +Multimeter shows you what Phabricator has spent time doing recently. 
By +looking at the samples it collects, you can identify major sources of load +or resource use, whether they are specific users, pages, subprocesses, or +other types of activity. + +By identifying and understanding unexpected load, you can adjust usage patterns +or configuration to make better use of resources (for example, rewrite bots +that are making too many calls), or report specific, actionable issues to the +upstream for resolution. + +The main screen of Multimeter shows you everything Phabricator has spent +resources on recently, broken down by action type. Categories are folded up +by default, with "(All)" labels. + +To filter by a dimension, click the link for it. For example, from the main +page, you can click "Web Request" to filter by only web requests. To expand a +grouped dimension, click the "(All)" link. + +For example, suppose we suspect that someone is running a bot that is making +a lot of requests and consuming a lot of resources. We can get a better idea +about this by filtering the results like this: + + - Click {nav Web Request}. This will show only web requests. + - Click {nav (All)} under "Viewer". This will expand events by viewer. + +Recent resource costs for web requests are now shown, grouped and sorted by +user. The usernames in the "Viewer" column show who is using resources, in +order from greatest use to least use (only administrators can see usernames). + +The "Avg" column shows the average cost per event, while the "Cost" column +shows the total cost. + +If the top few users account for similar costs and are normal, active users, +there may be nothing amiss and your problem might lie elsewhere. If a user like +`slowbot` is in the top few users and has way higher usage than anyone else, +there might be a script running under that account consuming a disproportionate +amount of resources. 
+ +Assuming you find a user with unusual usage, you could dig into their usage +like this: + + - Click their name (like {nav slowbot}) to filter to just their requests. + - Click {nav (All)} under "Label". This expands by request detail. + +This will show exactly what they spent those resources doing, and can help +identify if they're making a lot of API calls, scraping the site, or doing +something else. + +This is just an example of a specific kind of problem that Multimeter could +help resolve. In general, exploring Multimeter data by filtering and expanding +resource uses can help you understand how resources are used and identify +unexpected uses of resources. For example: + + - Identify a problem with load balancing by filtering on {nav Web Request} + and expanding on {nav Host}. If hosts aren't roughly even, DNS or a load + balancer may be misconfigured. + - Identify which pages cost the most by filtering on {nav Web Request} + and expanding on {nav Label}. + - Find outlier pages by filtering on {nav Web Request} and expanding on + {nav ID}. + - Find where subprocesses are invoked from by filtering on {nav Subprocesses}, + then expanding on {nav Context}. + + +Next Steps +========== + +Continue by: + + - understanding and reporting performance issues with + @{article:Troubleshooting Performance Problems}.