diff --git a/src/docs/user/field/repository_imports.diviner b/src/docs/user/field/repository_imports.diviner new file mode 100644 index 0000000000..a59bee051f --- /dev/null +++ b/src/docs/user/field/repository_imports.diviner @@ -0,0 +1,247 @@ +@title Troubleshooting Repository Imports +@group fieldmanual + +Guide to troubleshooting repositories which import incompletely. + +Overview +======== + +When you first import an external source code repository (or push new commits to +a hosted repository), Phabricator imports those commits in the background. + +While a repository is initially importing, some features won't work. While +individual commits are importing, some of their metadata won't be available in +the web UI. + +Sometimes, the import process may hang or fail to complete. This document can +help you understand the import process and troubleshoot problems with it. + + +Understanding the Import Pipeline +================================= + +Phabricator first performs commit discovery on repositories. This examines +a repository and identifies all the commits in it at a very shallow level, +then creates stub objects for them. These stub objects primarily serve to +assign various internal IDs to each commit. + +Commit discovery occurs in the update phase, and you can learn more about +updates in @{article:Diffusion User Guide: Repository Updates}. + +After commits are discovered, background tasks are queued to actually import +commits. These tasks do things like look at commit messages, trigger mentions +and autoclose rules, cache changes, trigger Herald, publish feed stories and +email, and apply Owners rules. You can learn more about some of these steps in +@{article:Diffusion User Guide: Autoclose}. + +Specifically, the import pipeline has four steps: + + - **Message**: Parses the commit message and author metadata. + - **Change**: Caches the paths the commit affected. + - **Owners**: Runs Owners rules. 
+ - **Herald**: Runs Herald rules and publishes notifications. + +These steps run in sequence for each commit, but all discovered commits import +in parallel. + + +Identifying Missing Steps +========================= + +There are a few major pieces of information you can look at to understand where +the import process is stuck. + +First, to identify which commits have missing import steps, run this command: + +``` +phabricator/ $ ./bin/repository importing rXYZ +``` + +That will show what work remains to be done. Each line shows a commit which +is discovered but not imported, and the import steps that are remaining for +that commit. Generally, the commit is stuck on the first step in the list. + +Second, load the Daemon Console (at `/daemon/` in the web UI). This will show +what work is currently being done and waiting to complete. The most important +sections are "Queued Tasks" (work waiting in queue) and "Leased Tasks" +(work currently being done). + +Third, run this command to look at the daemon logs: + +``` +phabricator/ $ ./bin/phd log +``` + +This can show you any errors the daemons have encountered recently. + +The next sections will walk through how to use this information to understand +and resolve the issue. + + +Handling Permanent Failures +=========================== + +Some commits can not be imported, which will permanently stop a repository from +fully importing. These are rare, but can be caused by unusual data in a +repository, version peculiarities, or bugs in the importer. + +Permanent failures usually look like a small number of commits stuck on the +"Message" or "Change" steps in the output of `repository importing`. If you +have a larger number of commits, it is less likely that there are any permanent +problems. + +In the Daemon console, permanent failures usually look like a small number of +tasks in "Leased Tasks" with a large failure count. 
These tasks are retrying +until they succeed, but a bug is permanently preventing them from succeeding, +so they'll rack up a large number of retries over time. + +In the daemon log, these commits usually emit specific errors showing why +they're failing to import. + +These failures are the easiest to identify and understand, and can often be +resolved quickly. Choose some failing commit from the output of `bin/repository +importing` and use this command to re-run any missing steps manually in the +foreground: + +``` +phabricator/ $ ./bin/repository reparse --importing --trace rXYZabcdef012... +``` + +This command is always safe to run, no matter what the actual root cause of +the problem is. + +If this fails with an error, you've likely identified a problem with +Phabricator. Collect as much information as you can about what makes the commit +special and file a bug in the upstream by following the instructions in +@{article:Contributing Bug Reports}. + +If the commit imports cleanly, this is more likely to be caused by some other +issue. + + +Handling Temporary Failures +=========================== + +Some commits may temporarily fail to import: perhaps the network or services +may have briefly been down, or some configuration wasn't quite right, or the +daemons were killed halfway through the work. + +These commits will retry eventually and usually succeed, but some of the retry +time limits are very conservative (up to 24 hours) and you might not want to +wait that long. + +In the Daemon console, temporary failures usually look like tasks in the +"Leased Tasks" column with a large "Expires" value but a low "Failures" count +(usually 0 or 1). The "Expires" column is showing how long Phabricator is +waiting to retry these tasks. + +In the daemon log, these temporary failures might have created log entries, but +might also not have. 
For example, if the failure was rooted in a network issue, +that will probably create a log entry, but if the failure was rooted in the +daemons being abruptly killed, that may not create a log entry. + +You can follow the instructions from "Handling Permanent Failures" above to +reparse steps individually to look for an error that represents a root cause, +but sometimes this can happen because of some transient issue which won't be +identifiable. + +The easiest way to fix this is to restart the daemons. When you restart +daemons, all task leases are immediately expired, so any tasks waiting for a +long time will run right away: + +``` +phabricator/ $ ./bin/phd restart +``` + +This command is always safe to run, no matter what the actual root cause of +the problem is. + +After restarting the daemons, any pending tasks should be able to retry +immediately. + +For more information on managing the daemons, see +@{article:Managing Daemons with phd}. + + +Forced Parsing +============== + +In rare cases, the actual tasks may be lost from the task queue. Usually, they +have been stolen by gremlins or spirited away by ghosts, or someone may have +been too ambitious with running manual SQL commands and deleted a bunch of +extra things they shouldn't have. + +There is no normal set of conditions under which this should occur, but you can +force Phabricator to re-queue the tasks to recover from it if it does occur. + +This will look like missing steps in `repository importing`, but nothing in the +"Queued Tasks" or "Leased Tasks" sections of the daemon console. The daemon +logs will also be empty, since the tasks have vanished. + +To re-queue parse tasks for a repository, run this command, which will queue +up all of the missing work in `repository importing`: + +``` +phabricator/ $ ./bin/repository reparse --importing --all rXYZ +``` + +This command may cause duplicate work to occur if you have misdiagnosed the +problem and the tasks aren't actually lost. 
For example, it could queue a +second task to perform publishing, which could cause Phabricator to send a +second copy of email about the commit. Other than that, it is safe to run even +if this isn't the problem. + +After running this command, you should see tasks in "Queued Tasks" and "Leased +Tasks" in the console which correspond to the commits in `repository +importing`, and progress should resume. + + +Forced Imports +============== + +In some cases, you might want to force a repository to be flagged as imported +even though the import isn't complete. The most common and reasonable case +where you might want to do this is if you've identified a permanent failure +with a small number of commits (maybe just one) and reported it upstream, and +are waiting for a fix. You might want to start using the repository immediately, +even if a few things can't import yet. + +You should be cautious about doing this. The "importing" flag controls +publishing of notifications and email, so if you flag a repository as imported +but it still has a lot of work queued, it may send an enormous amount of email +as that work completes. + +To mark a repository as imported even though it really isn't, run this +command: + +``` +phabricator/ $ ./bin/repository mark-imported rXYZ +``` + +If you do this by mistake, you can reverse it later by using the +`--mark-not-imported` flag. + + +General Tips +============ + +Broadly, `bin/repository` contains several useful debugging commands which +let you figure out where failures are occurring. You can add the `--trace` flag +to any command to get more details about what it is doing. For any command, +you can use `help` to learn more about what it does and which flags it takes: + +``` +phabricator/ $ bin/repository help +``` + +In particular, you can use flags with the `repository reparse` command to +manually run parse steps in the foreground, including re-running steps and +running steps out of order. 
+ + +Next Steps +========== + +Continue by: + + - returning to the @{article:Diffusion User Guide}. diff --git a/src/docs/user/userguide/diffusion.diviner b/src/docs/user/userguide/diffusion.diviner index 1f4f08295f..f3fa53da3b 100644 --- a/src/docs/user/userguide/diffusion.diviner +++ b/src/docs/user/userguide/diffusion.diviner @@ -1,102 +1,116 @@ @title Diffusion User Guide @group userguide Guide to Diffusion, the Phabricator repository browser. = Overview = Diffusion is a repository browser which allows you to explore source code in a Subversion, Git, or Mercurial repository. It is somewhat similar to software like Trac and GitWeb. Diffusion can either import a read-only copy of repositories hosted somewhere else (for example, from GitHub, Bitbucket or existing hosting) or host repositories within Phabricator. Hosted repositories support a variety of triggers and access controls. Diffusion is integrated with the other tools in the Phabricator suite. For instance: - when you commit Differential revisions to a tracked repository, they are automatically updated and linked to the corresponding commits; - you can add Herald rules to notify you about commits that match certain rules; - for hosted repositories, Herald can enforce granular access control rules; - in all the tools, commit names are automatically linked. = Adding Repositories = Repository administration is accomplished through Diffusion. You can use the web interface in Diffusion to import an external repository, or create a new hosted repository. - For hosted repositories, make sure you go through the setup instructions in @{article:Diffusion User Guide: Repository Hosting} first. - For all repositories, you'll need to be running the daemons. If you have not set them up yet, see @{article:Managing Daemons with phd}. By default, you must be an administrator to create a new repository. 
= Repository Callsigns and Commit Names = Each repository is identified by a "callsign", which is a short uppercase string like "P" (for Phabricator) or "ARC" (for Arcanist). Each repository must have a unique callsign. Callsigns must be unique within an install but do not need to be globally unique, so you are free to use the single-letter callsigns for brevity. For example, Facebook uses "E" for the Engineering repository, "O" for the Ops repository, "Y" for a Yum package repository, and so on, while Phabricator uses "P", "ARC", "PHU" for libphutil, and "J" for Javelin. Keeping callsigns brief will make them easier to use, and the use of one-character callsigns is recommended if they are reasonably evocative and you have no more than 26 tracked repositories. The primary goal of callsigns is to namespace commits to SVN repositories: if you use multiple SVN repositories, each repository has a revision 1, revision 2, etc., so referring to them by number alone is ambiguous. However, even for Git they impart additional information to human readers and allow parsers to detect that something is a commit name with high probability (and allow distinguishing between multiple copies of a repository). Diffusion uses this callsign and information about the commit itself to generate a commit name, like "rE12345" or "rP28146171ce1278f2375e3646a1e1ea3fd56fc5a3". The "r" stands for "revision". It is followed by the repository callsign, and then a VCS-specific commit identifier (for SVN, the commit number; for Git and Mercurial, the commit hash). When writing the name of a Git commit you may abbreviate the hash, but note that hash collisions are probable for short prefix lengths. 
See this post on the LKML for a historical explanation of Git's occasional internal use of 7-character hashes: https://lkml.org/lkml/2010/10/28/287 Because 7-character hashes are likely to collide for even moderately large repositories, Diffusion generally uses either a 16-character prefix (which makes collisions very unlikely) or the full 40-character hash (which makes collisions astronomically unlikely). = Running Diffusion Daemons = In most cases, it is sufficient to run: phabricator/bin/ $ ./phd start ...to start the daemons. For a more in-depth explanation of `phd` and daemons, see @{article:Managing Daemons with phd}. NOTE: If you have an unusually large install with multiple web frontends, see notes in @{article:Managing Daemons with phd}. You can use the repository detail screen and the Daemon Console to monitor the daemons and their progress importing the repository. Small repositories should import quickly, while larger repositories may take some time. Commits should begin appearing in Diffusion within a few minutes for all but the largest repositories. = Next Steps = - - Learn about creating a symbol index at +Continue by: + + - learning how to create a symbol index at @{article:Diffusion User Guide: Symbol Indexes}; or - - set up repository hosting with + - setting up repository hosting with @{article:Diffusion User Guide: Repository Hosting}; or - - understand daemons in detail with @{article:Managing Daemons with phd}; or - - give us feedback at @{article:Give Feedback! Get Support!}. + - managing repository hooks with + @{article:Diffusion User Guide: Commit Hooks}; or + - understanding daemons in more detail with + @{article:Managing Daemons with phd}. 
+ +If you're having trouble getting things working, these topic guides may be +helpful: + + - get details about automatically closing tasks and revisions in response + to commits in @{article:Diffusion User Guide: Autoclose}; or + - understand how Phabricator updates repositories with + @{article:Diffusion User Guide: Repository Updates}; or + - fix issues with repository imports with + @{article:Troubleshooting Repository Imports}. diff --git a/src/docs/user/userguide/diffusion_autoclose.diviner b/src/docs/user/userguide/diffusion_autoclose.diviner index adbc753aad..6d6f76bf87 100644 --- a/src/docs/user/userguide/diffusion_autoclose.diviner +++ b/src/docs/user/userguide/diffusion_autoclose.diviner @@ -1,60 +1,68 @@ @title Diffusion User Guide: Autoclose @group userguide Explains when Diffusion will close tasks and revisions upon discovery of related commits. Overview ======== Diffusion can close tasks and revisions when related commits appear in a repository. For example, if you make a commit with `Fixes T123` in the commit message, Diffusion will close the task `T123`. This document explains how autoclose works, how to configure it, and how to troubleshoot it. Troubleshooting Autoclose ========================= You can check if a branch is currently configured to autoclose on the main repository view, or in the branches list view. Hover over the {icon check} or {icon times} icon and you should see one of these messages: - {icon check} **Autoclose Enabled** Autoclose is active for this branch. - {icon times} **Repository Importing** This repository is still importing. Autoclose does not activate until a repository finishes importing for the first time. This prevents situations where you import a repository and accidentally close hundreds of related objects during import. Autoclose will activate for new commits after the initial import completes. - {icon times} **Repository Autoclose Disabled** Autoclose is disabled for this entire repository. 
You can enable it in **Edit Repository**. - {icon times} **Branch Untracked** This branch is not tracked. Because it is not tracked, commits on it won't be seen and won't be discovered. - {icon times} **Branch Autoclose Disabled** Autoclose is not enabled for this branch. You can adjust which branches autoclose in **Edit Repository**. This option is only available in Git. If a branch is in good shape, you can check a specific commit by viewing it in the web UI and clicking **Edit Commit**. There should be an **Autoclose?** field visible in the form, with possible values listed below. Note that this field records the state of the world at the time the commit was processed, and does not necessarily reflect the current state of the world. For example, if a commit did not trigger autoclose because it was processed during initial import, the field will still show **No, Repository Importing** even after import completes. This means that the commit did not trigger autoclose because the repository was importing at the time it was processed, not necessarily that the repository is still importing. - **Yes** At the time the commit was imported, autoclose triggered and Phabricator attempted to close related objects. - **No, Repository Importing** At the time the commit was processed, the repository was still importing. Autoclose does not activate until a repository fully imports for the first time. - **No, Autoclose Disabled** At the time the commit was processed, the repository had autoclose disabled. - **No, Not On Autoclose Branch** At the time the commit was processed, no containing branch was configured to autoclose. - //Field Not Present// This commit was processed before we implemented this diagnostic feature, and no information is available. + +Next Steps +========== + +Continue by: + + - troubleshooting in greater depth with + @{article:Troubleshooting Repository Imports}. 
diff --git a/src/docs/user/userguide/diffusion_updates.diviner b/src/docs/user/userguide/diffusion_updates.diviner index 6b19c14480..3dae59b3a5 100644 --- a/src/docs/user/userguide/diffusion_updates.diviner +++ b/src/docs/user/userguide/diffusion_updates.diviner @@ -1,115 +1,123 @@ @title Diffusion User Guide: Repository Updates @group userguide Explains how Diffusion updates repositories to discover new changes. Overview ======== When Phabricator is configured to import repositories which are hosted elsewhere, it needs to poll those repositories for changes. If it polls too frequently, it can create too much load locally and on remote services. If it polls too rarely, it may take a long time for commits to show up in the web interface. This document describes the rules around polling and how to understand and adjust the behavior. In general: - Phabricator chooses a default poll interval based on repository activity. These intervals range from every 15 seconds (for active repositories) to every 6 hours (for repositories with no commits in two months). - If you use `arc` to push commits, or you host repositories on Phabricator, repositories automatically update after changes are pushed. - If you don't use `arc` and your repository is hosted elsewhere, this document describes ways you can make polling more responsive. Default Behavior ================ By default, Phabricator determines how frequently to poll repositories by examining how long it has been since the last commit. In most cases this is fairly accurate and produces good behavior. In particular, it automatically reduces the polling frequency for rarely-used repositories. This dramatically reduces load for installs with a large number of inactive repositories, which is common. For repositories with activity in the last 3 days, we wait 1 second for every 10 minutes without activity. The table below has some examples. 
| Time Since Commit | Poll Interval | |-------------------|------------------| | //Minimum// | 15 seconds | | 6h | about 30 seconds | | 12h | about 1 minute | | 1 day | about 2 minutes | | 2 days | about 5 minutes | | 3 days | about 7 minutes | This means that you may need to wait about 2 minutes for the first commit to be imported in the morning, and about 5 minutes after a long weekend, but other commits to active repositories should usually be recognized in 30 seconds or less. For repositories with no activity in the last 3 days, we wait longer between updates (1 second for every 4 minutes without activity). The table below has some examples. | Time Since Commit | Poll Interval | |-------------------|------------------| | 4 days | about 30 minutes | | 7 days | about 45 minutes | | 10 days | about 1 hour | | 20 days | about 2 hours | | 30 days | about 3 hours | | //Maximum// | 6 hours | You can find the exact default poll frequency of a repository in Diffusion > (Choose a Repository) > Edit Repository, under "Update Frequency". You can also see the time when the repository was last updated in this interface. Repositories that are currently importing are always updated at the minimum update frequency so the import finishes as quickly as possible. Triggering Repository Updates ============================= If you want Phabricator to update a repository more quickly than the default update frequency (for example, because you just pushed a commit to it), you can tell Phabricator that it should schedule an update as soon as possible. There are several ways to do this: - If you push changes with `arc land` or `arc commit`, this will be done for you automatically. These commits should normally be recognized within a few seconds. - If your repository is hosted on Phabricator, this will also be done for you automatically. - You can schedule an update from the web interface, in Diffusion > (Choose a Repository) > Edit Repository > Update Now. 
- You can make a call to the Conduit API method `diffusion.looksoon`. This hints to Phabricator that it should poll a repository as soon as it can. All of the other mechanisms do this under the hood. In particular, you may be able to add a commit hook to your external repository which calls `diffusion.looksoon`. This should make an external repository about as responsive as a hosted repository. If a repository has an update scheduled, the Diffusion > (Choose a Repository) > Edit Repository interface will show that the repository is prioritized and will be updated soon. Troubleshooting Updates ======================= You can manually run a repository update from the command line to troubleshoot issues, using the `--trace` flag to get full details: phabricator/ $ ./bin/repository update --trace To catch potential issues with permissions, run this command as the same user that the daemons run as. + +Next Steps +========== + +Continue by: + + - troubleshooting in greater depth with + @{article:Troubleshooting Repository Imports}.
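As a rough illustration of the default poll intervals described under "Default Behavior" in the Repository Updates guide, the two stated rules (1 second per 10 minutes idle for repositories active in the last 3 days, 1 second per 4 minutes idle otherwise, clamped between 15 seconds and 6 hours) can be sketched in Python. This is only a sketch built from the numbers in the guide, not Phabricator's actual implementation; the function name, the constants, and the exact handling of the 3-day boundary are assumptions.

```python
# Hypothetical sketch of the documented polling rules; names are illustrative
# and this is not taken from Phabricator's source.
MIN_INTERVAL = 15                   # seconds, documented minimum
MAX_INTERVAL = 6 * 60 * 60          # seconds, documented maximum (6 hours)
ACTIVE_WINDOW = 3 * 24 * 60 * 60    # seconds, "activity in the last 3 days"


def default_poll_interval(seconds_since_commit):
    """Return an approximate poll interval in seconds."""
    minutes_idle = seconds_since_commit / 60
    if seconds_since_commit <= ACTIVE_WINDOW:
        # Active repository: wait 1 second per 10 minutes without activity.
        interval = minutes_idle / 10
    else:
        # Inactive repository: wait 1 second per 4 minutes without activity.
        interval = minutes_idle / 4
    # Clamp to the documented minimum and maximum.
    return int(max(MIN_INTERVAL, min(MAX_INTERVAL, interval)))


if __name__ == "__main__":
    hour = 60 * 60
    day = 24 * hour
    for label, idle in [("6h", 6 * hour), ("1 day", day), ("10 days", 10 * day)]:
        print(label, default_poll_interval(idle), "seconds")
```

Under these assumptions, 6 hours without a commit gives 36 seconds and 10 days gives 3600 seconds, which lines up with the "about 30 seconds" and "about 1 hour" rows in the guide's tables.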