Make working-copy operations service-oriented
Closed, Resolved (Public)

Assigned To

Authored By

	epriestley
	Mar 20 2013, 12:11 PM

Description

We currently rely on being able to execute git, svn and hg operations directly from the web. This won't work with a "web tier" model, because not every web machine will have every working copy. (We could mount the disks on every machine as readonly, but I have awful experiences with network mounts and am reasonably confident that mounting 1000 disks on 100 machines will explode in a ball of fire.)

Instead, we can bind installs to specific machines in a repository pool, and have the web tier make service calls (via Conduit) to the repository tier.

This basically means turning all the Diffusion*Query classes into Conduit methods, and making Diffusion use ConduitCall to invoke them.

Then we shove some kind of routing magic into ConduitCall so it can resolve selected calls remotely.
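As a rough illustration of the routing idea (not Phabricator's actual implementation), the sketch below dispatches a call in-process when the repository lives on the local host and forwards it otherwise. Everything here is an assumption made for the example: the `ConduitRouter` class, the `bindings` map, the `send_remote` transport hook, and the host names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ConduitRouter:
    """Hypothetical router: run a method locally or forward it to the bound host."""

    local_host: str
    # repository identifier -> host that holds its working copy (assumed shape)
    bindings: Dict[str, str] = field(default_factory=dict)
    # method name -> local, in-process implementation
    methods: Dict[str, Callable[[dict], dict]] = field(default_factory=dict)
    # stand-in for a network transport: (host, method, params) -> result
    send_remote: Optional[Callable[[str, str, dict], dict]] = None

    def call(self, method: str, params: dict) -> dict:
        repository = params.get("repository")
        # Unbound repositories are treated as local in this sketch.
        host = self.bindings.get(repository, self.local_host)
        if host == self.local_host:
            return self.methods[method](params)
        if self.send_remote is None:
            raise RuntimeError("no transport configured for remote calls")
        return self.send_remote(host, method, params)
```

A web host with no working copy for a repository would then resolve `diffusion.*` calls through `send_remote`, while the repository host itself would serve the same call in-process.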

Revisions and Commits

rPHU libphutil
	D11043	rPHUbbc4b860710f Add IP range / CIDR utilities to libphutil
	D10960	rPHU34b7825ae1da Allow ConduitClient to specify an explicit Host
rP Phabricator
	D11874	rP7150aa8e192e Use Conduit in PhabricatorRepositoryGitCommitChangeParserWorker
	D14969	rPda3963b009ce Convert a low-level VCS query in Diff extraction to a Conduit call
	D11253	rPbdd7a35b3094 Remove direct calls to LowLevelCommitQuery
	D11477	rP8d087ae738d8 Remove 'initFromConduit' option from Diffusion
	D11476	rPd98eb2c8b809 Provide a fast path for resolving repository refs
	D10403	rP4f4dc9c83e8a Update PhabricatorRepositoryManagementLookupUsersWorkflow to use ConduitCall
	D11159	rPfa7bb8ff7a50 Add `cluster.addresses` and require membership before accepting cluster…
	D11158	rPc84b9d408cb5 Add `bin/almanac register` to associate a host with an Almanac device and trust…
	D11099	rPcae8c49745cc Fix diffusion.readmequery to work in a cluster enviroment
	D11102	rP8c4f3edd8ad8 Skip some repository checks in cluster enviornments
	D11100	rP376729b44c88 Don't check "repository.default-local-path" for readability in a cluster…
	D11003	rPcd6f67ef95a3 When repository services are available, use them when creating a new repository
	D11001	rPd8739459f6f9 Rename "Local" settings in Diffusion to "Storage"
	D10990	rPf18ee5c237fe Generate and use "cluster" Conduit API tokens
	D10982	rP4505724cc4f1 Allow repositories to be bound to an AlmanacService
	D10959	rPdb51d7d92a42 Make ConduitCall always local/in-process
	D10424	rPac4247ea59eb Provide more information from diffusion.querycommits
	D10399	rPd7f51325e3ab Populate results of DiffusionQueryCommitsConduitAPIMethod with…
	D7808	rP9c938701c3dd Modernize Diffusion `commitparentsquery`

Related Objects

Status	Assigned	Task
		Restricted Maniphest Task
Resolved	epriestley	T1315 Phacility Launch Status
Resolved	epriestley	T2772 Phacility (Blockers)
Duplicate	epriestley	T4209 Multiserver / High-Availability Configuration
Resolved	epriestley	T10751 Make Phabricator Highly Available
Resolved	epriestley	T4292 Implement repository replication
Resolved	epriestley	T2783 Make working-copy operations service-oriented
		Restricted Maniphest Task
		Restricted Maniphest Task
		Restricted Maniphest Task
Resolved	epriestley	T6240 Implement Conduit request signing for host-to-host calls
Resolved	None	T7019 Proxy HTTP VCS traffic
Resolved	None	T7020 Proxy Diffusion Conduit requests
		Restricted Maniphest Task
Resolved	epriestley	T10366 General support for multiple URIs for a repository
Resolved	epriestley	T10860 After an inconsistent cluster repository write, consider just ignoring the lock
Open	epriestley	T10861 Provide a tool to rewind the push log for a repository

Event Timeline

epriestley triaged this task as Normal priority.Mar 20 2013, 12:11 PM

epriestley added a project: Phacility.

epriestley added a subscriber: epriestley.

epriestley edited this Maniphest Task.Mar 20 2013, 12:11 PM

epriestley edited this Maniphest Task.Mar 20 2013, 12:19 PM

epriestley edited this Maniphest Task.

epriestley edited this Maniphest Task.Dec 20 2013, 12:01 AM

epriestley edited this Maniphest Task.Dec 20 2013, 12:04 AM

epriestley edited this Maniphest Task.Dec 20 2013, 8:39 PM

joshuaspence added a subscriber: joshuaspence.Jun 27 2014, 1:17 AM

The description here is still accurate. T2784 fixed many of the problems here, but we have some remaining callsites outside of the web code that task focused on.

Roughly, we have code in various places that relies on having a repository working copy on disk. We'd like to allow the working copy to be on another machine instead and still have everything work. To do this, we need to convert direct repository access to Conduit calls. Once this is done, we'll be able to have web frontends make service requests to the host which actually has the repository in order to satisfy repository queries.

It's OK to directly access repositories in these cases:

- Anything on the PullLocal daemon (notably: DiscoveryEngine, RefEngine).
- From Conduit calls.
- From Queries which are only called by Conduit calls.
- From CommitHookEngine.
- From HeraldPreCommitContentAdapter.
- A couple of special cases.

Other uses are improper, and need to be converted to Conduit calls so they'll work once we split repositories across machines. It looks like the state of the world today is:

Improper uses of DiffusionLowLevelCommitQuery:

- PhabricatorRepositoryManagementLookupUsersWorkflow uses DiffusionLowLevelCommitQuery.
- PhabricatorRepositoryGitCommitMessageParserWorker uses DiffusionLowLevelCommitQuery.
- PhabricatorRepositoryMercurialCommitMessageParserWorker uses DiffusionLowLevelCommitQuery.
- PhabricatorRepositorySvnCommitMessageParserWorker uses DiffusionLowLevelCommitQuery.
- (DiffusionRequest has one too, but it's conditional and OK.)

To deal with these, we could either expand diffusion.querycommits to include this information, or we could introduce a new API method. Expanding diffusion.querycommits seems like the better approach, although it will be a little messy because this query is invoked very early in the parse process, before the caches are populated. We could add a bypassCache flag or something, although this doesn't feel great. But, at least tentatively, that's a plan of attack:

- Add all the information needed to build DiffusionCommitRef objects to diffusion.querycommits: author name, author email, committer name, committer email, hashes.
- Add a bypassCache flag to the call. When set, load the information using DiffusionLowLevelCommitQuery. When not set, load the information from the caches on Commit/CommitData.
- Convert improper callers from DiffusionLowLevelCommitQuery to using new ConduitCall() to invoke diffusion.querycommits, setting the bypassCache flag as appropriate.
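The plan above can be sketched roughly as follows. This is only an illustration in Python, not the real diffusion.querycommits code: `query_commit`, the `cache` dictionary, and the `low_level_query` callable are hypothetical stand-ins for the Conduit method, the Commit/CommitData caches, and DiffusionLowLevelCommitQuery.

```python
from typing import Callable, Dict

# Fields the plan says are needed to build a commit ref.
COMMIT_REF_KEYS = (
    "authorName",
    "authorEmail",
    "committerName",
    "committerEmail",
    "hashes",
)


def query_commit(
    identifier: str,
    cache: Dict[str, dict],
    low_level_query: Callable[[str], dict],
    bypass_cache: bool = False,
) -> dict:
    """Return commit ref fields, reading from the VCS only when asked to."""
    if bypass_cache:
        # Early in the parse process the cache may be empty, so go to the VCS.
        source = low_level_query(identifier)
    else:
        # Otherwise serve whatever the cache already knows.
        source = cache.get(identifier, {})
    # Always return every key so callers can rely on the result's shape.
    return {
        key: source.get(key, [] if key == "hashes" else None)
        for key in COMMIT_REF_KEYS
    }
```

With the flag unset, fields missing from the cache come back as `None` (or an empty list for `hashes`), which matches the idea that the cached path only needs to be filled in "as best you can" at first.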

Improper uses of direct access:

- PhabricatorRepositoryGitCommitChangeParserWorker makes direct git calls.
- PhabricatorRepositoryMercurialCommitChangeParserWorker makes direct hg calls.
- (DifferentialReleephRequestFieldSpecification makes direct calls, but can be ignored for now.)
- (PhabricatorRepositorySvnCommitChangeParserWorker makes direct svn calls, but we do not need to fix these, since they're remote anyway.)
- (The DifferentialLanding... classes make direct calls, but can be ignored.)

These need to be dealt with individually and may require the introduction of new methods. In some cases, we probably already have an appropriate method that will work as-is or with slight modifications (for example, the hg cat call can probably be replaced with diffusion.filecontentquery).

To get started overall, I'd tackle DiffusionLowLevelCommitQuery first. Specifically:

Open up DiffusionQueryCommitsConduitAPIMethod. We want to add these keys to the result dictionary (so it can be used to populate a DiffusionCommitRef):
- authorName
- authorEmail
- authorPHID
- committerName
- committerEmail
- committerPHID
- hashes
You can start by just making it return those keys with empty values. You should be able to use the "Conduit Console" from the web UI (at /conduit/) to verify that your changes have an effect.
Add a bypassCache flag. When this flag is passed to the call, it should make a call to DiffusionLowLevelCommitQuery and use those results to fill all the fields (and, if required, the message field). The web UI should show your changes having an effect and populating all this data.
When the bypassCache flag is not set, fill in the data as best you can. Some of it is available on DiffusionRepositoryCommit or DiffusionRepositoryCommitData. For now, it's not important that this work with bypassCache off.
(As a followup, we can make PhabricatorRepositoryCommitMessageParserWorker fill in the rest of the data too.)
One at a time, convert the DiffusionLowLevelCommitQuery callsites listed above to use diffusion.querycommits instead. PhabricatorRepositoryManagementLookupUsersWorkflow might be a good place to start, since it's easy to test by running bin/repository lookup-users.
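To make the first and last steps concrete, here is a small sketch of the caller side: a result dictionary that starts out with every new key empty, and a ref object populated from it. It is written in Python purely for illustration; `empty_result`, `CommitRef`, and `from_result` are hypothetical names, not the DiffusionCommitRef API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Scalar keys the steps above propose adding to the result dictionary.
SCALAR_KEYS = (
    "authorName",
    "authorEmail",
    "authorPHID",
    "committerName",
    "committerEmail",
    "committerPHID",
)


def empty_result() -> dict:
    """First step: return every new key, but with empty values."""
    result = {key: None for key in SCALAR_KEYS}
    result["hashes"] = []
    return result


@dataclass
class CommitRef:
    """Hypothetical stand-in for the ref object a converted caller builds."""

    author_name: Optional[str] = None
    author_email: Optional[str] = None
    committer_name: Optional[str] = None
    committer_email: Optional[str] = None
    hashes: List[str] = field(default_factory=list)

    @classmethod
    def from_result(cls, result: dict) -> "CommitRef":
        # Tolerate missing keys so partially populated results still load.
        return cls(
            author_name=result.get("authorName"),
            author_email=result.get("authorEmail"),
            committer_name=result.get("committerName"),
            committer_email=result.get("committerEmail"),
            hashes=list(result.get("hashes") or []),
        )
```

A converted callsite would then build its ref from the service result instead of querying the working copy directly.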

hach-que added a revision: D10399: Populate results of DiffusionQueryCommitsConduitAPIMethod with DiffusionLowLevelCommitQuery.Sep 2 2014, 8:07 AM

I attempted to start converting PhabricatorRepositoryManagementLookupUsersWorkflow over to use ConduitCall, but there's no way for it to make a call as the omnipotent user at this time, until "Support Host Identity and Authentication" from T4209 gets done. This almost certainly affects conversion of the commit parsing workers as well.

My plan of attack here is to get an absolutely bare minimum system for host identification / signing working. Hosts will register themselves in the database on startup with:

bin/almanac register

This generates a private / public key pair, storing the public key in the database and the private key on disk under conf/local/HOSTKEY (since this needs to be in a consistent location that other areas can access). The alternative is to have a host value stored in the local.conf and use that to look up the private key path, but this seems like overkill.

I've managed to get PhabricatorRepositoryManagementLookupUsersWorkflow using ConduitCall by implementing host identification.

D10400, D10401 and D10402 cover the implementation of the host identification / signature verification.

D10403 covers the conversion of the workflow.

I'm going to hold off converting any more workflows until I know the host identification / signature verification is a suitable solution.

hach-que added a commit: rPd7f51325e3ab: Populate results of DiffusionQueryCommitsConduitAPIMethod with….Sep 3 2014, 12:54 PM

epriestley added a revision: D10424: Provide more information from diffusion.querycommits.Sep 5 2014, 1:41 PM

epriestley added a commit: rPac4247ea59eb: Provide more information from diffusion.querycommits.Sep 5 2014, 7:28 PM

epriestley mentioned this in T6240: Implement Conduit request signing for host-to-host calls.Oct 3 2014, 11:29 AM

btrahan moved this task from Backlog to v0 Closed Beta on the Phacility board.Nov 11 2014, 10:40 PM

epriestley added a subtask: T6240: Implement Conduit request signing for host-to-host calls.Nov 11 2014, 10:50 PM

epriestley closed subtask T6240: Implement Conduit request signing for host-to-host calls as Resolved.Dec 10 2014, 5:18 PM

epriestley added a revision: D10959: Make ConduitCall always local/in-process.Dec 10 2014, 6:16 PM

epriestley added a revision: D10960: Allow ConduitClient to specify an explicit Host.Dec 10 2014, 6:21 PM

epriestley added a commit: rPdb51d7d92a42: Make ConduitCall always local/in-process.Dec 10 2014, 11:27 PM

epriestley added a commit: rPHU34b7825ae1da: Allow ConduitClient to specify an explicit Host.

epriestley added a subtask: T5955: Refactor Conduit auth to be stateless, token-based, and support wire encodings.Dec 12 2014, 6:01 PM

epriestley mentioned this in T5955: Refactor Conduit auth to be stateless, token-based, and support wire encodings.Dec 12 2014, 6:04 PM

epriestley added a revision: D10982: Allow repositories to be bound to an AlmanacService.Dec 12 2014, 6:09 PM

epriestley added a commit: rP4505724cc4f1: Allow repositories to be bound to an AlmanacService.Dec 12 2014, 8:07 PM

epriestley mentioned this in D10985: Add Conduit Tokens to make authentication in Conduit somewhat more sane.Dec 12 2014, 9:41 PM

epriestley added a revision: D10990: Generate and use "cluster" Conduit API tokens.Dec 13 2014, 6:23 PM

epriestley mentioned this in T6706: Allow SSH keys (and other Conduit tokens) to be restricted to specific IP ranges.Dec 13 2014, 6:26 PM

epriestley removed a subtask: T5955: Refactor Conduit auth to be stateless, token-based, and support wire encodings.

epriestley mentioned this in rP39f2bbaeea1b: Add Conduit Tokens to make authentication in Conduit somewhat more sane.Dec 15 2014, 7:14 PM

epriestley added a commit: rPf18ee5c237fe: Generate and use "cluster" Conduit API tokens.

epriestley added a revision: D11001: Rename "Local" settings in Diffusion to "Storage".Dec 17 2014, 3:03 PM

epriestley added a commit: rPd8739459f6f9: Rename "Local" settings in Diffusion to "Storage".Dec 17 2014, 7:13 PM

epriestley added a revision: D11003: When repository services are available, use them when creating a new repository.Dec 17 2014, 10:18 PM

epriestley added a commit: rPcd6f67ef95a3: When repository services are available, use them when creating a new repository.Dec 18 2014, 10:31 PM

epriestley added a revision: D11043: Add IP range / CIDR utilities to libphutil.Dec 23 2014, 11:55 PM

epriestley added a commit: rPHUbbc4b860710f: Add IP range / CIDR utilities to libphutil.Dec 30 2014, 12:17 AM

epriestley added a revision: D11099: Fix diffusion.readmequery to work in a cluster enviroment.Dec 31 2014, 1:44 AM

epriestley added a revision: D11100: Don't check "repository.default-local-path" for readability in a cluster environment.Dec 31 2014, 3:45 PM

epriestley added a revision: D11102: Skip some repository checks in cluster enviornments.Dec 31 2014, 7:05 PM

epriestley added a commit: rP376729b44c88: Don't check "repository.default-local-path" for readability in a cluster….Dec 31 2014, 7:50 PM

epriestley added a commit: rP8c4f3edd8ad8: Skip some repository checks in cluster enviornments.

epriestley added a commit: rPcae8c49745cc: Fix diffusion.readmequery to work in a cluster enviroment.Dec 31 2014, 7:55 PM

epriestley added a revision: D11158: Add `bin/almanac register` to associate a host with an Almanac device and trust it.Jan 2 2015, 8:20 PM

epriestley added a revision: D11159: Add `cluster.addresses` and require membership before accepting cluster authentication tokens.Jan 2 2015, 9:13 PM

epriestley added a revision: D10403: Update PhabricatorRepositoryManagementLookupUsersWorkflow to use ConduitCall.Jan 2 2015, 9:23 PM

epriestley added a commit: rPc84b9d408cb5: Add `bin/almanac register` to associate a host with an Almanac device and trust….Jan 2 2015, 11:13 PM

epriestley added a commit: rPfa7bb8ff7a50: Add `cluster.addresses` and require membership before accepting cluster….

epriestley added a commit: rP4f4dc9c83e8a: Update PhabricatorRepositoryManagementLookupUsersWorkflow to use ConduitCall.

epriestley mentioned this in D11163: Add auth.querypublickeys to retrieve public keys.Jan 2 2015, 11:40 PM

epriestley added a revision: D11253: Remove direct calls to LowLevelCommitQuery.Jan 6 2015, 5:25 PM

epriestley moved this task from v0 Closed Beta to Do After Launch on the Phacility board.Jan 6 2015, 5:26 PM

We've fully separated the web process, which is the major issue here. Repositories no longer need to exist on web machines (given eldritch knowledge of secret, undocumented configuration).
We haven't fully separated the daemons yet, so we can't put daemons and repositories on different machines (notably, we can't have several repository machines for a single instance). I'm going to push this out until after Phacility because we don't need it until we run into an install which needs more than one host worth of daemons or repositories. The resources these workloads use are dissimilar (daemons use mostly CPU + memory, repos use mostly disk IO + network IO) and I don't anticipate running into scaling issues for a while.

As a heads up, when Harbormaster is available for Phacility customers, I highly recommend running those daemon tasks on a different tier. Long running tasks can easily queue up all available taskmaster daemons, and at work we have to run about 50 or so daemons to keep up with peak load (because of the way taskmasters work at the moment, there's no way we can shift only and all of the Harbormaster tasks onto a different set of machines).

epriestley added a subtask: T7019: Proxy HTTP VCS traffic.Jan 23 2015, 11:54 AM

epriestley added a subtask: T7020: Proxy Diffusion Conduit requests.

epriestley added a revision: D11476: Provide a fast path for resolving repository refs.Jan 23 2015, 2:35 PM

epriestley added a revision: D11477: Remove 'initFromConduit' option from Diffusion.Jan 23 2015, 2:44 PM

epriestley closed subtask T7020: Proxy Diffusion Conduit requests as Resolved.Jan 23 2015, 9:31 PM

epriestley added a commit: rPd98eb2c8b809: Provide a fast path for resolving repository refs.

epriestley added a commit: rP8d087ae738d8: Remove 'initFromConduit' option from Diffusion.

epriestley closed subtask T7019: Proxy HTTP VCS traffic as Resolved.Jan 27 2015, 10:51 PM

epriestley closed subtask Restricted Maniphest Task as Resolved.Jan 28 2015, 10:41 PM

epriestley moved this task from Do After Launch to Do Eventually on the Phacility board.Feb 4 2015, 1:31 PM

epriestley mentioned this in rPa7814b071c79: Add auth.querypublickeys to retrieve public keys.Feb 10 2015, 11:44 PM

epriestley added a commit: rPbdd7a35b3094: Remove direct calls to LowLevelCommitQuery.Feb 10 2015, 11:59 PM

epriestley mentioned this in T7346: Anticipate scaling challenges in the Phacility cluster.Feb 21 2015, 1:04 PM

hach-que added a revision: D11874: Use Conduit in PhabricatorRepositoryGitCommitChangeParserWorker.Feb 24 2015, 2:59 AM

eadler added a subscriber: eadler.Jul 22 2015, 6:21 AM

chad changed the visibility from "All Users" to "Public (No Login Required)".Jul 23 2015, 4:42 AM

devurandom added a subscriber: devurandom.Aug 19 2015, 5:51 AM

joshuaspence mentioned this in Q176: Where should I run the repository pull daemon with `--no-discovery`.Oct 15 2015, 1:58 AM

epriestley mentioned this in T4209: Multiserver / High-Availability Configuration.Dec 4 2015, 12:20 PM

michel-slm added a subscriber: michel-slm.Dec 7 2015, 12:52 PM

epriestley added a revision: D14969: Convert a low-level VCS query in Diff extraction to a Conduit call.Jan 8 2016, 1:53 PM

epriestley added a commit: rPda3963b009ce: Convert a low-level VCS query in Diff extraction to a Conduit call.Jan 8 2016, 5:28 PM

eadler added a project: Restricted Project.Feb 18 2016, 6:31 PM

eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Mar 9 2016, 10:12 PM

epriestley mentioned this in T10751: Make Phabricator Highly Available.Apr 8 2016, 6:23 PM

epriestley added a parent task: T10751: Make Phabricator Highly Available.

epriestley mentioned this in D11874: Use Conduit in PhabricatorRepositoryGitCommitChangeParserWorker.Apr 8 2016, 8:08 PM

epriestley mentioned this in T10753: Remove Mercurial daemon working copy operations.Apr 8 2016, 8:29 PM

epriestley mentioned this in T10754: Remove Subversion daemon working copy operations.Apr 8 2016, 8:33 PM

I believe D11874 covers the last of this for Git.

I've filed T10753 and T10754 as Mercurial / Subversion followups. We currently have no installs that I'm aware of with an interest in HA that use anything other than Git. Improving availability in the Phacility cluster will probably drive these use cases eventually, if specific interest doesn't appear before that.

I think the PullLocal daemon is still too dumb to figure out which repositories it should act on so this isn't hugely useful on its own unless you want to manually bin/phd launch things. I'll sort this out elsewhere (in T3145 and stuff attached to that), so that bin/phd start does the right thing on hosts by default.

epriestley added a commit: rP7150aa8e192e: Use Conduit in PhabricatorRepositoryGitCommitChangeParserWorker.Apr 14 2016, 11:53 AM

urzds added a subscriber: urzds.Jul 12 2017, 11:14 AM
