Maniphest T1969

Phabricator support/awareness of multiple regions
Closed, Wontfix
Public

Assigned To

Authored By

	nh
	Oct 24 2012, 6:42 PM

Description

Our svn masters (and other version control systems) are in a different region from our database masters. We can pretty easily put the frontend and the daemons for phabricator in either (or both) of these regions, but the various different combinations we've tried all result in some sort of problem.

If we put the frontend and daemons in the same region as svn (away from the db master), we get timeouts due to slow cross-country queries. If the frontend and daemons are in the same region as the db master (and phabricator is configured to use the svn master), diffusion is unbearably slow (multiple-second page loads) due to cross-country svn reads. If we configure phabricator to use the svn replica in the same region as the db, then detection of commits lags by the replication time (about 5 min). (This is something users have complained quite a bit about.) We also tried a dns hack where phabricator was configured to use an svn url that resolved to the replica on the frontends (to avoid the ui lag) and resolved to the master on the daemons (to discover commits quickly), but this results in red error boxes in diffusion when trying to browse because you're trying to browse commits that don't yet exist on the svn replica.

It appears that any solution to all of these problems will involve making phabricator aware, to some extent, of a setup that spans multiple regions. This could mean having it use a db slave for reads and go to the db master only on writes, having it use an svn master for closing revisions and a slave for reading in diffusion (while being aware of the lag), using a post-commit svn hook to close revisions, or something else.

Does anyone have an opinion on the best way to go about solving this?

Related Objects

Mentioned In: T8238: Formally support side-band change handoff in external repositories
T11056: Send some database traffic to replicas even while the master is still alive
T10488: support https.blindly-trust-domains in phabricator
T4209: Multiserver / High-Availability Configuration
T9456: Evaluate upstream support for third-party build systems

Event Timeline

nh added subscribers: nh, epriestley, btrahan and 3 others. Oct 24 2012, 6:42 PM

Does anyone have an opinion on the best way to go about solving this?

...put all the masters in the same region? Why isn't that a possibility?

Basically because we want to keep the version control master near the dev servers, which are mostly going to land in Oregon. The db masters landed in Carolina because that's the center of what our web footprint will look like.

But there's a slave in Oregon (if not, reading from slaves won't help anything), right? Why can't the DB team just make the Oregon DB be the master and the Carolina DB be the slave?

Basically because they hate one-offs more than we do, and putting the Phabricator master in Oregon would make it the only master in Oregon.

Also, why is there a 5 minute replication lag on the SVN master/slave? Does it fax the changes? If not, can it be made to? It would probably be faster? Why is SVN so much slower cross-country? Is it a protocol issue (like the protocol inherently does a zillion round trips), or is the link a lossy mess?

I expect to pursue master/slave at some point which is why we've left the door open in the codebase (r and w flags on connections, e.g., and hopefully-proper internal handling of reads vs writes), but the motivation should be that we need to shed load on the master for installs with lots of public content, not that the servers are separated by 5 minute replication delays which have no technical reason to exist.

Everything here just reeks of Facebook being awful:

1. The replication delay should be a few seconds, not 5 minutes.
2. The overhead of cross-country svn commands should be ~1 round trip, not ~20.
3. Putting the frontends in a different region from the DB masters should increase query overhead by ~1 round trip, not cause timeouts because of slow queries.
4. The masters should clearly just be in the same location. This is obviously the right solution to the problem. Every other solution is crazy and basically why I quit in the first place.

If (3) is a problem with master-only, there's no reason to believe it won't remain a problem with master/slave, because we must send readers to the master after write. So you'll submit a comment in Differential, and then load the page again and get timeouts (if we sent you to the slave, you wouldn't see your comment). So even if we build this for the right reasons (preparation for future installs which need to load shed) it won't fix Facebook's problems.

I offer these solutions:

Put the frontend with the DB masters and read off the SVN slave. Tell engineers that they can complain to the DB team about the insane master/slave setup, complain to the SVN team about the insane replication lag, go join one of those teams to fix the problems, or use arc close if they aren't happy with this. You can also trigger arc close through local git pre-commit hooks, a commit hook on the master, or arc alias.
For a modest $20M per year, I will install Phabricator + SVN in EC2 and manage it for Facebook.

Just a side note: Our DB replication is quite fast. I read from slave on my devbox and write to master and commenting in Differential just works. Not that this is a solution to anything, just a note.

The svn replication is NFS snapmirror. We're working on speeding it up as well as pursuing this. The root cause of the rather highly variable lag that averages around 5 minutes seems to be giant-ass commits that take a while to replicate.

Cross-country svn commands go across ssh connections, which take a while to set up, hence why it's more than one round-trip-time delay. I have no idea why db queries are a problem.

Also, what we would really like to do is put the front-ends in every region, and then let you just hit the closest one.

Regarding the svn replication issues.

The svn repos are on NFS and use Netapp Snap-Mirror to do replication. The Netapp volume which hosts www also hosts about 15 other svn repos (svnhive, opsfiles, admin, etc).

The vast majority of replications take 1-2 minutes when there are less than a few hundred MB of changes. However, when I sat down with the storage team, I saw that there were multiple syncs each day that took between 10 and 45 minutes. These long syncs were due to 1-5GB changes. At this point, I don't know which repo was generating such large changes, but I doubt it was www (although I didn't check the commit logs to correlate).

One option I had was to move www to a separate netapp volume so that the other repos wouldn't impact its replication time.

I'm okay with pursuing these things in the upstream, since they aren't strictly unique to Facebook's perfect storm:

Awareness of master/slave DBs: This is already "supported" (r/w mode on connections) but not in use as far as I know (except by @vrana, apparently). Remaining pieces are probably:

"Force reads to the master until time X" cookie after establishing a write connection, unless someone has a better idea for preventing stale reads.
Ideally, identify reads-before-writes somehow and force them to the master. POST might be an approximation but probably not a perfect one. This isn't critical to get right since most writes are append-only and most objects have very low write contention.
Add some "you're doing a write on a read connection" tests in the query layer (just looking for INSERT/UPDATE/etc is probably good enough).
Expose read/write connections better in the UI (DarkConsole).
Fix any bad 'r' / 'w' settings identified by this stuff.
Always hit the master from scripts?
...unless you just want to assume that replication is always faster than an HTTP request, in which case you can just turn it on and see what happens I guess.
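
To make the pieces above a bit more concrete, here is a rough sketch in Python (not Phabricator's actual PHP code) of the "force reads to the master until time X" cookie combined with a crude write-on-read-connection check. Everything named here (`STICKY_SECONDS`, the `force_master_until` cookie, `choose_connection`) is hypothetical and only illustrates the idea; the keyword list is just the "look for INSERT/UPDATE/etc" approximation.

```python
import re
import time

# Hypothetical setting: how long reads stay pinned to the master after a write.
STICKY_SECONDS = 30

# Crude write detection: only inspect the leading SQL keyword.
_WRITE_RE = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|REPLACE|ALTER|CREATE|DROP|TRUNCATE)\b",
    re.IGNORECASE,
)


def is_write_query(sql):
    """Return True if the statement looks like a write."""
    return _WRITE_RE.match(sql) is not None


def note_write(cookies, now=None):
    """Record 'force reads to the master until time X' in the cookie jar."""
    now = time.time() if now is None else now
    cookies["force_master_until"] = str(int(now) + STICKY_SECONDS)


def choose_connection(sql, mode, cookies, now=None):
    """Pick 'master' or 'replica' for a query issued in mode 'r' or 'w'."""
    now = time.time() if now is None else now
    if is_write_query(sql):
        if mode != "w":
            # The "you're doing a write on a read connection" test.
            raise RuntimeError("write query on a read connection: " + sql)
        note_write(cookies, now)
        return "master"
    if mode == "w":
        # Reads issued on a write connection stay on the master.
        return "master"
    pinned_until = int(cookies.get("force_master_until", "0"))
    return "master" if now < pinned_until else "replica"
```

With an empty cookie jar a plain read goes to the replica; after any write, reads from the same client go to the master until the pin expires.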

Tunneling SSH: If svn cat (and I guess svn diff) are slow because of setting up SSH, we could presumably tunnel it rather than building the link every time. This is already abstracted and necessarily complicated / special cased (because of the different auth methods, protocols, and VCSes) so we would do very little harm by adding configuration/awareness of tunneling.
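
One way this could look, as a sketch only: instead of a dedicated tunnel, reuse a single SSH connection via OpenSSH connection multiplexing (`ControlMaster`, `ControlPath`, `ControlPersist`) and point Subversion at it through the `SVN_SSH` environment variable. The helper name and the control directory are made up for illustration, and this is an assumption about how tunneling might be configured, not something Phabricator does today.

```python
import os


def svn_env_with_shared_ssh(control_dir, persist_seconds=60, base_env=None):
    """Build an environment in which svn+ssh:// commands share one SSH link.

    control_dir is a hypothetical directory for the multiplexing socket; it
    must not contain whitespace, because the value is passed as a command line.
    """
    env = dict(os.environ if base_env is None else base_env)
    # %r, %h and %p are OpenSSH ControlPath tokens (remote user, host, port).
    control_path = os.path.join(control_dir, "cm-%r@%h:%p")
    env["SVN_SSH"] = (
        "ssh -o ControlMaster=auto "
        "-o ControlPath={} "
        "-o ControlPersist={}".format(control_path, persist_seconds)
    )
    return env
```

A caller would pass the result as `env=` to `subprocess.run(["svn", "cat", url], env=...)`, so only the first command pays the SSH setup cost while the master connection is alive.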

Improve error handling in Diffusion: We probably explode more than necessary when failing to pull data out of Subversion. Any improvements here which would also benefit a more general install are obviously welcome. The obvious example is probably rendering file errors inline rather than in popups.

One possible solution to read-after-write is to execute SHOW MASTER STATUS after a write, store the result locally (readable fast from all web frontends), and call MASTER_POS_WAIT() before reading (which would usually be very quick or a no-op in our setup).
We use the write connection in transactions, which should cover all significant read-before-writes.
Detecting write-on-read-connection is better done in the MySQL setup with read_only. We already do it and it throws when misused.
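
A minimal sketch of the SHOW MASTER STATUS / MASTER_POS_WAIT() idea, in Python rather than Phabricator's PHP. It assumes DB-API style cursors with `%s` placeholders (as in common MySQL drivers) and a plain dict standing in for the fast local store; the function names and the per-user key are hypothetical.

```python
def record_master_position(master_cursor, store, key):
    """After a write, remember the master's binlog coordinates for this key."""
    master_cursor.execute("SHOW MASTER STATUS")
    row = master_cursor.fetchone()
    # The first two columns of SHOW MASTER STATUS are File and Position.
    store[key] = (row[0], int(row[1]))


def wait_for_replica(replica_cursor, store, key, timeout_seconds=1):
    """Before a read, block until the replica has caught up to the stored position.

    Returns True if the replica is safe to read from, False if the caller
    should fall back to the master.
    """
    position = store.get(key)
    if position is None:
        return True  # No recorded write for this key, nothing to wait for.
    replica_cursor.execute(
        "SELECT MASTER_POS_WAIT(%s, %s, %s)",
        (position[0], position[1], timeout_seconds),
    )
    result = replica_cursor.fetchone()[0]
    # MASTER_POS_WAIT returns NULL on error (for example, replication is not
    # running), -1 on timeout, and otherwise the number of events waited for.
    return result is not None and result >= 0
```

If `wait_for_replica` returns False, the read would go to the master instead, which keeps the stale-read protection even when replication is lagging or broken.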

Could it just be configured to pull the entire SVN repo local to Phabricator/Diffusion? Possibly using such hacks as git-svn?

Cobi added a subscriber: Cobi. Nov 12 2012, 5:18 PM

btrahan triaged this task as Wishlist priority. Mar 19 2013, 10:43 PM

btrahan added a project: Facebook.

All the stuff here is nonactionable, years away on our natural roadmap, and Facebook-specific. We're happy to set up a meeting about this and generate a Serious Enterprise Support Proposal, but we're unlikely to move forward on any of this otherwise.

epriestley claimed this task. Jul 19 2013, 6:54 PM

jevripio added a subscriber: jevripio. Jun 17 2014, 9:37 AM

epriestley mentioned this in T9456: Evaluate upstream support for third-party build systems. Oct 30 2015, 2:02 PM

eadler mentioned this in T4209: Multiserver / High-Availability Configuration. Jan 9 2016, 12:34 AM

epriestley mentioned this in T10488: support https.blindly-trust-domains in phabricator. Mar 1 2016, 9:39 PM

chad changed the visibility from "All Users" to "Public (No Login Required)". Mar 1 2016, 9:56 PM

epriestley mentioned this in T11056: Send some database traffic to replicas even while the master is still alive. May 30 2016, 5:53 PM

epriestley mentioned this in T8238: Formally support side-band change handoff in external repositories. Oct 13 2016, 2:10 PM

joshuaspence added a subscriber: joshuaspence. May 15 2017, 11:26 AM
