Commit into repository directly from differential
Open, NormalPublic
Actions

Assigned To

Authored By

	tuomaspelkonen
	May 26 2011, 11:57 PM

Description

Just opening up a conversation about how to do this. This would be a really nice feature for deffirential.

From Facebook Tasks:

D255317 fixes (or at least should fix) the latest branch, which means we can get a working build out for testing.

Right now though, we need to get it reviewed, committed, and requested for merging in to latest.

I'd like to be able to request a diff to be merged, prior to it actually being committed. This is a valid thing to do as long as we're not using shadow branches in production (and only for testing), though I guess that might be debatable.

Revisions and Commits

rP Phabricator
	Closed		D7534 Add support for landing to hosted Mercurial repos.
	Closed		D7555 Land to GitHub + support stuff
		D7486	rP5c0edc935102 Land Revision button for hosted git repos
		D14350	rP1c7443f8f242 Make "Land Revision" button state consistent, prevent non-accepted lands
		D14349	rPa763f9510e76 Add some Drydock documentation plus "Test Configuration" for repository…
		D14348	rPcea633f698da Don't show error operations after a successful land operation
		D14347	rPbbf4ce79e341 Provide a username and email when running `git merge --squash`
		D14346	rP5a35dd233b19 Don't use --ff-only inside "Land Revision"
		D14343	rP0b24a6e2006e Make "Land Revision" show merge conflicts more clearly
		D14341	rP2326d5f8d090 Show lease on Repository Operation detail view and awaken on failures
		D14338	rPa0fba642b382 Show the oldest non-failing revision land operation, or the newest failure
		D14337	rP9c3949379617 Make WorkingCopyBlueprint responsible for performing merges
		D14314	rP5b619862cb06 Show a more reasonable status element for pull requests
		D14295	rP92a626fc1ca3 Add a basic list view for repository operations
		D14280	rPf3f3d957021b When landing revisions via repository automation, use better metadata
		D14270	rP43bee4562c72 If the stars align, make "Land Revision" kind of work
		D14266	rPb4af57ec51e1 Rough cut of DrydockRepositoryOperation
		D14259	rPdf5a031b54a5 Allow "Repository Automation" to be configured for repositories
		D14258	rP3f3626c11a04 Write some documentation about Drydock security and repository automation
		D13022	rP08a9e0f22aff Hide Land to GitHub
		D8719	rPe8e62f82ceae Hide "Land to hosted git" button for now

Related Objects
Search...

Status	Assigned	Task
Open	None	T6008 Editing files and contributing changes via web
Open	epriestley	T182 Commit into repository directly from differential
		Restricted Maniphest Task
Resolved	epriestley	T603 Support permissions/policies in all Phabricator applications
		Restricted Maniphest Task
		Restricted Maniphest Task
		Restricted Maniphest Task
Resolved	epriestley	T3820 Implement top-level "Spaces" that provide policy isolation to groups of objects
Resolved	epriestley	T8376 Develop Spaces (v1)
Resolved	epriestley	T8377 Build the core "Spaces" Application
Resolved	epriestley	T7703 Policy checks may execute incompletely for non-viewers
Resolved	epriestley	T6367 Send email with recipient's language and access levels, not sender's
Resolved	epriestley	T8387 Convert Mailing Lists into special users, similar to bot users
Resolved	epriestley	T8424 Integrate Spaces into the core policy infrastructure
Resolved	epriestley	T8441 Integrate Spaces into ApplicationSearch
Open	None	T7715 Give `arc` query workflows (arc tasks, arc list) access to ApplicationSearch
Resolved	epriestley	T8449 Spaces v1 Errata
Resolved	epriestley	T8493 Integrate Spaces into more applications
Resolved	epriestley	T8498 Integrate Spaces into shared infrastructure
		Restricted Maniphest Task
		Restricted Maniphest Task
		Restricted Maniphest Task
		Restricted Maniphest Task
Resolved	epriestley	T4041 updates on bare repos fail when git < 1.7.2
Resolved	epriestley	T4038 Implement replication of repositories to remote URIs
Resolved	epriestley	T4122 Implement "Passphrase", a credential management application
Resolved	epriestley	T9252 Unprototype Drydock (v1)
Resolved	epriestley	T9253 Implement a Drydock blueprint for leasing hosts from a static, pre-built Almanac pool
Resolved	epriestley	T9431 Deploy an "sbuild" tier outside of the cluster
Resolved	None	T9519 Design the Drydock Blueprint selection mechanism
Resolved	epriestley	T9952 Let "Land Revision" target branches other than master
Resolved	epriestley	T9984 For limit=1 tokenizers, allow browse but replace the value

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

General progress update here: this now works, technically. Almost every part of it -- particularly, the UI -- is still total garbage for now, but if you ship a revision against rGITTEST up to this server you'll get a "Land Revision" button on it which has worked twice in the past and might work again. I would recommend extending the list of pasta varieties found in the pasta file if you want to try this (the button should appear and function without requiring the revision to be accepted for now, and everyone has write access to the repo anyway).

Although the UI is a mess, the underlying fundamentals are scalable: because this process leverages Drydock, it could perform two (maybe three!) pulls at the same time if you had enough hardware.

This stuff is done or mostly done or otherwise in good shape:

Blueprint selection needs a little more UI work but feels like it's headed in a good direction and generally in a good place already. T9519 has details.
Diffusion now has a "Repository Automation" section which allows blueprint selection. This seems fine for v1.
Merge requests exist as objects (DrydockRepositoryOperation) and about half the support code is in reasonable shape.
Drydock is generally holding up and beginning to feel like an actual piece of useful infrastructure.
And the thing does technically work in a reasonable subset of cases today.

This stuff needs significant additional work:

DrydockRepositoryOperation is still missing some fundamental interfaces (like a list view), and some major concepts (like making sure patches apply sequentially and probably a command queue) are undeveloped or undeveloped.
UI is a huge pile of trash, needs a bunch of work. Particularly, feedback about progress and outcomes is exceptionally poor.
The actual merge mechanics are OK, but have three major remaining areas which require work:
- Mechanics live in DrydockRepositoryOperation, but probably need to live in WorkingCopyBlueprint (or otherwise be accessible from both places). The code could move easily in a purely technical sense, but picking up merge errors from WorkingCopyBlueprint is less easy.
- We aren't trying to deal with any of the hard cases yet (like revisions not based on the target branch). Worse, we will happily do the wrong thing, which is dangerous.
- Process is not trusted, and I think getting a chain of custody which covers "what you see is what will land" will be hard in the general case.
Lots of hard coding of things like "master" being the target branch.
General lack of configuration and workflow/application logic (e.g., no way to only allow accepted revisions to land).

turadg awarded a token.Oct 14 2015, 8:34 PM

epriestley added a revision: D14295: Add a basic list view for repository operations.Oct 16 2015, 9:33 PM

epriestley added a commit: rP92a626fc1ca3: Add a basic list view for repository operations.Oct 17 2015, 1:47 AM

• austinstoker added a subscriber: • austinstoker.Oct 17 2015, 7:43 AM

• austinstoker awarded a token.Oct 17 2015, 7:51 AM

epriestley mentioned this in Blog Post: Development Notes (2015 Week 42).Oct 17 2015, 1:46 PM

epriestley added a revision: D14314: Show a more reasonable status element for pull requests.Oct 20 2015, 11:31 PM

epriestley added a commit: rP5b619862cb06: Show a more reasonable status element for pull requests.Oct 21 2015, 6:28 PM

We had a very slow week and haven't made much progress here. There is a new list view and some new, more reasonable UI. We landed our first production change with this mechanism.

The next pieces don't have obvious or unambiguous paths forward yet:

We have a hardcoded one-resource-per-blueprint limit we need to lift, but doing this in a way that is too simple may make things harder once the mechanism expands in the future.
We need to store results from operations so we can tell the user what happened (or what failed) but don't have a great place to put them right now, and underarchitecting this may also make things harder in the future.

I think these are probably both straightforward once I figure out how to proceed, they just need some thought about what the medium and long terms versions of these features will look like.

epriestley added a revision: D14337: Make WorkingCopyBlueprint responsible for performing merges.Oct 25 2015, 11:08 PM

epriestley added a revision: D14338: Show the oldest non-failing revision land operation, or the newest failure.Oct 25 2015, 11:23 PM

epriestley added a revision: D14341: Show lease on Repository Operation detail view and awaken on failures.Oct 26 2015, 5:21 PM

epriestley added a revision: D14343: Make "Land Revision" show merge conflicts more clearly.Oct 26 2015, 6:23 PM

epriestley added a commit: rP9c3949379617: Make WorkingCopyBlueprint responsible for performing merges.Oct 26 2015, 7:40 PM

epriestley added a commit: rPa0fba642b382: Show the oldest non-failing revision land operation, or the newest failure.Oct 26 2015, 7:58 PM

epriestley added a commit: rP2326d5f8d090: Show lease on Repository Operation detail view and awaken on failures.Oct 26 2015, 8:00 PM

epriestley added a commit: rP0b24a6e2006e: Make "Land Revision" show merge conflicts more clearly.Oct 26 2015, 8:11 PM

epriestley added a revision: D14346: Don't use --ff-only inside "Land Revision".Oct 26 2015, 8:21 PM

epriestley added a commit: rP5a35dd233b19: Don't use --ff-only inside "Land Revision".Oct 26 2015, 8:26 PM

webframp added a subscriber: webframp.Oct 26 2015, 8:51 PM

epriestley mentioned this in T3277: `arc cleanup` to cleanup branches used by `arc patch`.Oct 26 2015, 8:54 PM

epriestley added a revision: D14347: Provide a username and email when running `git merge --squash`.Oct 26 2015, 9:09 PM

epriestley added a commit: rPbbf4ce79e341: Provide a username and email when running `git merge --squash`.Oct 26 2015, 9:12 PM

epriestley added a revision: D14348: Don't show error operations after a successful land operation.Oct 26 2015, 9:24 PM

epriestley added a commit: rPcea633f698da: Don't show error operations after a successful land operation.Oct 26 2015, 9:27 PM

I think we should disable the "Land Revision" action item if the revision has already been landed.

Screenshot_20151027-192254.png (1×1 px, 227 KB)

The action is always available right now because it makes testing (particularly, testing things like "does it do the right thing when the revision is already landed") easier, but I'm going to require "Accepted" for v0 of this, which will indirectly disable it for landed ("Closed") revisions. We'll probably relax that later but it seems like the right place to start from, at least.

epriestley added a revision: D14349: Add some Drydock documentation plus "Test Configuration" for repository automation.Oct 27 2015, 5:08 PM

epriestley added a revision: D14350: Make "Land Revision" button state consistent, prevent non-accepted lands.Oct 27 2015, 5:25 PM

epriestley added a commit: rPa763f9510e76: Add some Drydock documentation plus "Test Configuration" for repository….Oct 27 2015, 6:04 PM

chad added a commit: rP1c7443f8f242: Make "Land Revision" button state consistent, prevent non-accepted lands.Oct 27 2015, 6:52 PM

This is in reasonably functional shape for ambitious users. There's some documentation now:

We've landed most of the recent upstream changes with this mechanism. These specific things have been improved:

DrydockRepositoryOperation UI is in fairly reasonable shape now and provides critical progress feedback through the use of animation.
Merge mechanics are in WorkingCopy now.
Merge mechanics are less bad, but still a bit bad.
Limits and error handling are in better shape.
Only accepted revisions can land, and some other workflow sanity checks now exist.

This stuff remains:

Hard-coded target of master.
Hard cases (revisions with extra history / need rebase) aren't handled yet and may do the wrong thing.
Workflow does not have a chain of custody yet, and what lands can still be different from what was reviewed.
Probably plenty of other rough edges that we'll hit over time.

If you try setting this up, I'm interested in feedback about problems you run into or which missing features are the most important for your use case.

epriestley closed subtask T9252: Unprototype Drydock (v1) as Resolved.Oct 27 2015, 9:54 PM

epriestley mentioned this in T4333: `arc land` should provide a mechanism to land revisions by other authors while preserving authorship meta-data.Oct 28 2015, 2:26 AM

BYK added a subscriber: BYK.Oct 28 2015, 1:19 PM

Because this works from the staging repository, it will be possible to trick it right? I mean, I could do this to sneakily commit virus.exe to the repository:

git add virus.exe
git commit -a
# Make a legitimate change
git commit -a
arc diff HEAD~

Yes, that's what this refers to:

Workflow does not have a chain of custody yet, and what lands can still be different from what was reviewed.

Another way to do it is:

Comment out the staging push in arc land.
arc diff normally.
Push arbitrary, wholly different changes to the staging area.

I can't remember which ticket, but I remember a similar discussion regarding arc patch (whether arc patch should patch onto HEAD or pull from the staging area).

Do you have any ideas on how to improve this issue?

I think the discussion is in T8090.

There are two cases here:

The changes to be landed are on top of other changes (as in your example, where virus.exe is snuck in underneath).
The changes to be landed are wholly different from the changes to be reviewed (as in my example, where a malicious user sent a good change for review and pushed virus.exe for merge).

For (1), we already know exactly which commit hashes we expect to land, so the merging work in the LandWorkflow will just get stricter and reject merges which would bring in other commits.

This will create a mild usability issue: if you diff A and then diff B on top of it, and then land A, it would be nice to be able to land B without rebasing it first, but you won't be able to in a mutable repository because the version of A on B will differ from the version of A on master.

For now, you'll just have to rebase. Later, we can refine this by keeping track of prior lands, so when we land commit X (diff A) as commit Q, and then see commit X in the history of B, we can recognize that it's equivalent to Q and just remove both commits from the histories before we check if the merge is exact. This will take some work but we don't need to get it right 100% of the time, since there will always be some cases where we hit conflicts and users legitimately need to rebase anyway.

For (2), I think we need to read the diff out of the staging area instead of having arc submit it. This is going to be pretty messy, but I think it's doable. I'm primarily worried that the client wait may be unreasonably long (I don't see a way to avoid doing a blocking, synchronous read of the pushed commit from arc), but hopefully we can get that down into an acceptable range.

In the long run, with hosted repositories and more flexibility around virtualizing refs (like T5000 and T8092) we can do much better here, and we can do something reasonable today for any hosted repository by generating a diff as a side effect in the commit hook if the push touches a diff tag, but I don't see a super clean way to do this with tracked, external staging areas.

The not-so-clean flow, particularly, is:

Push to staging area.
Call a new differential.createdifffromstagingarea() Conduit method.
This method blocks until the diff is read.

Internally, it has to do something like start a Drydock operation to read the diff, then wait until the operation completes.

One possible alternative is for the import pipeline to start recognizing "magic" refs and doing things as a side effect of import, and to put similar logic on the hook pipeline for hosted repositories. They can theoretically share some code. The workflow as a whole would look the same, except arc would be responsible for polling Conduit to wait until the diff shows up.

avivey mentioned this in T9676: Triangle workflows (pushing to a merge manager).Oct 31 2015, 3:37 AM

epriestley mentioned this in Blog Post: Development Notes (2015 Week 44).Nov 2 2015, 5:28 PM

All the policy, merge strategy, etc., stuff is hard-coded to the easiest set of reasonable defaults. Some of these defaults may not be suitable for many installs (e.g., always rebase, always merge to master, require edit capability for imported repositories in order to perform merges). I expect to expand these shortly, I just don't want to focus on making things more flexible until they actually work.

Are you planning a fully specifiable merge command so we can run anything we want for merging?

We make heavy use of submodules at my work, such that every change is part of a submodule change. git refuses to merge a super-project if a submodule changed in both branches, even if the changes to the submodule would merge cleanly. We have extended git with a custom command so running git rmerge <branch> from the super project will go through submodules that changed in both branches and merge each of them. If we can fully specify the merge command itself then we can extend git on the phabricator install with our custom rmerge command and invoke it. I realize we could dig into the phabricator source and change anything we want but I'd prefer to stick with the upstream version if possible.

Are you planning a fully specifiable merge command so we can run anything we want for merging?

There will be more flexibility in the future, but I don't anticipate making the merge command configurable, per se (principally, the merge is more complex than a single command, and will become increasingly complex in the future).

You can create a new blueprint which provides WorkingCopy resources and merges in a different way, then use that blueprint type instead of the vanilla WorkingCopy blueprint. However, this will probably be a lot of work to maintain today. Once blueprints stabilize more it may be more realistic.

@epriestley I'll keep an eye on blueprints and the merge configurability. Thank you. Phabricator is awesome/and getting awesomer.

DheerendraRathor added a subscriber: DheerendraRathor.Nov 11 2015, 2:26 PM

epriestley reopened subtask T9952: Let "Land Revision" target branches other than master as Open.Dec 10 2015, 8:45 PM

epriestley closed subtask T9952: Let "Land Revision" target branches other than master as Resolved.Dec 10 2015, 11:38 PM

epriestley mentioned this in Blog Post: Development Notes (2015 Week 50).Dec 12 2015, 6:33 PM

epriestley mentioned this in Q263: How to prevent sneaky changes pushed after review is approved?.Jan 8 2016, 12:41 AM

borna.emami added a subscriber: borna.emami.Jan 9 2016, 2:53 AM

chad mentioned this in D9599: [Draft] Make pushes to magical branches create diffs.Jan 12 2016, 5:11 AM

Treri added a subscriber: Treri.Jan 12 2016, 5:57 AM

• jneumann added a subscriber: • jneumann.Jan 16 2016, 1:32 AM

epriestley moved this task from The Queue to Paused on the Prioritized board.Feb 7 2016, 4:16 AM

amartin33 awarded a token.Feb 19 2016, 8:20 PM