Commit into repository directly from differential
Open, NormalPublic

Subscribers
Tokens
"Like" token, awarded by amartin33."Love" token, awarded by austinstoker."Orange Medal" token, awarded by turadg."Like" token, awarded by tycho.tatitscheff."Like" token, awarded by halfsea."Like" token, awarded by hwinkel."Like" token, awarded by igorgatis.
Assigned To
Authored By
tuomaspelkonen, May 26 2011

Description

Just opening up a conversation about how to do this. This would be a really nice feature for deffirential.

From Facebook Tasks:

D255317 fixes (or at least should fix) the latest branch, which means we can get a working build out for testing.

Right now though, we need to get it reviewed, committed, and requested for merging in to latest.

I'd like to be able to request a diff to be merged, prior to it actually being committed. This is a valid thing to do as long as we're not using shadow branches in production (and only for testing), though I guess that might be debatable.

Details

Differential Revisions
D7534: Add support for landing to hosted Mercurial repos.
D7555: Land to GitHub + support stuff
D7486: Land Revision button for hosted git repos
Commits
D14350 / rP1c7443f8f242: Make "Land Revision" button state consistent, prevent non-accepted lands
D14349 / rPa763f9510e76: Add some Drydock documentation plus "Test Configuration" for repository…
D14348 / rPcea633f698da: Don't show error operations after a successful land operation
D14347 / rPbbf4ce79e341: Provide a username and email when running `git merge --squash`
D14346 / rP5a35dd233b19: Don't use --ff-only inside "Land Revision"
D14343 / rP0b24a6e2006e: Make "Land Revision" show merge conflicts more clearly
D14341 / rP2326d5f8d090: Show lease on Repository Operation detail view and awaken on failures
D14338 / rPa0fba642b382: Show the oldest non-failing revision land operation, or the newest failure
D14337 / rP9c3949379617: Make WorkingCopyBlueprint responsible for performing merges
D14314 / rP5b619862cb06: Show a more reasonable status element for pull requests
D14295 / rP92a626fc1ca3: Add a basic list view for repository operations
D14280 / rPf3f3d957021b: When landing revisions via repository automation, use better metadata
D14270 / rP43bee4562c72: If the stars align, make "Land Revision" kind of work
D14266 / rPb4af57ec51e1: Rough cut of DrydockRepositoryOperation
D14259 / rPdf5a031b54a5: Allow "Repository Automation" to be configured for repositories
D14258 / rP3f3626c11a04: Write some documentation about Drydock security and repository automation
D13022 / rP08a9e0f22aff: Hide Land to GitHub
D8719 / rPe8e62f82ceae: Hide "Land to hosted git" button for now

Related Objects

StatusAssignedTask
OpenNone
Openepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Openepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
OpenNone
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
ResolvedNone
Resolvedepriestley
Resolvedepriestley
There are a very large number of changes, so older changes are hidden. Show Older Changes
qgil removed a subscriber: qgil.Oct 7 2015, 3:21 PM
epriestley moved this task from Preflight to The Queue on the Prioritized board.Oct 10 2015, 1:30 PM

Blueprint selection (which got more details in T9519) is now in a rough-but-functional state so I expect the rest of this to move forward shortly.

General progress update here: this now works, technically. Almost every part of it -- particularly, the UI -- is still total garbage for now, but if you ship a revision against rGITTEST up to this server you'll get a "Land Revision" button on it which has worked twice in the past and might work again. I would recommend extending the list of pasta varieties found in the pasta file if you want to try this (the button should appear and function without requiring the revision to be accepted for now, and everyone has write access to the repo anyway).

Although the UI is a mess, the underlying fundamentals are scalable: because this process leverages Drydock, it could perform two (maybe three!) pulls at the same time if you had enough hardware.

This stuff is done or mostly done or otherwise in good shape:

  • Blueprint selection needs a little more UI work but feels like it's headed in a good direction and generally in a good place already. T9519 has details.
  • Diffusion now has a "Repository Automation" section which allows blueprint selection. This seems fine for v1.
  • Merge requests exist as objects (DrydockRepositoryOperation) and about half the support code is in reasonable shape.
  • Drydock is generally holding up and beginning to feel like an actual piece of useful infrastructure.
  • And the thing does technically work in a reasonable subset of cases today.

This stuff needs significant additional work:

  • DrydockRepositoryOperation is still missing some fundamental interfaces (like a list view), and some major concepts (like making sure patches apply sequentially and probably a command queue) are undeveloped or undeveloped.
  • UI is a huge pile of trash, needs a bunch of work. Particularly, feedback about progress and outcomes is exceptionally poor.
  • The actual merge mechanics are OK, but have three major remaining areas which require work:
    • Mechanics live in DrydockRepositoryOperation, but probably need to live in WorkingCopyBlueprint (or otherwise be accessible from both places). The code could move easily in a purely technical sense, but picking up merge errors from WorkingCopyBlueprint is less easy.
    • We aren't trying to deal with any of the hard cases yet (like revisions not based on the target branch). Worse, we will happily do the wrong thing, which is dangerous.
    • Process is not trusted, and I think getting a chain of custody which covers "what you see is what will land" will be hard in the general case.
  • Lots of hard coding of things like "master" being the target branch.
  • General lack of configuration and workflow/application logic (e.g., no way to only allow accepted revisions to land).

We had a very slow week and haven't made much progress here. There is a new list view and some new, more reasonable UI. We landed our first production change with this mechanism.

The next pieces don't have obvious or unambiguous paths forward yet:

  • We have a hardcoded one-resource-per-blueprint limit we need to lift, but doing this in a way that is too simple may make things harder once the mechanism expands in the future.
  • We need to store results from operations so we can tell the user what happened (or what failed) but don't have a great place to put them right now, and underarchitecting this may also make things harder in the future.

I think these are probably both straightforward once I figure out how to proceed, they just need some thought about what the medium and long terms versions of these features will look like.

I think we should disable the "Land Revision" action item if the revision has already been landed.

The action is always available right now because it makes testing (particularly, testing things like "does it do the right thing when the revision is already landed") easier, but I'm going to require "Accepted" for v0 of this, which will indirectly disable it for landed ("Closed") revisions. We'll probably relax that later but it seems like the right place to start from, at least.

This is in reasonably functional shape for ambitious users. There's some documentation now:

We've landed most of the recent upstream changes with this mechanism. These specific things have been improved:

  • DrydockRepositoryOperation UI is in fairly reasonable shape now and provides critical progress feedback through the use of animation.
  • Merge mechanics are in WorkingCopy now.
  • Merge mechanics are less bad, but still a bit bad.
  • Limits and error handling are in better shape.
  • Only accepted revisions can land, and some other workflow sanity checks now exist.

This stuff remains:

  • Hard-coded target of master.
  • Hard cases (revisions with extra history / need rebase) aren't handled yet and may do the wrong thing.
  • Workflow does not have a chain of custody yet, and what lands can still be different from what was reviewed.
  • Probably plenty of other rough edges that we'll hit over time.

If you try setting this up, I'm interested in feedback about problems you run into or which missing features are the most important for your use case.

BYK added a subscriber: BYK.Oct 28 2015, 1:19 PM

Because this works from the staging repository, it will be possible to trick it right? I mean, I could do this to sneakily commit virus.exe to the repository:

git add virus.exe
git commit -a
# Make a legitimate change
git commit -a
arc diff HEAD~

Yes, that's what this refers to:

Workflow does not have a chain of custody yet, and what lands can still be different from what was reviewed.

Another way to do it is:

  • Comment out the staging push in arc land.
  • arc diff normally.
  • Push arbitrary, wholly different changes to the staging area.

I can't remember which ticket, but I remember a similar discussion regarding arc patch (whether arc patch should patch onto HEAD or pull from the staging area).

Do you have any ideas on how to improve this issue?

I think the discussion is in T8090.

There are two cases here:

  1. The changes to be landed are on top of other changes (as in your example, where virus.exe is snuck in underneath).
  2. The changes to be landed are wholly different from the changes to be reviewed (as in my example, where a malicious user sent a good change for review and pushed virus.exe for merge).

For (1), we already know exactly which commit hashes we expect to land, so the merging work in the LandWorkflow will just get stricter and reject merges which would bring in other commits.

This will create a mild usability issue: if you diff A and then diff B on top of it, and then land A, it would be nice to be able to land B without rebasing it first, but you won't be able to in a mutable repository because the version of A on B will differ from the version of A on master.

For now, you'll just have to rebase. Later, we can refine this by keeping track of prior lands, so when we land commit X (diff A) as commit Q, and then see commit X in the history of B, we can recognize that it's equivalent to Q and just remove both commits from the histories before we check if the merge is exact. This will take some work but we don't need to get it right 100% of the time, since there will always be some cases where we hit conflicts and users legitimately need to rebase anyway.

For (2), I think we need to read the diff out of the staging area instead of having arc submit it. This is going to be pretty messy, but I think it's doable. I'm primarily worried that the client wait may be unreasonably long (I don't see a way to avoid doing a blocking, synchronous read of the pushed commit from arc), but hopefully we can get that down into an acceptable range.

In the long run, with hosted repositories and more flexibility around virtualizing refs (like T5000 and T8092) we can do much better here, and we can do something reasonable today for any hosted repository by generating a diff as a side effect in the commit hook if the push touches a diff tag, but I don't see a super clean way to do this with tracked, external staging areas.

The not-so-clean flow, particularly, is:

  • Push to staging area.
  • Call a new differential.createdifffromstagingarea() Conduit method.
  • This method blocks until the diff is read.

Internally, it has to do something like start a Drydock operation to read the diff, then wait until the operation completes.

One possible alternative is for the import pipeline to start recognizing "magic" refs and doing things as a side effect of import, and to put similar logic on the hook pipeline for hosted repositories. They can theoretically share some code. The workflow as a whole would look the same, except arc would be responsible for polling Conduit to wait until the diff shows up.

All the policy, merge strategy, etc., stuff is hard-coded to the easiest set of reasonable defaults. Some of these defaults may not be suitable for many installs (e.g., always rebase, always merge to master, require edit capability for imported repositories in order to perform merges). I expect to expand these shortly, I just don't want to focus on making things more flexible until they actually work.

Are you planning a fully specifiable merge command so we can run anything we want for merging?

We make heavy use of submodules at my work, such that every change is part of a submodule change. git refuses to merge a super-project if a submodule changed in both branches, even if the changes to the submodule would merge cleanly. We have extended git with a custom command so running git rmerge <branch> from the super project will go through submodules that changed in both branches and merge each of them. If we can fully specify the merge command itself then we can extend git on the phabricator install with our custom rmerge command and invoke it. I realize we could dig into the phabricator source and change anything we want but I'd prefer to stick with the upstream version if possible.

Are you planning a fully specifiable merge command so we can run anything we want for merging?

There will be more flexibility in the future, but I don't anticipate making the merge command configurable, per se (principally, the merge is more complex than a single command, and will become increasingly complex in the future).

You can create a new blueprint which provides WorkingCopy resources and merges in a different way, then use that blueprint type instead of the vanilla WorkingCopy blueprint. However, this will probably be a lot of work to maintain today. Once blueprints stabilize more it may be more realistic.

@epriestley I'll keep an eye on blueprints and the merge configurability. Thank you. Phabricator is awesome/and getting awesomer.

Treri added a subscriber: Treri.Jan 12 2016, 5:57 AM
epriestley moved this task from The Queue to Paused on the Prioritized board.Feb 7 2016, 4:16 AM
eadler added a project: Restricted Project.Feb 23 2016, 7:14 AM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Feb 23 2016, 7:18 AM
thoughtpolice moved this task from Backlog to Details on the Haskell.org board.
kaya added a subscriber: kaya.May 27 2016, 11:18 PM

Just bookkeeping, this doesn't currently have external priority.

eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jul 4 2016, 9:17 PM
urzds added a subscriber: urzds.Mon, Nov 28, 12:44 PM