Page MenuHomePhabricator

Assign RepositoryIdentity objects to commits
ClosedPublic

Authored by amckinley on May 10 2018, 1:45 AM.
Tags
None
Referenced Files
F14075012: D19443.id46509.diff
Thu, Nov 21, 9:32 AM
Unknown Object (File)
Wed, Nov 20, 6:44 PM
Unknown Object (File)
Wed, Nov 20, 4:35 PM
Unknown Object (File)
Sat, Nov 16, 1:25 PM
Unknown Object (File)
Tue, Nov 12, 1:24 PM
Unknown Object (File)
Tue, Nov 12, 6:48 AM
Unknown Object (File)
Fri, Nov 8, 9:00 AM
Unknown Object (File)
Sun, Oct 27, 12:14 PM
Subscribers
Restricted Owners Package

Details

Summary

Depends on D19429. Depends on D19423. Ref T12164. This creates new columns authorIdentityPHID and committerIdentityPHID on commit objects and starts populating them. Also adds the ability to explicitly set an Identity's assignee to "unassigned()" to null out an incorrect auto-assign. Adds more search functionality to identities. Also creates a daemon task for handling users adding new email address and attempts to associate unclaimed identities.

Test Plan

Imported some repos, watched new columns get populated. Added a new email address for a previous commit, saw daemon job run and assign the identity to the new user. Searched for identities in various and sundry ways.

Event Timeline

Owners added a subscriber: Restricted Owners Package.May 10 2018, 1:45 AM
resources/sql/autopatches/20180509.repo_identity.commits.sql
2–4

I have mixed feelings about DEFAULT NULL. Right now it's required to get the unit tests to pass, because PhabricatorRepositoryDiscoveryEngine doesn't set these fields during discovery (I guess PhabricatorRepositoryCommitMessageParserWorker->updateCommitData() runs in a later phase?)

Aside from that, I was developing with NOT NULL and it works fine, since updateCommitData() will definitely have a handle to the identity.

src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php
112

Not sure if we need to put this in the metadata too, but it seemed reasonable.

whoops, wrong base revision

For committer (vs author), I think DEFAULT NULL is definitely right because Mercurial and Subversion commits don't have a committer as separate from the author, so the only sensible value for this field is null.

For author, the discovery/parse phases are basically:

  • Discovery: create objects as quickly as possible. Can not be parallelized.
  • Various parsing phases: can be parallelized. Actually parse the commit.

Discovery creates a bare-bones placeholder because it must run serially, so it's theoretically the slow part if you have enough daemons. If we don't run it serially, we'd need to do a lot of work to make sure that one or more workers exiting at some random time can't leave any "holes" in the repository where some commit is forgotten about. The current discovery algorithm resists disruption, lightning strikes, etc., and converges toward getting exactly one Commit object into the database for every commit in the repository.

In practice, Discovery is currently very fast and parsing is relatively slower, and we're rarely actually blocked on discovery. We could reasonably do more work in discovery, and pulling identities might make sense. We do already pull epoch during discovery, even though it isn't strictly required, but it's "almost free" to do in every VCS.

But, at least for now, I think DEFAULT NULL ends up simpler and it's sort of nice to have it be consistent with comitterIdentity, so I'd just keep it like this rather than trying to do more work in discovery. The UI is generally good about identifying these commits as "still importing" / "parsing and incomplete", etc., so users shouldn't be surprised by the behavior the null creates.

src/applications/diffusion/controller/DiffusionIdentityViewController.php
127

I think there is, but it's super tricky. Something like you instantiate a copy of the Datasource, then set the value, render wire tokens from it, rebuild handles out of those wire tokens, then render tags out of the handles? There's a comment somewhere like "make this easier". The approach you've taken here is quite reasonable.

129

Just for UI consistency, do something like:

phutil_tag('em', array(), pht('Explicitly Unassigned'));
src/applications/diffusion/typeahead/DiffusionIdentityAssigneeDatasource.php
17

viewer() probably doesn't work well here. :)

src/applications/people/editor/PhabricatorUserEditor.php
426

Ideally, pass the user PHID as the objectPHID in the third parameter too -- I think like this:

array(
  'objectPHID' => $user->getPHID(),
)

That makes it a little easier to hunt down tasks by querying the table in some unusual cases, since we can find tasks "about object X" by querying on this column, but it tends to be a mess to try to find tasks by LIKE'ing the JSON blob.

src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php
112

I think it won't be necessary, since I believe we always have a Commit whenever we have CommitData. But it also can't really hurt anything.

src/applications/repository/xaction/PhabricatorRepositoryIdentityAssignTransaction.php
58

I think you can just do if ($new === ThingeySource::FUNCTION) { continue; } a little above.

This revision is now accepted and ready to land.May 10 2018, 3:34 PM

Requested changes, fix formatting for transactions involving the "Explicitly Unassigned" state.

This revision was automatically updated to reflect the committed changes.