Page MenuHomePhabricator

make repo callsigns optional

Authored by fabe on Dec 29 2014, 3:32 PM.
Referenced Files
Unknown Object (File)
Thu, Jun 30, 7:45 PM
Unknown Object (File)
Wed, Jun 29, 12:58 PM
Unknown Object (File)
Mon, Jun 27, 8:11 AM
Unknown Object (File)
Sat, Jun 25, 6:40 PM
Unknown Object (File)
Sat, Jun 25, 7:55 AM
Unknown Object (File)
Sat, Jun 25, 7:55 AM
Unknown Object (File)
Sat, Jun 25, 7:55 AM
Unknown Object (File)
Sat, Jun 25, 7:55 AM


Group Reviewers
Blessed Reviewers
Maniphest Tasks
T4245: Make repository URIs slightly prettier
Restricted Diffusion Commit
rPf33e2de09286: make repo callsigns optional

Ref T4245 Make repo callsigns optional
This is far from done and still very ugly. I'm just submitting it to check if i'm solving this in the right places.
Right now there's three places with duplicate code and building the identifierMap in the CommitQuery is very ugly.
If we only want to support this in the user frontend then i could hack it into the Markup rule itself and not touch the CommitQuery. Even uglier but more limited in scope...

Generally this approach will need a lot of "check this first and then try the other" in a few places.
I could move the Repository queries into a specialised PhabricatorRepositoryQuery method (withCallsignOrID) but i'm not sure about that.

Test Plan
  • phid.lookup works with R1 and rTEST (which is the same repo)
  • R1 and rTEST euqally work in remarkup (tested in comments).
  • Reviewed the following syntax also all works:

rJX Javelin

Diff Detail

rP Phabricator
Lint Not Applicable
Tests Not Applicable

Event Timeline

fabe retitled this revision from to make repo callsigns optional.
fabe updated this object.
fabe edited the test plan for this revision. (Show Details)
epriestley added a reviewer: epriestley.

Yep, this is on the right track.

I could move the Repository queries into a specialised PhabricatorRepositoryQuery method (withCallsignOrID) but i'm not sure about that.

Yeah, I think this is a good idea. It would have the added advantage of letting us issue one query instead of two in the cases where there's both a callsign and an ID (these will probably be very rare, but maybe not).

I wouldn't worry too much about deduplicating the regexps. If there are cases where it's easy, go for it, but at least with other rules they tend to be matching slightly different things or matching in slightly different contexts, and we sometimes end up needing slightly different rules (for example, IIRC #project. is a valid project hashtag, but if you type #project. in a comment we assume you meant #project + ., because 99.9% of the time that's what users meant, so the two places use slightly different regexps. T6818 has a similar case where we'll adjust the regexp used in one context but not in the other).

For the projects stuff, at least, just getting test coverage (here, in DiffusionCommitRemarkupRuleTestCase) has worked better than trying to make sure we never duplicate regexps.


We should check for if ($callsigns) and if ($ids) before issuing these queries.


Consider ctype_digit() instead of is_numeric because values like "0.0", "1e5", and "-3" satisfy is_numeric(). (I think it doesn't matter in this case, but it's nice to do the stricter check when we have stricter expectations about the input.)

This revision now requires changes to proceed.Dec 29 2014, 4:03 PM
fabe edited edge metadata.

first stage of cleanup. PhabricatorRepositoryQuery now builds an identifierMap (just like the DiffusionCommitQuery supports)
and also allows searching for multiple different kinds of identifiers at the same time (ids, phid, callsigns)
building the identifierMap in CommitQuery is still a mess and work in progress

fabe edited edge metadata.

api actually was that way all along

clean up CommitQuery identifierMap generation.
I tried to refactor it completely but was unable to achieve O(n) in any other way as it is.

found out about non capturing syntax for regexes and fixed the unit test :)

added a few more test cases for the new repos without callsigns

epriestley edited edge metadata.

Some inlines, this is looking really good.

One thought is that ideally we shouldn't recognize r123, because this is commonly used in Subversion to refer to a specific commit, and I worry it will be confusing for subversion users if we treat it as a repository reference. We can recognize R123 instead. I think this is just a matter of tweaking the regular expressions, though.


Prefer $commit_identifier, for consistency.

This is actually a bug with lint, which should have caught this automatically. D11087 should have a fix for it.


On the r vs R thing, this would just become something like /^(?:r[A-Z]+|R[0-9]+)$/, I think.


It looks like removing this code might incorrectly remove the "r" from the result map?


I think we should either disallow this "r" stripping, or preserve it in getIdentifierMap(). That is, if you make a call like:


...and the query finds a matching result, the key r1 (i.e., the exact key you passed in) should always be present in the identifier map. Otherwise you have to trim the "r" off yourself to figure out which result matches, and you might as well do that in the first place.


You can use this to identify repository PHIDs:

$repository_type = PhabricatorRepositoryRepositoryPHIDType::TYPECONST;
if (phid_get_type($identifier) === $repository_type) {
  // ...

(It's not perfect, but should always behave in reasonable/expected ways.)


I think it might be cleaner to not reuse the ids and phids property to construct this clause. One issue with the current approach is that this:

  ->withIdentifiers(array('1', 'X'))
  ->withPHIDs(array('the-PHID-of-repository-1', 'the-phid-of-repository-Q'))

...will incorrectly select repositories r1, rX, and rQ, when it should select only repository r1.

You could just define separate properties like:

public function withIdentifiers(array $identifiers) {
  // ... split them up ...
  $this->numericIdentifiers = $numbery_ones;
  $this->callsignIdentifiers = $callsigney_ones;

...and then do this:

if ($this->numericIdentifiers || $this->callsignIdentifiers) {
  $identifier_clause = array();
  if ($this->numericIdentifiers) {
    $identifier_cause[] = qsprintf(...);
  if ($this->callsignIdentifiers) {
    $identifier_clause[] = qsprinf(...);
  $where[] = '('.implode(') OR (', $identifier_clause).')';

Then withIdentifiers() will interact as expected with calls separate calls to withIDs(), withPHIDs(), and withCallsigns().

This revision now requires changes to proceed.Dec 30 2014, 3:21 PM
fabe edited edge metadata.

PhabricatorRepositoryQuery now disallows leading 'r' and classes using it have to strip / attach it themselves. Preserving it in the identifierMap leads to much worse problems.
Refactored the Query to use it's own private variables and added phid support.
I'll change the r1 rTEST vs. R1 rTEST next

sry, i'll have to do this tomorrow. Somehow the RepositoryRemarkupRule doesen't allow me not to specify a word prefix (just like in the commit Remarkup rule)...

fabe edited edge metadata.

now only allow R123 R123:commit rTEST rTEST:1234
lowercase r for callsigns and uppercase for IDs


This somehow only works when inserted at the end of this array.
I was unable to do this with only one remarkup rule and i've got no idea why this now works when placed here.
Are remarkup rules applied here when another one already matched?

Let me see if I can figure out the order thing, everything else looks good to me.


Hmm, I'm surprised that these are order-dependent. Using two separate rules seems reasonable to me, though. Let me poke around at this locally...


This should be %Ld (list of integers) vs %Ls (list of strings).

epriestley edited edge metadata.

Ah -- the order thing is because the rules internally get sorted by getPriority(), and the sort isn't stable, so they're kind of ending up executing in "random" order.

After sorting, DiffusionRepositoryRemarkupRule happens to end up after DiffusionCommitRemarkupRule, but DiffusionRepositoryByIDRemarkupRule does not unless you put it at the end of the array and get lucky.

Both "Repository" and "RepositoryByID" should run after "Commit". If they don't, they consume the rAB or R12 prefix and leave an internal token in its place, so the commit rule can no longer match.

To enforce this, just implement getPriority() on both rules and have it return 460.0;. This will make sure they run after the Commit rule, which runs at priority 450.


We should initialize the identifier map in loadPage(), before executing the query:

if ($this->identifierMap === null) {
  $this->identifierMap = array();

Two reasons:

  • willFilterPage() may get called multiple times in one query. For example, if the caller requests 100 results, and we load 100 rows but then privacy-filter 50 of them, we'll go load another 100 results to complete the request.
  • If the query returns 0 results, willFilterPage() won't get called at all, so getIdentifierMap() will throw an exception ("you must call execute()") instead of returning an empty array.
This revision now requires changes to proceed.Jan 1 2015, 3:24 PM
fabe edited edge metadata.
  • enforce order of markup rules by setting priority (now works as expected)
  • initialize identifierMap only once loadPage() instead of willFilterPage()
  • %Ld instead of %Ls
fabe edited edge metadata.

Thanks for tracking this down!

This revision is now accepted and ready to land.Jan 1 2015, 4:07 PM
This revision was automatically updated to reflect the committed changes.