Fixes T5378. Allow object references to optionally include the full URI. This means that http://phabricator.example.com/T123 is rendered equivalently to T123.
Details
- Reviewers
joshuaspence - Group Reviewers
Blessed Reviewers - Maniphest Tasks
- T5378: Phabricator full task URLs for that instance do not provide hover/popup text (like T1234 does)
= These work as expected, rendering as objects = D1 {D1} http://phabricator.local/T1 {https://phabricator.local/T1} HTTP://PHABRICATOR.LOCAL/D1 http://phabricator.local/rARC977baacc329ce690d4e5c0304f4c50e352e7644f #foobar = These are not very well defined = http://phabricator.local/T1.html = These work as expected, rendering as hyperlinks = http://phabricator.local/ http://phabricator.local/xyz http://phabricator.local/xyz/D1 http://phabricator.local/T1xyz http://phabricator.local/?id=T1&key=value http://secure.phabricator.com/T1 http://phabricator.local/D1/D2 = These don't work = '#foobar' http://phabricator.local/#foobar http://phabricator.local/tag/foobar http://phabricator.local/project/view/1/ http://phabricator.local/diffusion/TEST/
Diff Detail
- Repository
- rP Phabricator
- Branch
- master
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 5731 Build 5750: [Placeholder Plan] Wait for 30 Seconds
Event Timeline
Ideally, it would be nice to run this rule before the URI rule so we don't have to do the URI exception stuff. Particularly, if you type http://install.com/path/to/something/ (that is, an install URI but not a monogram), this approach as written will leave it unlinked.
You can run this rule first by adjusting getPriority() to a number smaller than the hyperlink priority. This may have the undesirable side effect of causing us to start matching monograms in URIs (or in other weird contexts, maybe), but perhaps that isn't too big of an issue and we can avoid it by being more careful with the regexps.
If that isn't fruitful, my next best idea is to let rules reach into the matched hyperlink list and update the tokens. We don't have any rules like this right now, but theoretically it should be possible to do something like:
$uri_rule = $this->getEngine()->getTheRuleThatMatchesLinks(); foreach ($uri_rule->getMatchedURIs() as $key => $uri) { if (the $uri is on this install) { $uri_rule->useThisTokenInsteadOfTheDefault($key, $pretty_monogram_link); } }
This would probably make hard for us to get {http://.../} to work, but seems conceptually cleaner if we're willing to give that up (I think it's reasonable for it to not work, but I also think it's reasonable to consider it important for it to work).
// These work as expected. D1 http://phabricator.local/ http://phabricator.local/D1 http://phabricator.local/foobar http://phabricator.local/T1xyz // These don't work. http://phabricator.local/D1/D2 http://phabricator.local/T1.html http://phabricator.local/?id=T1&key=value
src/infrastructure/markup/rule/PhabricatorObjectRemarkupRule.php | ||
---|---|---|
175 | I suspect we need to match "a bunch of stuff that looks like it's probably a URI here", and then check if it's a URI we care about in the callback, so that these work:
Probably something like "some non-whitespace stuff", then "://", then "some more stuff with no whitespace", then "a slash"? |
// These work as expected. D1 http://phabricator.local/ http://phabricator.local/D1 http://phabricator.local/xyz http://phabricator.local/xyz/D1 http://phabricator.local/T1xyz http://phabricator.local/?id=T1&key=value http://secure.phabricator.com/T1 https://phabricator.local/D1 HTTP://PHABRICATOR.LOCAL/D1 // These don't work. http://phabricator.local/D1/D2 (specifically D1) http://phabricator.local/T1.html
src/infrastructure/markup/rule/PhabricatorObjectRemarkupRule.php | ||
---|---|---|
289 | Err, ignore this :P |
http://phabricator.local/T1.html is going to be difficult to get right because it could be used as follows:
http://phabricator.local/T1.this is a new sentence.
I think it's correct/reasonable that http://blah/T1.blah doesn't work -- it wouldn't work as a normal URI either.
Ah, this also causes a unit test to fail (I noticed this whilst submitting D12694):
FAIL ProjectRemarkupRuleTestCase::testProjectObjectRemarkup Assertion failed, expected values to be equal (at ProjectRemarkupRuleTestCase.php:119): I like #ducks. Expected vs Actual Output Diff --- Old Value +++ New Value @@ -1,18 +1,19 @@ Array ( [embed] => Array ( ) [ref] => Array ( [0] => Array ( - [offset] => 8 - [id] => ducks + [offset] => -1 + [id] => + [tail] => ducks ) ) )
// These work as expected. D1 {D1} http://phabricator.local/ http://phabricator.local/D1 http://phabricator.local/xyz http://phabricator.local/xyz/D1 http://phabricator.local/T1xyz http://phabricator.local/?id=T1&key=value http://secure.phabricator.com/T1 https://phabricator.local/D1 HTTP://PHABRICATOR.LOCAL/D1 http://phabricator.local/T1.html {http://phabricator.local/D1} I like #foobar. // These don't work. http://phabricator.local/D1/D2 (specifically D1) I like '#foobar'.
What are your thoughts on this?I think that this approach is worthwhile and that we can probably fix the remaining edge cases.
On another note... Would landing this generate mentions retroactively or would it only generate mentions for new comments?
Yeah, this looks like it's going to work well to me.
This won't generate mentions retroactively, only for new comments.
One thing that might make this easier is a way to easily resolve URI to monogram. Maybe we could achieve this by adding a monogram.query Conduit method (somewhat similar to the phid.lookup and phid.query methods that we have currently.
What problem are you running into which would be solved by a URI -> monogram mapping? It looked like this is basically in a working state? Or is there more stuff that came up which turns out to be harder?
I think that it is maybe wrong to assume that http://phabricator.example.com/$x is the URL for an object with the monogram of $x. I guess this works for most cases, but it fails for project hashtags. That is http://phabricator.example.com/#some_project shouldn't be parsed as being equivalent to #some_project. Really, I guess this is only a problem for URLs which are not very meaningful anyway (e.g. I don't expect that anyone would ever link to http://phabricator.example.com/#some_project).
Additionally, have a convenient way to map URLs to objects would allow us to do the following:
- Map http://phabricator.local/project/view/1/ to #some_project.
- Map http://phabricator.local/tag/some_project to #some_project.
- Map http://phabricator.local/diffusion/TEST to rTEST.
- Maybe be more complex things like mapping http://phabricator/config/blah to some nav sequence.
Yeah, I think those cases aren't very important, and could probably be handled by adding a few special rules later on.
@epriestley, this definitely needs some more work in its current form... but what do you think about my revised approach?
Oh, sorry: I think this is totally reasonable and we should definitely move forward with it. The only thing I'd like to adjust before landing it is supporting all URIs in phabricator.allowed-uris, rather than only phabricator.base-uri.
(I think it's also possible to run remarkup without having base-uri set, so we might need to test it for null.)
Sure, but I personally still prefer my latest approach on this. One big reason is that it handles http://phabricator.local/T1?workflow=create (I don't think my previous attempt would handle this case).
Oh, maybe I'm misunderstanding -- is this diff not the revised approach? What's the revised approach?
Roughly, the idea here is that if something looks like a URL, then take the path of the URL and try to process it as an object reference. If it doesn't match an object reference pattern, then render it as a URL (by proxying to the PhutilRemarkupHyperlinkRule rule).
src/infrastructure/markup/rule/PhabricatorObjectRemarkupRule.php | ||
---|---|---|
11 | Oh this isn't used... I was planning to add strings which had been matched as a URL (but then rejected because the URL didn't match an object) so that we could reject them in the markupObjectEmbed and markupObjectReference methods. But I don't think that we need to do this. | |
152–153 | We could maybe add markupObjectHyperlinkEmbed(), but I don't think it would be used very often as the users who are more likely to use hyperlinks instead of object references probably don't know about the embed syntax anyway. | |
208 | This was copied from PhutilRemarkupHyperlinkRule. | |
319–325 | This is a little odd and I wanted to just do $this->apply($path) unconditionally... but the issue was that I couldn't return whether $path had actually been matched by $this->getObjectEmbedPattern() or $this->getObjectReferencePattern(). | |
329–330 | I don't understand why return $rule->apply($matches[0]) doesn't work here. |
src/infrastructure/markup/rule/PhabricatorObjectRemarkupRule.php | ||
---|---|---|
320 | I think a link like http://secure.phabricator.com/T1 should display like this in the grey bubble: [ http://secure.phabricator.com/T1 ] ...not like this: [ T1 ] That is, we should recognize it and link it up, but visually retain the fact that it's a URL reference, not a monogram reference, so the actual bubble text is the URL, not just the monogram. This is probably a little worse for new/lazy users, but gets us most of the actual core value (mentions, hovercards, etc) and I think it's better for experienced users (specifically, it can be frustrating when you type text which includes a URL and your meaning is altered by summarization of the token, and retraining the URL avoids that). Generally, this reduces the frustration the rule can cause in cases where we get things wrong. | |
327–330 | Do we need to do this at all? Can we just return $matches[0] and let the other rule pick it up later? |
src/infrastructure/markup/rule/PhabricatorObjectRemarkupRule.php | ||
---|---|---|
208 | If you prefer less magic, you can remove MAGIC_BYTE from the regex -- matching strings which contain a MAGIC_BYTE will be rejected by isFlatText() for not being flat text. The MAGIC_BYTE stuff predates the concept of "flat text". But this is also fine as-is. |
src/infrastructure/markup/rule/PhabricatorObjectRemarkupRule.php | ||
---|---|---|
320 | The idea is that the domain checking happens a few lines before this: if (strtolower($uri->getDomain()) != strtolower($base_uri->getDomain())) { return $matches[0]; } | |
327–330 | I don't think we can do that... my concern was that the following would happen:
The intention was to short-circuit this straight to PhutilRemarkupHyperlinkRule. |
Oh -- you're trying to avoid the (?<!=|:#/()<>) negative lookbehind?
I'm not sure exactly what's going wrong with the proxying, but it won't work in the general case because the Engine never learns about the rule (you tell the rule about the engine, but not the engine about the rule). Later in execution, the Engine iterates over all the rules it is running and calls methods on them (like didMarkupText()). Although I don't immediately see exactly what's going wrong (e.g., Hyperlinks do not depend on having didMarkupText() run, I don't think) there are a class of cases like this where something does happen in those methods, and where proxying won't work.
Maybe it's easier to markup object embeds first, then do references and URIs in a single pass? Basically run the rule:
'/(match-uri)|(match-reference)/'
...instead of running the rules separately. Then figure out which one you matched in the callback. If it's a URI, do what you do now and return $matches[0] if you fail. If it's a reference, it skips all the URI stuff and just goes down the normal path. That should make sure text which matched the URI pattern doesn't have a chance to get examined by the ref pattern.
Or a broader negative lookbehind like (?<!\S{3,}://.*) might work too.
@joshuaspence I'm wondering if you had any plans to do work on this since a number of our users are using full URLs :/
I think I have a "working" change for this:
https://secure.phabricator.com/differential/diff/42106/
...but I'm not really satisfied with the approach. Particularly, it won't generalize to objects where the URI pattern and the monogram pattern differ. It would be nice for any approach here to have the potential to annotate any URI, even if it's something like /diffusion/browse/path/to/file.txt.
I'm currently thinking this needs to be three separate passes (the two existing passes, plus one new pass), with the first pass decoupling URI annotation from monogram matching:
- One pass to extract {self-uri} and self-uri, which needs to be able to delegate most of its rendering into PhabricatorObjectRemarkupRule (this is the tricky part, I guess).
- One pass to extract normal URIs (essentially unchanged from current code).
- One pass to extract {monogram} and monogram (essentially unchanged from current code).