The [[ ... ]] syntax accepts and handles characters which would require URL encoding if they appeared in URIs. For example, [[ 100% Natural Cheese Dust ]] is a legitimate, supported piece of remarkup syntax, and does not need to be written as ... 100%25 Natural ....
Likewise, [[ BUY $DOGE ]] is legitimate and does not need to be written as [[ BUY %24DOGE ]]. This piece of syntax creates a link to /w/buy_$doge/. This may or may not appear in your browser's URL bar as /w/buy_%24doge/, but internally "$" is a valid slug character and you'll see buy_$doge over Conduit, etc.
However, since users may reasonably copy paths from their browser URL bar, they may have unnecessary URL encoding. The syntax [[ buy_$doge ]] is legitimate, but a user copy/pasting may write [[ buy_%24doge ]] instead.
Currently, this extra URL encoding causes links to break, since [[ buy_%24doge ]] is a treated as link to /w/buy_24doge/, just like [[ Fresh 100%AB Blood ]] is a link to /w/fresh_100_ab_blood/.
To fix this:
- When the target for a link can be URL decoded, try to do lookups on both the un-decoded and decoded variations.
- If the un-decoded variation works, great: use it. This preserves behavior for all existing, working links.
- If the un-decoded variation fails but the decoded variation works, okay: we'll assume you copy-pasted a URL-encoded version and strip URL encoding.
- If both fail, same behavior as before.
Also, use a different spelling for "existent".
See T13084 for some "attacks" based on this behavior. I think the usability affordance this behavior provides greatly outweighs the very mild threat those attacks represent.
- Created links to existing, nonexisting, and existing-but-not-visible documents, all of which worked normally.
- Created links to [[ $doge ]] and [[ %24doge ]], saw them both go to the right place.
- Performed the "attacks" in T13084.
Differential Revision: https://secure.phabricator.com/D19106