Page MenuHomePhabricator

Rebuild remarkup image proxying
Closed, ResolvedPublic

Description

Long ago, we had image proxying in remarkup (see T2000 / D3908). We removed it since it was a complicated mess and the implementation was far from good enough, and it was much easier to delete than fix.

We have somewhat better infrastructure now and have seen more demand for this. To restore the feature without the issues in T2000, we can:

  • Emit a link with JS metadata from remarkup.
  • Enrich the link on the client, like we do with Doorkeeper tags.
  • Use security.allow-outbound-http to control availability, without needing to add new config.

This will be a little bit flickery in some cases, but let us handle all the non-200 cases without doing huge blocking waits.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

When we build this, we should use the new rate limiting stuff to prevent this junk (Google Docs has had similar issues in the past):

http://chr13.com/2014/04/20/using-facebook-notes-to-ddos-any-website/

Basically, HTTP request amplification by making a comment with a thousand images in it. The amplification wouldn't be as bad as the Notes/Docs cases, but you could essentially amplify using other users' browsers by posting your comment somewhere where a lot of people would see it.

qgil added a subscriber: qgil.Apr 30 2014, 5:18 AM

◀ Merged tasks: T5139.

swisspol added a comment.EditedMay 21 2014, 5:23 PM

Why is image proxying needed (which I assume means transferring the image data from the other server through the Phabricator server) vs just allowing to embedding an image URL that points to another server?

A bunch of stuff, like:

  • Referrer leaks.
  • Users tracking other users.
  • Images being removed from remote servers at some later time.
  • Images changing on remote servers at some later time.
  • Creating load on remote servers.
  • Dealing with 404's and other errors in a user-friendly way instead of with a broken image.
  • Performance.

If you just want the ability for any user to add arbitrary <img /> tags and don't care about any of the above, drop a class like P1129 into phabricator/src/extensions/ and restart your webserver to extend remarkup. That example adds <code>, but you should be able to add <img> or similar easily enough by tweaking it.

The above concerns are collectively substantive enough (particularly, request tracking) that I don't want to put an unproxied bare tag in the upstream.

Makes total sense and thanks for the pointer!

I got the extension in place, but I think there's an override problem:

<img>http://images.apple.com/home/images/promo_macbookair.jpg</img>

Results in:

It seems the http:// is marked up first before the extension fires. Anyway to change this?

Override getPriority() and set it to a lower number than 300, e.g.:

public function getPriority() {
  return 200.0;
}

That should make it run before the normal link rule.

Thanks, I got to work and add support for <img> in remarkup: P1158.

@lpriestley, if this sounds like something reasonable to you, let me know and I can walk you through it in more detail.

kaya added a subscriber: kaya.Dec 30 2015, 9:19 PM

I have a more restricted use case, which is to proxy images from Phabricator-hosted Git repositories. For example, I'd like README.md in a repo to be able to reference images in the repo. without having to upload them as files.

@jshirley, here's the issue we spoke about earlier.

epriestley reassigned this task from epriestley to jcox.Aug 25 2016, 4:25 PM

The general shape of what we're trying to do here is that when a user writes the URL of an image in a comment (e.g., http://funny.lol/silly-dog.jpg, perhaps with some magic *$!symbols$*!) it would be nice to just show the image. Today, they'd have to download and re-upload it, which is somewhat cumbersome.

We had this feature in the past, but it was implemented in a questionable way that had not-so-gerat performance implications and a number of questionable security and user-facing behaviors, and was ultimately removed until we could build it more properly. Then, we never did.

Here's a pathway to rebuilding a more robust version:

  • Proxying endpoint: build an endpoint which proxies images from remote servers and makes an effort to cover all the security stuff we need to cover.
  • Caching: Add caching to the request, so we re-serve the image we already fetched if possible.
  • Remarkup: Implement a remarkup rule (probably {image ...} to start with) that emits <img src="..." />, pointing at the proxy URL.
  • Fancy remarkup: make the remarkup rule emit some JS instead which can recover from errors by rendering the link as a link instead of a broken image, etc. This is less important with an explicit {image ...} syntax, but was a more frequent issue when we just tried to embed any URL ending in an image suffix.

To build the proxying endpoint:

  • Create a new ImageProxyController in the Files application.
  • Add a route for it to FilesApplication.
  • Run arc liberate src/.
  • Hitting /file/imageproxy/ or whatever should now hit your controller.

In the controller:

  • When this controller "fails", it's OK to just throw an exception for now. Depending on where we go, we may want to have it redirect to an image of a sad animal that says "oh no it broke" instead eventually, since users may not be able to see the exception. We can tailor this behavior later.
  • Fail if security.allow-outbound-http is disabled.
  • Fail if phabricator.show-prototypes is disabled.
  • Read some uri from a URI parameter and validate it with PhabricatorEnv::requireValidRemoteURIForLink().
  • Additionally, require that the protocol is http or https explicitly. You can use PhutilURI for this and examine the implementation of requireValidRemoteURIForLink() as a reference, performing a more narrow check.
    • For now, reject relative URIs without an explicit protocol. We might allow this eventually, but I can't think of anything that's really useful to link to in Phabricator itself, except resources in Diffusion, which we should handle differently.
  • If the URI looks OK, apply rate limits:
    • Limit each viewer to 1,000 requests per hour.
    • Limit each remote domain to 1,000 requests per hour (prevent amplification attacks).
    • You can look at PhabricatorSystemActionEngine::willTakeAction(...) in PhabricatorMacroEditController as a general reference for how to tie into the SystemAction rate limiter.
  • If we pass rate limits, actually make the request using PhabricatorFile::newFromFileDownload().
  • Roughly follow the logic in PhabricatorMacroEditController to destroy the file if it isn't an image.
  • If it is an image, set the author to the viewer for now, then redirect to the file's view URI.

This should let you hit /file/imageproxy/?uri=http://funny.lol/silly-dog.jpg or similar and get the silly dog, provided all those checks pass, and you haven't made 1K requests from your account or to that domain in the last hour.

Get that far and then we can plan through caching and the rest of this stuff.

The vaguely related T5378 is also in the queue somewhere. It may not really intersect with this, but if we do end up trying to auto-embed any URI that ends in .jpg again it probably will. I'm not sure we should, but just a heads up for context.

Can we avoid the need for this entirely using the approach outlined in T11126 (using CSP or <iframe seamless referrerpolicy="no-referrer">)?

I think that only prevents referrer leaking -- it doesn't deal with the other issues in T4190#60441 (or request amplification attacks), right?

I've also never used seamless so I'm not sure how support looks in practice, but this isn't immediately reassuring:

http://caniuse.com/#feat=iframe-seamless

jcox added a comment.Sep 21 2016, 1:12 PM

I've got the endpoint and controller set up, but I get the following error when I call PhabricatorFile::newFromFileDownload:

All storage engines failed to write file:

  • PhabricatorMySQLFileStorageEngine: AphrontMalformedRequestException: You are trying to save some data to Phabricator, but the request your browser made included an incorrect token. Reload the page and try again. You may need to clear your cookies. This was a Web request. This request had no CSRF token. To avoid this error, use phabricator_form() to construct forms. If you are already using phabricator_form(), make sure the form 'action' uses a relative URI (i.e., begins with a '/'). Forms using absolute URIs do not include CSRF tokens, to prevent leaking tokens to external sites. If this page performs writes which do not require CSRF protection (usually, filling caches or logging), you can use AphrontWriteGuard::beginScopedUnguardedWrites() to temporarily bypass CSRF protection while writing. You should use this only for writes which can not be protected with normal CSRF mechanisms. Some UI elements (like PhabricatorActionListView) also have methods which will allow you to render links as forms (like setRenderAsForm(true)).

Is there an easy way to disable CSRF checks for now (or just handle it correctly)?

In this case, use the AphrontWriteGuard::beginScopedUnguardedWrites() method to surround the block -- you can grep for usage, but it will look like this:

$unguarded = AphrontWriteGuard::beginScopedUnguardedWrites();
// Do a safe write: logging, writing a cache, etc.
unset($unguarded);

Writes which are safe to unguard are usually either logging or cache fills, although there are a handful of other cases.

On the cache:

  1. We should cache all results, not just successful results. If a user enters a bad URL (like a 404), we should cache that we got a 404 or other bad/invalid result, so we don't keep making requests every time the user loads the page. If we do, this potentially leads to request amplification attacks by making requests to invalid resources on a victim's server. (This is partially mitigated by rate limiting.) This also probably gives us somewhat better behavior for resources which time out.
  2. Using PhabricatorKeyValueDatabaseCache means that the cache will expire routinely (via PhabricatorCacheGeneralGarbageCollector or bin/cache purge --purge-general by administrators) with no opportunity to run cleanup code, so we'll leave files stranded in storage. This probably isn't a big deal (we will still share the underlying file data blocks), but it would be nice to delete any file data when the cache expires.

Fixing (2) makes this kind of a bigger issue, though. We don't currently have a generic cache which supports destruction behavior, so you have to:

  • Add a new cache storage table.
  • Add a GC for it.
  • (Maybe: add a bin/cache purge flag for it? But some sort of web UI "try again" might be better?)

This is a good example of how to do that stuff though, so I think it's good as an onboarding exercise even if it's a lot of legwork for a small behavioral improvement.

To add a table:

  • Add a new storage/ object which extends PhabricatorFileDAO, like PhabricatorFileExternalRequest or similar.
  • Add a resources/sql/autopatches/ patch to create the table.
  • Add a GC like PhabricatorFileTemporaryGarbageCollector.
  • Just skip administrative purge behavior for now? There's a related task elsewhere with macro thumbnails and I think we might be able to fix them both with one new UI element. For testing, we can just truncate the table.

The actual cache table should have:

  • uri, a text field with no key (URIs may be arbitrarily long).
  • uriIndex, a PhabricatorHash::digestForIndex(...) digest of the URI. You can look at AlmanacProperty and AlmanacProperty->save() for usage.
  • ttl or similar, for when we'll throw the cache away.
  • Probably filePHID (nullable phid? I suppose) since it might be useful to query this from the file detail page ("Is this file a cached result of an external request?")
  • A properties field for all the other details (e.g., did this request work?).
  • Any other top-level fields you can imagine wanting to query by.

The index stuff is a little weird. MySQL won't build indexes over a certain length (IIRC, 768 bytes in InnoDB), so fields like uri which may be arbitrarily long can't have a key on the entire field value. In many cases we can limit the length of the name, but in the case of URIs we can't.

We can put a key on the first part of the field (like KEY `key_name` field(128), to index only the first 128 characters) but this only helps with querying and can't be used to enforce uniqueness. If that key is made UNIQUE, it enforces that the first 128 characters are unique, which would prevent users from accessing both http://longlonglong.com/.../.../.../cat.jpg and .../dog.jpg.

Since this cache should be unique on URI, the solution we employ is to hash the URI and put the unique key on that instead. This feels a little flimsy (like, something MySQL should be able to do for us automatically) and it's possible that there's some better approach, but it's the best approach I've come up with and seems to work well enough in practice.

Implement PhabricatorDestructibleInterface and have the object destroy the associated file when it is destroyed, then have the GC invoke destruction.

From there, we can fine-tune UI/TTL behaviors.

To actually add a remarkup rule, the basics are simple:

  • Subclass PhutilRemarkupRule, putting the new rule in src/applications/files/markup/.
  • Add the rule to PhabricatorFilesApplication->getRemarkupRules() to register it if Files is installed.
  • Then, implement the rule body.

I think there are at least three reasonable rule syntaxes we could use:

  1. {image ...} - Phabricator-style markup function.
  2. ![Alt Text](uri) - Markdown/GitHub-style image embed.
  3. uri.(jpe?g|gif|png) - Try to automatically detect bare URLs.

I think we should start with (1), since I expect we'll definitely want that no matter what. The major advantage is that you can use it to specify additional parameters, like this:

{image url=X, layout=Y, effect=Z, width=U, href=V}

The other syntaxes can't really support these options, so even if we eventually implement them for convenience, I think we should start with an {image ...} syntax as the full-power version.

You can look at {nav ...} (PhabricatorNavigationRemarkupRule) for a rough template of how to do this. I'd expect:

  • Match {image X}.
  • Test if X "looks like" a URI or not.
  • If it does, treat it as a URI.
  • If it does not, parse it with PhutilSimpleOptions.

Testing if it "looks like" a URI is a little tricky. In NavigationRemarkupRule, the current test is "Does it have an '=' sign?". But this isn't necessarily a great test (see T11663), and URIs are more likely to have = signs (thing.com/?a=b&c=d) anyway. A better test might be "does X match /uri\s*=/, i..e "uri <optional whitespace> =", which makes it more likely that we're dealing with parameters. Maybe a 99.99% heuristic is something like:

  • Does it have "://" before the first "="? If so, definitely a URI.
  • Does it have "uri<any whitespace>=" somewhere? If not, definitely a URI.
  • Otherwise, use PhutilSimpleOptions.

For now, we don't need to support any options other than uri (which could maybe have aliases url and src -- and {image ..} could reasonably have alias {img ...} too), but we can add alt, href, and layout/display options later.

After parsing everything, just generate an <img ... /> tag for now. That will make things work, just not perfectly -- primarily, errors will fail fairly silently, generating a broken image without any pathway forward for diagnosis or repair. We can improve that once this works.

(I'll follow up on this once I have a chance to bang on it a bit, but I want to try to finish up the RRULE stuff while it's still somewhat fresh in my mind.)

swisspol removed a subscriber: swisspol.Sep 27 2016, 5:11 PM
nochum added a subscriber: nochum.Nov 10 2016, 2:49 PM

Does the {img } thing work yet? I've been trying to use it locally but have not been able to get it to actually show an image. This would be super useful for us to be able to show the build status on a remarkup page (like the README.MD of our repo, for example). Thanks.

{img https://static-cdn.jtvnw.net/jtv_user_pictures/doritos-profile_image-4328d01909178d94-300x300.png}

(But note that it's cached for a trillion years, so it won't work for build status.)

(But note that it's cached for a trillion years, so it won't work for build status.)

Ugh, yeah I guess won't work for that purpose then. OOC, does it require any sort of config settings? I can't get it to work locally. Thanks!

  • Prototype applications must be enabled.
  • security.outbound-blacklist must not blacklist the target server (build servers may be on an internal network and on the default blacklist).
  • The link protocol must be "http" or "https".
  • You must not have exceeded your fetch quota for the last hour.
  • You must have file storage configured.
  • The image MIME type must be part of files.image-mime-types and files.viewable-mime-types.
  • The Phabricator server must be able to make outbound requests on the network.
J5lx added a subscriber: J5lx.Feb 4 2017, 5:16 PM
epriestley removed jcox as the assignee of this task.Apr 7 2017, 1:03 PM
epriestley moved this task from Intermediate to Backlog on the Contributor Onboarding board.
epriestley added a subscriber: jcox.

What's the status on this right now? Looks like a fun thing to try and get over the finish line.

I think:

  • There was some minor bug with it causing some errors in the log, but I don't think I saved the stack trace anywhere. Maybe keep an eye out.
  • If you write an invalid {img ...} like {img http://localhost:6789/cat.jpg}, you get a server-side exception and a broken image (visible here, depending on browser behavior):

  • This is bad and confusing. Instead, this should either be an image response:
+-------------------------+
|   < sad cat picture >   |
|                         |
| Invalid Image!          |
| This is invalid, sorry. |
+-------------------------+

...or, more likely, an "error" element using text so that we can translate it. Otherwise, we'd have to have a million 404_img_en.png, 404_img_fr.png, etc., or try to build them dynamically which means we need to be able to do CJK font rendering.

To make this an "error" element, I think it needs to work like this:

  • On the server, if there's no cache for the image yet, send down a placeholder (maybe a "loading" GIF, or just an empty <div />) with a piece of JS.
  • The JS calls the proxy over AJAX, and gets back either a "yes, we proxied the image, go fetch it here: XXX" response, or a "no, sorry, there was an error: YYY" response.
  • If the proxy says things were good, set the placeholder's source to the download URI and you're done.
  • If the proxy says things went bad, replace the placeholder with "error: sorry this didn't work" (the message the server sent, in the viewer's local language).

Additionally, when the cache exists but we've cached an error ("yes, we requested this image, and we got a 404 back") we should send down either the raw HTML for the error or an empty placeholder with JS to render the error, so it looks the same as if we did a fresh fetch.

Basically, piles of Javascript to make error-handling edge cases work better, which is everyone's definition of "fun".

(None of it should necessarily be too ridiculous, although I think previews may be a little weird?)

Once all that works, we can add more features and syntax as needs arise.

A related issue is that when you use {Fxxx} to embed an image as a thumbnail, we can fail with a default thumbnail if:

  • The image is corrupt.
  • The image is too large (size on disk).
  • The image is too large (number of pixels).
  • The image takes too long to convert into a thumbnail (for example, when processing animated GIFs with imagemagick).
  • gd is not installed.
  • The actual thumbnailing fails (e.g. disk full).
  • Probably other reasons.

The thumbnail looks weird and doesn't make the failure reason obvious, and in the case of the "too large" failures we might be better off falling back to a {Fxxxx, layout=inline} sort of thing.

This situation largely parallels the {img url} situation, where we don't want to determine whether the resource is valid or not on the server side for performance reasons but currently fail gracelessly. A similar approach is probably the best pathway forward.

I've removed the requirement that prototypes be enabled and marked this as fixed in D19195. The behavior could probably be refined, but I think it should fail in an obvious way when it fails, now.