Allow file data to be inaccessible, not undiscoverable
Closed, Resolved, Public

Description

File data URIs are currently undiscoverable (contain a large, random, unguessable secret) but not inaccessible: if you know the URL, you can download the data.

This approach makes the data CDN'able, and is similar to the system used by Facebook and Google for user content.

Separately, it's unsafe to serve files from the primary domain, and installs with exposure should configure security.alternate-file-domain. We should also urge installs to do this more strongly (see T2380, T2382). One specific attack that serving file content from the same domain makes possible: iPads which do not respect Content-Disposition will just execute JavaScript in HTML file content, even when the server tells them to download it. There are similar attacks with Flash and Java which are more complicated but can affect more user agents.

I'm not aware of any practical attack against this system. It is computationally infeasible to enumerate the secrets.
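As a rough back-of-the-envelope illustration of why enumeration is infeasible, the sketch below estimates how long blind guessing would take. The key length (20 characters), alphabet size (36), request rate, and number of live files are all assumptions chosen for illustration; none of them are values taken from this task or from Phabricator.

```python
import math


def entropy_bits(alphabet_size: int, key_length: int) -> float:
    """Bits of entropy in a uniformly random key."""
    return key_length * math.log2(alphabet_size)


def expected_years_to_guess(alphabet_size: int, key_length: int,
                            guesses_per_second: float,
                            live_secrets: int = 1) -> float:
    """Expected years of guessing before hitting any one valid key.

    Guessing without repetition in a keyspace of N keys, of which k are
    valid, takes (N + 1) / (k + 1) guesses on average.
    """
    keyspace = alphabet_size ** key_length
    expected_guesses = (keyspace + 1) / (live_secrets + 1)
    seconds = expected_guesses / guesses_per_second
    return seconds / (365.25 * 24 * 3600)


# Assumed numbers: 20-character base-36 keys, one million guesses per
# second, and one million files whose secrets would all count as a hit.
bits = entropy_bits(36, 20)
years = expected_years_to_guess(36, 20, 1e6, live_secrets=10**6)
print(f"{bits:.1f} bits of entropy, about {years:.1e} years to first hit")
```

Even under these attacker-friendly assumptions the expected time runs to hundreds of billions of years, so the realistic risk is a URL leaking, not being guessed.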

That said, some installs find it uncomfortable that knowledge of a URL is sufficient to retrieve file data, and it is possible for these URLs to leak through side channels (log files, screenshots, accidental indexing, etc.) more easily than the file data itself can. We could easily pursue partial solutions (like generating URLs that are valid only for a short duration), but these won't address the root issue: the scheme "feels" insecure because it lacks a formal authentication step.
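For illustration, the "URLs that are valid for a short duration" idea could look roughly like the following HMAC-signed URI. This is a minimal sketch, not how Phabricator builds file URIs: the signing key, the `expires`/`sig` parameter names, and the example path are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret; a real install would load this from config.
SIGNING_KEY = b"replace-with-a-server-side-secret"


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry, hex encoded."""
    message = f"{path}|{expires}".encode()
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


def sign_file_uri(path: str, ttl_seconds: int = 300,
                  now: Optional[float] = None) -> str:
    """Return a URI for `path` that stops verifying after `ttl_seconds`."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    return f"{path}?expires={expires}&sig={_signature(path, expires)}"


def verify_file_uri(path: str, expires: int, signature: str,
                    now: Optional[float] = None) -> bool:
    """Check that the URI has not expired and was signed by this server."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

A leaked URL of this kind still works for anyone until it expires, which is why this only narrows the window and does not add an authentication step.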

Adding a formal authentication step is complicated. In particular:

  • Normal session cookies cannot be present on the domain the file is served from, because this permits the class of user-content attacks above.
  • Doing some sort of session handshake will break and/or ruin performance for files like profile images.

Event Timeline

epriestley raised the priority of this task to Normal.
epriestley updated the task description.
epriestley added a project: Files.
epriestley added subscribers: epriestley, chasemp.

Here's a possible approach:

  • We let files be marked as "can CDN".
  • We set that flag on all existing files, and on profile pictures and any other kind of public artifact (maybe thumbnails).
  • These files work exactly the same way that they currently do.
  • This solves the issues with profile images and other files we really want to CDN, which users reasonably expect to be highly public.

For files not marked as "can CDN", we require a one-time token to view the file data.

  • If the file is accessed with a missing or invalid token, we redirect to Phabricator to get the token.
  • Phabricator generates a token with the existing one-time infrastructure and redirects back to the file. This could loop, which is bad; maybe we can put a counter in the URL or something to stop it.
  • The token is valid for a very short period of time (like 5 minutes).
  • The token is consumed by the data request.
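The flow above can be sketched roughly as follows. This is a toy in-memory model, assuming a hypothetical `OneTimeTokenStore`, a hypothetical `/file/token/...` redirect path, and a hypothetical redirect cap; it does not use or describe Phabricator's actual token infrastructure.

```python
import secrets
import time
from typing import Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 5 * 60  # "a very short period of time (like 5 minutes)"
MAX_REDIRECTS = 2           # hypothetical cap to break redirect loops


class OneTimeTokenStore:
    """In-memory stand-in for a one-time token table."""

    def __init__(self) -> None:
        self._expiry: Dict[Tuple[str, str], float] = {}

    def issue(self, file_id: str, now: Optional[float] = None) -> str:
        """Create a token that is valid for one request on one file."""
        current = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._expiry[(file_id, token)] = current + TOKEN_TTL_SECONDS
        return token

    def consume(self, file_id: str, token: str,
                now: Optional[float] = None) -> bool:
        """Remove the token; report whether it existed and was unexpired."""
        current = time.time() if now is None else now
        expiry = self._expiry.pop((file_id, token), None)
        return expiry is not None and current <= expiry


def handle_file_request(store: OneTimeTokenStore, file_id: str, can_cdn: bool,
                        token: Optional[str] = None, redirects: int = 0,
                        now: Optional[float] = None) -> Tuple[str, str]:
    """Decide whether to serve the data, redirect for a token, or give up."""
    if can_cdn:
        # "can CDN" files behave exactly as they do today.
        return ("serve", file_id)
    if token is not None and store.consume(file_id, token, now=now):
        return ("serve", file_id)
    if redirects >= MAX_REDIRECTS:
        # The counter carried in the URL stops an endless redirect loop.
        return ("error", "too many redirects")
    # Hypothetical path on the main domain that issues a token and
    # redirects back to the file with the counter incremented.
    return ("redirect", f"/file/token/{file_id}/?redirects={redirects + 1}")
```

Because `consume` removes the token whether or not it has expired, replaying a URL copied from a log or screenshot falls through to the redirect, where the main domain can check the viewer's session.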

There are two cases which are still a bit ugly:

Thumbnails of private images. We can either mark them "can CDN" (this seems OK, but might allow an attacker to view the image if we eventually introduce some 1000x1000 thumbnail size), embed links with no tokens (but this will do a bunch of redirects for each thumb), or embed links with tokens (but this is a huge mess).

I lean toward letting small thumbs CDN and dealing with large ones later if the issue ever arises. This seems like a reasonable tradeoff.
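A size-based rule for this tradeoff might look like the sketch below. The 280-pixel cutoff and the function name are assumptions made up for illustration; the task does not specify where "small" ends.

```python
# Hypothetical cutoff: thumbnails whose longest edge is at most this many
# pixels are treated as too small to meaningfully leak a private image.
MAX_CDN_THUMB_EDGE = 280


def thumbnail_can_cdn(parent_can_cdn: bool, width: int, height: int) -> bool:
    """Decide whether a generated thumbnail may be served without a token."""
    if parent_can_cdn:
        # Thumbnails of already-public files are public at any size.
        return True
    return max(width, height) <= MAX_CDN_THUMB_EDGE
```

Under this rule a hypothetical 1000x1000 "thumbnail" of a private image would fall back to the token flow instead of being served from the CDN.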

Full Images in Pholio and Lightboxes. Same issue as above. We can probably pre-generate the tokens, then eat the redirects if they expire.

@epriestley: this sounds like a pretty solid solution to me.

@epriestley: I might be able to implement this or at least take a stab at it.

Sure, go for it. I won't get here until T4896 -> T4589 -> here, but those don't really block this.

I think "can CDN" can just be a metadata property. The token stuff can use PhabricatorAuthTemporaryToken.

woot! Thanks @epriestley, you've been incredibly helpful as always.