Changeset View
Standalone View
src/applications/files/controller/PhabricatorFileImageProxyController.php
- This file was added.
| <?php | |||||
| final class PhabricatorFileImageProxyController | |||||
| extends PhabricatorFileController { | |||||
| public function shouldAllowPublic() { | |||||
| return true; | |||||
| } | |||||
| public function handleRequest(AphrontRequest $request) { | |||||
| $show_prototypes = PhabricatorEnv::getEnvConfig( | |||||
| 'phabricator.show-prototypes'); | |||||
| if (!$show_prototypes) { | |||||
| throw new Exception( | |||||
| pht('Show prototypes is disabled. | |||||
| Set `phabricator.show-prototypes` to `true` to use the image proxy')); | |||||
| } | |||||
| $viewer = $request->getViewer(); | |||||
| $img_uri = $request->getStr('uri'); | |||||
epriestley: Unless you need to, prefer `$request->getStr('...')` (or other typed accessors) over reading… | |||||
| // Validate the URI before doing anything | |||||
| PhabricatorEnv::requireValidRemoteURIForLink($img_uri); | |||||
| $uri = new PhutilURI($img_uri); | |||||
| $proto = $uri->getProtocol(); | |||||
| if (!in_array($proto, array('http', 'https'))) { | |||||
Done Inline ActionsThis should query by uriHash = %s ... PhabricatorHash::digestForIndex($img_uri), since that column is the one with a key on it. Once the table gets big, query using a keyed column will execute much more quickly than this query. (You can also omit LIMIT 1 since there's a unique key, and if we get more than one result back somehow it's probably good for loadOneWhere() to throw an exception instead of picking the first result.) epriestley: This should query by `uriHash = %s` ... `PhabricatorHash::digestForIndex($img_uri)`, since that… | |||||
Done Inline Actionsah yep. that was an oversight on my part. jcox: ah yep. that was an oversight on my part. | |||||
| throw new Exception( | |||||
| pht('The provided image URI must be either http or https')); | |||||
| } | |||||
| // Check if we already have the specified image URI downloaded | |||||
| $cached_request = id(new PhabricatorFileExternalRequest())->loadOneWhere( | |||||
| 'uriIndex = %s', | |||||
| PhabricatorHash::digestForIndex($img_uri)); | |||||
Done Inline ActionsYou can executeOne() above (instead of execute()) to avoid this (although maybe you're trying to dodge policy exceptions)? epriestley: You can `executeOne()` above (instead of `execute()`) to avoid this (although maybe you're… | |||||
Done Inline ActionsI copied this pattern from elsewhere so there wasn't much reason behind it, other than I didn't know about executeOne. jcox: I copied this pattern from elsewhere so there wasn't much reason behind it, other than I didn't… | |||||
| if ($cached_request) { | |||||
| return $this->getExternalResponse($cached_request); | |||||
| } | |||||
| $ttl = PhabricatorTime::getNow() + phutil_units('7 days in seconds'); | |||||
Done Inline ActionsFor now, you could just throw an exception here (similar to what we'll do later on). Actually, can you restructure things a bit so that the code is shared in both cases? Maybe something like: $response = $this->getExternalResponse($cached_external_request);
if ($response) {
return $response;
}
// <...do all the cache stuff...>
return $this->getExternalResponse($freshly_generated_external_request);Maybe it can't look quite like that, but I think the logic for "turn an ExternalRequest object into a response" is the same in both cases, and it should be possible to consolidate them. epriestley: For now, you could just throw an exception here (similar to what we'll do later on).
Actually… | |||||
| $external_request = id(new PhabricatorFileExternalRequest()) | |||||
| ->setURI($img_uri) | |||||
| ->setTTL($ttl); | |||||
| $unguarded = AphrontWriteGuard::beginScopedUnguardedWrites(); | |||||
| // Cache missed so we'll need to validate and download the image | |||||
| try { | |||||
| // Rate limit outbound fetches to make this mechanism less useful for | |||||
| // scanning networks and ports. | |||||
| PhabricatorSystemActionEngine::willTakeAction( | |||||
| array($viewer->getPHID()), | |||||
Done Inline ActionsWe should perform these validations before checking the cache so that if there's, say, a bug in requireValidRemoteURIForLink() then we can just push a fix and it will apply retroactively. By doing these checks later, existing security issues/buggy data already in the cache might remain active after a bugfix. epriestley: We should perform these validations //before// checking the cache so that if there's, say, a… | |||||
| new PhabricatorFilesOutboundRequestAction(), | |||||
| 1); | |||||
| $file = PhabricatorFile::newFromFileDownload( | |||||
| $uri, | |||||
| array( | |||||
| 'viewPolicy' => PhabricatorPolicies::POLICY_NOONE, | |||||
| 'canCDN' => true, | |||||
| )); | |||||
| if (!$file->isViewableImage()) { | |||||
| $mime_type = $file->getMimeType(); | |||||
| $engine = new PhabricatorDestructionEngine(); | |||||
| $engine->destroyObject($file); | |||||
| $file = null; | |||||
| throw new Exception( | |||||
| pht( | |||||
| 'The URI "%s" does not correspond to a valid image file, got '. | |||||
| 'a file with MIME type "%s". You must specify the URI of a '. | |||||
| 'valid image file.', | |||||
| $uri, | |||||
| $mime_type)); | |||||
| } else { | |||||
| $file->save(); | |||||
| } | |||||
Not Done Inline ActionsOh, one other thing is that we should probably not set the authorPHID here -- see some discussion below. epriestley: Oh, one other thing is that we should probably //not// set the authorPHID here -- see some… | |||||
| $external_request->setIsSuccessful(true) | |||||
| ->setFilePHID($file->getPHID()) | |||||
| ->save(); | |||||
| unset($unguarded); | |||||
| return $this->getExternalResponse($external_request); | |||||
| } catch (HTTPFutureHTTPResponseStatus $status) { | |||||
| $external_request->setIsSuccessful(false) | |||||
| ->setResponseMessage($status->getMessage()) | |||||
| ->save(); | |||||
| return $this->getExternalResponse($external_request); | |||||
| } catch (Exception $ex) { | |||||
Done Inline ActionsThis seems fine to me. epriestley: This seems fine to me. | |||||
| // Not actually saving the request in this case | |||||
| $external_request->setResponseMessage($ex->getMessage()); | |||||
| return $this->getExternalResponse($external_request); | |||||
| } | |||||
| } | |||||
Not Done Inline ActionsI think it's probably possible for this to race:
I suspect both pages will fairly regularly get through the initial check (cache is empty) and run the request, then whichever one is a little slower will fail here with an AphrontDuplicateKeyQueryException when it tries to save a second copy of the same record. Some ways we could deal with this:
I think (1) is fine for now since multiple copies of an image on a page will be rare. When this kind of request-based race happens we usually do something in the vein of (2), but I think there are at least some arguments for doing (3) instead here: if we don't lock, the duplicate requests will write duplicate files; and they'll incur duplicate points on the user's rate limit; and they'll make duplicate requests to the remote server. These costs are small but higher than the costs we usually face when choosing to catch the exception and load the colliding record. The lock would look something like this: try_the_cache(); $lock = PhabricatorGlobalLock::newLock($lock_name_based_on_URI_hash); try_the_cache_again(); make_the_request_and_write_the_data_and_cache(); $lock->unlock(); Handling this is probably worth splitting out into a followup change, since I'd guess it will be somewhat hard to hit this in the first place. epriestley: I think it's probably possible for this to race:
- Clear the cache.
- Load a page with two… | |||||
| private function getExternalResponse( | |||||
| PhabricatorFileExternalRequest $request) { | |||||
| if ($request->getIsSuccessful()) { | |||||
| $file = id(new PhabricatorFileQuery()) | |||||
| ->setViewer(PhabricatorUser::getOmnipotentUser()) | |||||
| ->withPHIDs(array($request->getFilePHID())) | |||||
| ->executeOne(); | |||||
| if (!file) { | |||||
| throw new Exception(pht( | |||||
| 'The underlying file does not exist, but the cached request was '. | |||||
| 'successful. This likely means the file record was manually deleted '. | |||||
Done Inline ActionsTo slightly reduce the amount of code here, you could create this object earlier, and then share it in both the success and failure branches. epriestley: To slightly reduce the amount of code here, you could create this object earlier, and then… | |||||
Not Done Inline ActionsIt's possible (albeit unlikely) that $file will come back null here. An administrator would pretty much have needed to go delete the file out from under our noses, but it would be a little cleaner to throw an explicit exception instead of fataling on $file->getViewURI() below. epriestley: It's possible (albeit unlikely) that `$file` will come back `null` here. An administrator would… | |||||
Not Done Inline ActionsAh good call. If the file is null should we just go ahead and delete the cached request? Or should we leave it to the user to fix it, since there might be other weirdness that they were messing with? jcox: Ah good call. If the file is `null` should we just go ahead and delete the cached request? Or… | |||||
| 'by an administrator.')); | |||||
| } | |||||
| return id(new AphrontRedirectResponse()) | |||||
| ->setIsExternal(true) | |||||
| ->setURI($file->getViewURI()); | |||||
| } else { | |||||
| throw new Exception(pht( | |||||
| "The request to get the external file from '%s' was unsuccessful:\n %s", | |||||
| $request->getURI(), | |||||
| $request->getResponseMessage())); | |||||
| } | |||||
| } | |||||
| } | |||||
Unless you need to, prefer $request->getStr('...') (or other typed accessors) over reading the entire data directly.
In PHP, users can request ?uri[x]=y, which PHP parses as an array. Then you end up with an array-valued $img_uri and probably a fatal a few lines below (but, in extreme cases, you can get a security vulnerability instead).
To test this, you should be able to access /file/imageproxy/?uri[]=anything on this code. I expect you'll get a fatal or some bizarre behavior.