Changeset View
Standalone View
src/applications/metamta/storage/PhabricatorMetaMTAMailProperties.php
- This file was added.
| <?php | |||||
| final class PhabricatorMetaMTAMailProperties | |||||
| extends PhabricatorMetaMTADAO | |||||
| implements PhabricatorPolicyInterface { | |||||
| protected $objectPHID; | |||||
| protected $mailProperties = array(); | |||||
| protected function getConfiguration() { | |||||
| return array( | |||||
| self::CONFIG_SERIALIZATION => array( | |||||
| 'mailProperties' => self::SERIALIZATION_JSON, | |||||
| ), | |||||
amckinley: I'm not actually worried about this, but have you tested the performance impact of fetching a… | |||||
Not Done Inline ActionsThe big advantage of blobs is that we can add more stuff without migrations. Adding new properties to some tables (like HarbormasterBuildLog, recently) potentially means hours and hours of downtime for large installs. The big advantage of real columns is that we can query on them (and add keys, and join them, etc). Decoding JSON is normally very fast, and I believe there's no meaningful inherent performance difference between columns and JSON blobs in most cases. I haven't specifically benchmarked it recently, but I can't recall these JSON decodes ever showing up in a profile while doing performance work. JSON is slightly less space-efficient (e.g., column names are repeated) but this usually isn't an issue (we've never hit a case where we optimized a table's size by moving away from JSON, at least that I can recall offhand). Increasingly, a lot of objects tend to have their major properties as columns (when we need keys, constraints, etc) and then a $properties-type field for whatever other random stuff turns up. In this case, it's (almost certainly?) clear that nothing will ever want to query/join/constrain mailKey since it's a random secret, so the advantages of column storage are moot. The other two potential things we know about that we may want to store here in the future ("Must Encrypt" state, "Original Subject") also (almost certainly?) don't need to be queried. This table may also become large-ish (the rows are small, but it may have 1 row per object that exists in any other application) so migrations might be expensive in the future. Thus, a bag-of-JSON-properties felt like a better fit than a dedicated column. epriestley: The big advantage of blobs is that we can add more stuff without migrations. Adding new… | |||||
Not Done Inline Actions
Oh, this is another advantage of the JSON blobs: with reasonable frequency, we use them to store maps or lists. So we could use an <object, name, value> table, but then the value would need to be JSON ([1, 2, 3]) anyway, or we'd need another <propertyID, uhIndex?, value> or something, and then another one under that if the lists are nested. Basically a big mess that is either super hard/complicated or just ends up storing a lot of JSON anyway. epriestley: > And just out of curiosity, why do we prefer blobs of JSON vs tables with (objectID… | |||||
| self::CONFIG_COLUMN_SCHEMA => array(), | |||||
| self::CONFIG_KEY_SCHEMA => array( | |||||
| 'key_object' => array( | |||||
| 'columns' => array('objectPHID'), | |||||
| 'unique' => true, | |||||
| ), | |||||
| ), | |||||
| ) + parent::getConfiguration(); | |||||
| } | |||||
| public function getMailProperty($key, $default = null) { | |||||
| return idx($this->mailProperties, $key, $default); | |||||
| } | |||||
| public function setMailProperty($key, $value) { | |||||
| $this->mailProperties[$key] = $value; | |||||
| return $this; | |||||
| } | |||||
| public static function loadMailKey($object) { | |||||
| // If this is an older object with an onboard "mailKey" property, just | |||||
| // use it. | |||||
| // TODO: We should eventually get rid of these and get rid of this | |||||
| // piece of code. | |||||
| if ($object->hasProperty('mailKey')) { | |||||
| return $object->getMailKey(); | |||||
| } | |||||
Not Done Inline ActionsShouldn't we be checking this the other way around? First see if there's a PhabricatorMetaMTAMailProperties object, and if not, fetch $object->getMailKey() and save the result into the new table? amckinley: Shouldn't we be checking this the other way around? First see if there's a… | |||||
Not Done Inline ActionsI don't want to migrate things passively one at a time because I think it just makes things more complicated (what's in the table is less clear/knowable), and the ultimate migration needs to hit everything anyway since we can't guarantee that passive migration will ever complete. I don't think it matters too much in this case, but I think doing a trickle-migration tends to make it harder to make changes here (e.g., if nothing else, the table is larger) without any concrete benefits (at least, I can't think of any). epriestley: I don't want to migrate things passively one at a time because I think it just makes things… | |||||
| $viewer = PhabricatorUser::getOmnipotentUser(); | |||||
| $object_phid = $object->getPHID(); | |||||
| $properties = id(new PhabricatorMetaMTAMailPropertiesQuery()) | |||||
| ->setViewer($viewer) | |||||
| ->withObjectPHIDs(array($object_phid)) | |||||
| ->executeOne(); | |||||
| if (!$properties) { | |||||
| $properties = id(new self()) | |||||
| ->setObjectPHID($object_phid); | |||||
| } | |||||
| $mail_key = $properties->getMailProperty('mailKey'); | |||||
| if ($mail_key !== null) { | |||||
| return $mail_key; | |||||
| } | |||||
| $mail_key = Filesystem::readRandomCharacters(20); | |||||
| $properties->setMailProperty('mailKey', $mail_key); | |||||
| $unguarded = AphrontWriteGuard::beginScopedUnguardedWrites(); | |||||
| $properties->save(); | |||||
| unset($unguarded); | |||||
| return $mail_key; | |||||
| } | |||||
| /* -( PhabricatorPolicyInterface )----------------------------------------- */ | |||||
| public function getCapabilities() { | |||||
| return array( | |||||
| PhabricatorPolicyCapability::CAN_VIEW, | |||||
| ); | |||||
| } | |||||
| public function getPolicy($capability) { | |||||
| switch ($capability) { | |||||
| case PhabricatorPolicyCapability::CAN_VIEW: | |||||
| return PhabricatorPolicies::POLICY_NOONE; | |||||
| } | |||||
Not Done Inline Actions(We load this with the Omnipotent viewer, which skips the actual check. This object only really implements PhabricatorPolicyInterface so that the Query can be a CursorPagedPolicyAwareQuery.) epriestley: (We load this with the Omnipotent viewer, which skips the actual check. This object only really… | |||||
| } | |||||
| public function hasAutomaticCapability($capability, PhabricatorUser $viewer) { | |||||
| return false; | |||||
| } | |||||
| } | |||||
I'm not actually worried about this, but have you tested the performance impact of fetching a ton of these JSON blobs and deserializing them, compared to just reading the mailKey out of the already-fetched row? I'm trying to imagine what kind of operations could result in fetching a linear number of these blobs, but nothing leaps to mind. Maybe Herald rule evaluations?
And just out of curiosity, why do we prefer blobs of JSON vs tables with (objectID, propertyName, propertyValue) columns?