Page MenuHomePhabricator

Addressing some PHP 8 incompatibilities
ClosedPublic

Authored by cspeckmim on May 8 2023, 12:12 AM.
Tags
None
Referenced Files
F14107029: D21862.diff
Wed, Nov 27, 7:11 AM
Unknown Object (File)
Wed, Nov 27, 1:05 AM
Unknown Object (File)
Tue, Nov 26, 2:12 PM
Unknown Object (File)
Tue, Nov 26, 8:26 AM
Unknown Object (File)
Tue, Nov 26, 6:54 AM
Unknown Object (File)
Sun, Nov 24, 9:28 PM
Unknown Object (File)
Sun, Nov 24, 4:39 PM
Unknown Object (File)
Sun, Nov 24, 3:48 PM

Details

Summary

Starting with a new instance running PHP 8.2, address all exceptions that come up through some basic browsing/usage.

For strlen(null) issues I generally tried to resolve if the value should be non-null at the point of issue, and attempt to address at earlier call-site. There were not many of these that I could determine. In the rest of those cases I would replace with a null-and-strlen check, or use phutil_nonempty_string if I was certain the value was a string and it was more convenient.

Hitting all code-paths is challenging, so I would search for strlen within radius of files I was modifying and evaluate to address those uses in the same manner.

Notes:

  • AphrontRequest::getStr only ever returns a string, and is safe to use phutil_nonempty_string.
  • PhabricatorEnv::getEnvConfig can return non-string things so any values coming from there should never use phutil_nonempty_string.
  • AphrontRequest::getHTTPHeader indicates it could return wild so phutil_nonempty_string should not be used.
  • AphrontRequest::getURIData isn't clear if it could return non-string data, so never use phutil_nonempty_string.

Refs T13588

Test Plan

I'm running an instance on 8.2 and went through the basic setup/installation, startup and usage, including setup issues and configurations/settings.

Diff Detail

Repository
rP Phabricator
Branch
cspeck-php8
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 25778
Build 35606: arc lint + arc unit

Event Timeline

cspeckmim held this revision as a draft.
cspeckmim added inline comments.
src/applications/auth/storage/PhabricatorAuthProviderConfig.php
82–83

The relation between AuthProvider and AuthProviderConfig and AuthAdapter confused me a bit

Were these fields migrated at some point? It doesn't look like these fields are ever set or read so maybe they can go away? I think I tried that and ran into a scary sql/database error so I'm guessing Phabricator's DAO does some auto-translating of table column to field. I made a guess that these should probably be set.

89–96

This was another bit I got very mixed up on. These methods appear to be called by PhabricatorAuthEditController::handleRequest (where I was encountering strlen(null)) but these functions didn't exist previously...

src/infrastructure/cluster/PhabricatorDatabaseRefParser.php
6

I was getting a strlen(null) on the host field and after trying to trace back through where it came from it seemed reasonable to include a default host from the same environment/config

90–100

I'm doubting myself on this one. I was running into an issue where I had a local mariadb database configured but for some reason $refs was empty.

src/view/form/control/AphrontFormTokenizerControl.php
70–72

This seemed like a bug as $this->placeholder isn't used anywhere else

cspeckmim retitled this revision from Addressing PHP 8 incompatibilities to Addressing some PHP 8 incompatibilities.May 8 2023, 1:53 AM
src/applications/auth/controller/PhabricatorAuthRegisterController.php
290–291

AphrontRequest::getStr only ever returns a string unless there's some lunacy around passing in a non-string second argument for a default value, so phutil_nonempty_string seems reasonable for any value coming from there (though I believe there are a few cases in this diff where I defaulted to just adding a null-check and could revisit)

src/applications/config/check/PhabricatorStorageSetupCheck.php
147–150

PhabricatorEnv::getEnvConfig can return non-string things so any values coming from there should maybe never use phutil_nonempty_string

src/applications/files/controller/PhabricatorFileDataController.php
71–72

AphrontRequest::getHTTPHeader indicates it could return wild so phutil_nonempty_string should probably not be used.

src/applications/transactions/editengine/PhabricatorEditEngine.php
943–944

AphrontRequest::getURIData I wasn't entirely sure would always return a string so the update for all calls to these only add null-checks.

Be consistent with usage/checks

The relation between AuthProvider and AuthProviderConfig and AuthAdapter confused me a bit

An AuthAdapter is the raw protocol bits that talk to some auth system and give Phabricator a consistent API for interacting with it. These are generic and have no Phabricator-specific code. Most of them subclass OAuth1Adapter or OAuthAdapter and just plug in all the endpoint URIs for Twitch/Facebook/Google/etc.

An AuthProvider represents a Phabricator-specific way for users to authenticate to an install. In most cases, these are pretty much 1:1 with Adapters (e.g., GitHubProvider is highly related to GitHubAdapter). The separation is partly "Phabricator-Specific Providers" vs "Generic Adapters", and partly because some auth (like "Password") is fairly special and has no meaningful "Adapter" behavior. Most of the code in most Providers is instructions for administrators on how to configure stuff.

These classes (Providers and Adapters) could reasonably be merged, and might be if they were written today. Long ago, there was more emphasis on providing as much code as possible in a generic way, but only a handful of people ever used libphutil as a standalone library and I walked away from this a while ago -- the increased installation complexity of Arcanist wasn't worth the handful of people using libphutil as a library.

A Config is just a storage object for a particular Provider an administrator has configured. You can plausibly configure multiple variants of many provider types (e.g., you could authenticate against two different LDAP servers), so there isn't just one "LDAP" config.

Were these fields migrated at some point? It doesn't look like these fields are ever set or read so maybe they can go away?

Phabricator has a lot of "storage objects", which map to some database table. You can identify these objects because they subclass LiskDAO (possibly several layers deep). DAO stands for "Data Access Object" in this context.

Lisk, the storage object subsystem, does a few bits of magic. The main ones are these:

  • All protected properties are mapped to database storage.
  • All protected properties have getters and setters generated automatically.

In this case, PhabricatorAuthProviderConfig subclasses LiskDAO (PhabricatorAuthProviderConfigPhabricatorAuthDAOPhabricatorDAOLiskDAO), so it is a storage object. providerType and providerDomain are protected properties, so they'll be loaded from storage by load() and saved to storage with save(). You can inspect the database table to find them (the table name is based on the class name, column names are property names):

mysql> SHOW CREATE TABLE auth_providerconfig;
+---------------------+-------------------------------------------------------------
| Table               | Create Table                                                
+---------------------+-------------------------------------------------------------
| auth_providerconfig | CREATE TABLE `auth_providerconfig` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `phid` varbinary(64) NOT NULL,
  `providerClass` varchar(128) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `providerType` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `providerDomain` varchar(128) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `isEnabled` tinyint(1) NOT NULL,
  `shouldAllowLogin` tinyint(1) NOT NULL,
  `shouldAllowRegistration` tinyint(1) NOT NULL,
  `shouldAllowLink` tinyint(1) NOT NULL,
  `shouldAllowUnlink` tinyint(1) NOT NULL,
  `shouldTrustEmails` tinyint(1) NOT NULL DEFAULT '0',
  `properties` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `dateCreated` int unsigned NOT NULL,
  `dateModified` int unsigned NOT NULL,
  `shouldAutoLogin` tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_phid` (`phid`),
  UNIQUE KEY `key_provider` (`providerType`,`providerDomain`),
  KEY `key_class` (`providerClass`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+---------------------+-------------------------------------------------------------

...and note that these are basically the same fields as the protected properties on the class, plus some extras (id, PHID) added by Lisk:

protected $providerClass;
protected $providerType;
protected $providerDomain;

protected $isEnabled;
protected $shouldAllowLogin         = 0;
protected $shouldAllowRegistration  = 0;
protected $shouldAllowLink          = 0;
protected $shouldAllowUnlink        = 0;
protected $shouldTrustEmails        = 0;
protected $shouldAutoLogin          = 0;

protected $properties = array();

private $provider;

The provider property is private, so it doesn't interact with storage. The id, phid, dateCreated, and dateModified columns are automatic default Lisk behavior. The column type configuration is in getConfiguration():

protected function getConfiguration() {
  return array(
    self::CONFIG_AUX_PHID => true,
    self::CONFIG_SERIALIZATION => array(
      'properties' => self::SERIALIZATION_JSON,
    ),
    self::CONFIG_COLUMN_SCHEMA => array(
      'isEnabled' => 'bool',
      'providerClass' => 'text128',
      'providerType' => 'text32',
      'providerDomain' => 'text128',
      'shouldAllowLogin' => 'bool',
      'shouldAllowRegistration' => 'bool',
      'shouldAllowLink' => 'bool',
      'shouldAllowUnlink' => 'bool',
      'shouldTrustEmails' => 'bool',
      'shouldAutoLogin' => 'bool',
    ),
    self::CONFIG_KEY_SCHEMA => array(
      'key_provider' => array(
        'columns' => array('providerType', 'providerDomain'),
        'unique' => true,
      ),
      'key_class' => array(
        'columns' => array('providerClass'),
      ),
    ),
  ) + parent::getConfiguration();
}

...but note that getConfiguration() does not (normally) "create" columns -- it just adds information (usually type information) to existing columns. The authoritative source of the columns is the list of protected properties on the class.

Since getters and setters are automatically generated and the properties are protected, you'll usually need to look for calls to a getter to identify an unused/unread property. For example, the providerDomain property is accessed via getProviderDomain() (the automatic getter generated by Lisk).

In this specific case, I think providerClass and providerType convey the same information. Using class names is the older way; using type constants is the newer way. Historically, we renamed classes often enough to make using type constants instead of class names seem worth it. I think this got set up for migration, but providerClass hasn't actually been migrated away from. I'd guess you can find some support for this if you dig through blame, but couldn't find a task offhand.

Both providerType/providerClass and providerDomain are actively used (providerDomain isn't meaningful for auth with exactly one valid remote, like GitHub, but is required for anything you can run your own server for, like LDAP, JIRA, etc).


This pretty much all looks fine to me. Per one inline, it would be "nice" to use phutil_nonempty_string() in more cases, but this increases risk a little bit (if some caller is doing some nonsense with setName(567) or whatever, phutil_nonempty_string() will break but just adding a !== null test won't) and the hypothetical upside of making highly-theoretical rework some day marginally easier feels likely-imaginary, so I feel like the case for going back in here and reworking these to use the phutil_nonempty_*() calls is pretty weak.

To make this a bit more manageable, can we do something like this?

  • Pull all the profile menu stuff out into a separate diff, since that seems like the largest single piece. I'll look that over a little more carefully but I think we can just land it as-is.
  • Get rid of the DatabaseRef change from this diff since I think that's either incorrect (and potentially high-impact) or, at least, outside the scope of PHP8 type correctness. If you can recreate the context that lead to it there's probably some less-invasive fix available. I'm fine with "default host", the ref injection just shouldn't be there unless there are bigger problems somewhere else.
  • Leave everything else here and I'll take one more look through it when I'm less tired, but I don't think I caught anything suspicious/fragile/etc.
src/infrastructure/cluster/PhabricatorDatabaseRefParser.php
90–100

Yeah, we shouldn't be injecting the default ref here -- it is built via getLiveIndividualRef() in getAllMasterDatabaseRefs().

This function is just for converting cluster.databases into a list of refs, and sanity checking that the list is plausible.

All the "default host" stuff seems fine in theory, but I'd like a clearer understanding of what the underlying issue is.

src/view/form/control/AphrontFormTokenizerControl.php
70–72

Agreed this was a bug, we probably just never use a placeholder other than the default placeholder in a Tokenizer anywhere.

src/view/layout/AphrontSideNavFilterView.php
100

In this case, the "right" test is probably just a test for null. Any code passing the empty string to mean "no icon" is guilty of sloppy typehandling and not using the API correctly.

However, there are a zillion callsites for this and it's impossible to test thoroughly, and the value of being strict and correct seems low.

I'd encourage using phutil_nonempty_string($icon) in these cases:

  • it fixes the PHP 8 fatal;
  • it very slightly increases type safety in an almost-certainly-unproblematic way (locks $icon down to only null or string -- this could catch some sloppy callers passing integers or whatever, but it's likely anything it catches was an Actual Bug);
  • it leaves a bit of a hint later that we took the easy path here.

But this change is also fine as-is.

src/view/widget/AphrontStackTraceView.php
33

Should probably be strict test (!== null).

This revision now requires changes to proceed.May 8 2023, 5:07 AM
src/view/layout/AphrontSideNavFilterView.php
100

I originally had a lot of type-marking changes throughout several of the Aphront classes but then ended up running into the issue where stringified values were being passed around. I wasn't sure if $icon was allowed to be something like PHUIIconView or something similar.

I think I had run across a setDescriptionText() function which I made the assumption that "Text" meant "string" but it was also being passed UI objects in some cases.

Made suggested changes except I haven't pulled out the profile menu stuff yet.

I also tried setting up a notification server and added a few more fixes.

Thank you for the thorough explanation. Yesterday I tried searching for how to dynamically add functions to classes but apparently my searching today is better~

I'll look at splitting out all the profile menu stuff into a separate diff.

src/infrastructure/cluster/PhabricatorDatabaseRef.php
232

This was the strlen(null) that drove me down the database ref/parser to find out why a null port was being passed around. I'm not sure how I got turned around into providing a "default" database entry in the cluster mix.

Remove changes to profile menu files

I jumped the gun and updated my local arcanist to using php 8.2 which introduced the UT errors

.gitignore
39–40 ↗(On Diff #52115)

Running npm install ws produces these two files. The package.json file we should maybe include in the repository (install command might then be npm build or something?). Ignoring both seemed fine for now but I can revert this from this diff.

My session expired and ran into a few more issues

Fixed another issue with user log entries

I didn't catch anything that looks suspicious or hazardous. Thanks!

.gitignore
39–40 ↗(On Diff #52115)

Seems fine/harmless.

src/applications/config/check/PhabricatorStorageSetupCheck.php
147–150

We could also just make the config stricter (e.g., raise a setup error if you set amazon-s3.region to 3.1415).

I think this has happened naturally over time to some degree -- a theoretical future change is to turn file storage services into a list (like databases, mailers, etc) and the list-configs generally get strict type checks in initial implementation (via PhutilTypeSpec::checkMap(...)).

This revision is now accepted and ready to land.May 24 2023, 4:35 PM
cspeckmim edited the test plan for this revision. (Show Details)

Updating to verify before landing that I haven't included further changes

src/applications/config/check/PhabricatorStorageSetupCheck.php
147–150

I'll make note to review this!

jmeador added inline comments.
src/applications/notification/client/PhabricatorNotificationServerRef.php
139–184 ↗(On Diff #52129)

After upgrading, changes in this file cause my client URI to now become /ws instead of the intended /ws/. The trailing slash is missing. Note that my version of PHP is 7.2.24 (RHEL SCL).

src/applications/notification/client/PhabricatorNotificationServerRef.php
147–159 ↗(On Diff #52129)

This edit fixes the problem for me.

Your suggested patch looks correct to me, insofar as I recall what a computer looks like. I'll see about getting it upstream.

I'll see about getting it upstream.

I upstreamed this (and a couple other things I caught) in D21875. Thanks!