Changeset View
Standalone View
src/parser/PhutilEditorConfig.php
- This file was added.
<?php | |||||
/** | |||||
* Utilities for parsing [[http://editorconfig.org/ | EditorConfig]] files. | |||||
*/ | |||||
final class PhutilEditorConfig { | |||||
private $configs; | |||||
private $rootPath; | |||||
/** | |||||
* Constructor. | |||||
* | |||||
* @param string The root directory. | |||||
*/ | |||||
public function __construct($root) { | |||||
$this->rootPath = $root; | |||||
$paths = id(new FileFinder($root)) | |||||
->withType('f') | |||||
->withName('.editorconfig') | |||||
->find(); | |||||
epriestley: One vague concern with this is that issuing `find` against a large working copy can be slow. | |||||
joshuaspenceAuthorUnsubmitted Not Done Inline ActionsYeah, I don't like the idea of only looking for $root/.editorconfig. joshuaspence: Yeah, I don't like the idea of only looking for `$root/.editorconfig`. | |||||
foreach ($paths as $path) { | |||||
$this->addEditorConfigFile(Filesystem::resolvePath($path, $root)); | |||||
} | |||||
} | |||||
protected function addEditorConfigFile($path) { | |||||
$file = Filesystem::readFile($path); | |||||
$config = phutil_ini_decode($file); | |||||
unset($config['root']); | |||||
$this->validateEditorConfig($config); | |||||
$path = dirname($path); | |||||
$this->configs[$path] = array(); | |||||
foreach ($config as $key => $value) { | |||||
if (preg_match('@/@', $key)) { | |||||
$this->configs[$path][phutil_glob_to_regex('**/'.$key)] = $value; | |||||
} else if (preg_match('@^/@', $key)) { | |||||
$this->configs[$path][phutil_glob_to_regex(substr($key, 1))] = $value; | |||||
} else { | |||||
$this->configs[$path][phutil_glob_to_regex($key)] = $value; | |||||
} | |||||
joshuaspenceAuthorUnsubmitted Not Done Inline ActionsI did this a while ago so I'm not 100% sure where it comes from but I am pretty sure this logic was copied from editorconfig-core-js. joshuaspence: I did this a while ago so I'm not 100% sure where it comes from but I am pretty sure this logic… | |||||
} | |||||
} | |||||
public static function validateEditorConfig(array $config) { | |||||
$spec = PhutilTypeSpec::newFromString('map<string, map<string, string>>'); | |||||
$spec->check($config); | |||||
joshuaspenceAuthorUnsubmitted Not Done Inline ActionsI'd also like to check here that the keys are valid, although this could get messy. joshuaspence: I'd also like to check here that the keys are valid, although this could get messy. | |||||
epriestleyUnsubmitted Not Done Inline ActionsYeah, definitely feels messy. epriestley: Yeah, definitely feels messy. | |||||
joshuaspenceAuthorUnsubmitted Not Done Inline ActionsI think that (possibly) a better approach would be to extract values that we understand from the .editorconfig and discard anything else. This means that we can safely support a subset of EditorConfig. joshuaspence: I think that (possibly) a better approach would be to extract values that we understand from… | |||||
} | |||||
/** | |||||
* Return the paths of all EditorConfig files that were found. | |||||
* | |||||
* @return list<string> | |||||
*/ | |||||
public function getPaths() { | |||||
$paths = array(); | |||||
foreach (array_keys($this->configs) as $path) { | |||||
$paths[] = $path.'/.editorconfig'; | |||||
} | |||||
return $paths; | |||||
} | |||||
/** | |||||
* Return the root path. | |||||
* | |||||
* @return string | |||||
*/ | |||||
public function getRootPath() { | |||||
return $this->rootPath; | |||||
} | |||||
public function getConfig($path, $key) { | |||||
foreach ($this->configs as $config_path => $config) { | |||||
if (Filesystem::isDescendant($path, $config_path)) { | |||||
epriestleyUnsubmitted Not Done Inline ActionsThis is O(N^2) if you run, e.g., arc lint --everything, I think, since we have to iterate over each path in the repository for each path we're examining. It's probably worth the complexity cost to build a tree instead, by splitting the paths apart into components: src applications differential => array(...) diffusion => array(...) Then split the path being looked up apart, and walk down the tree. epriestley: This is `O(N^2)` if you run, e.g., `arc lint --everything`, I think, since we have to iterate… | |||||
joshuaspenceAuthorUnsubmitted Not Done Inline ActionsYeah, this seems better. joshuaspence: Yeah, this seems better. | |||||
foreach ($config as $config_key => $config_value) { | |||||
} | |||||
} | |||||
} | |||||
} | |||||
} | |||||
Not Done Inline ActionsI'd expect this not to work -- I think switch in PHP just does ==, so the case 0: is unreachable. $ cat test.php <?php foreach (array(0, false) as $test) { switch ($test) { case 0: echo "Zero.\n"; break; case false: echo "False.\n"; break; } } $ php -f test.php Zero. Zero. epriestley: I'd expect this not to work -- I think `switch` in PHP just does `==`, so the `case 0:` is… |
One vague concern with this is that issuing find against a large working copy can be slow. Particularly, Facebook has very large working copies (many GB) with huge numbers of files, and has spent a great deal of engineering effort attempting to mitigate the costs of this.
One cheap way to avoid paying this cost (at the cost of accuracy) would be to check for $root/.editorconfig before searching, under the theory that any project which uses these files will have a default at the root. I'm not sure if this is a reasonable assumption or not.
We could also fill the cache lazily (maybe falling back to a full fill at, say, 2048 entries, under the assumption that if we're doing anything with more than 2048 paths we're probably doing something with the whole repository) but this is significantly complex.
We could also let installs enumerate the locations of their .editorconfig files elsewhere in configuration, I guess.
(I don't think any of this is worth pursuing for now.)
Once this works, though, it would be nice to capture tab width information when sending source to Phabricator, which implies we might start running it all the time.