Ref T7984. With this, an install can add an ExternalSymbolsSource to src/extensions, which will include whatever
source they have.
Details
- Reviewers
joshuaspence epriestley - Group Reviewers
Blessed Reviewers - Maniphest Tasks
- T7984: Symbols: support for External Symbols in other languages
- Commits
- Restricted Diffusion Commit
rP8ea13f3ce934: Framework for external symbol search
search for php and python builtins.
Diff Detail
- Repository
- rP Phabricator
- Branch
- externalsymbols
- Lint
Lint Passed - Unit
Tests Passed - Build Status
Buildable 6332 Build 6354: [Placeholder Plan] Wait for 30 Seconds
Event Timeline
What do you think about generalizing this a little bit?
- Instead of returning just a URI, return a list of PhabricatorRepositorySymbol objects.
- Give PhabricatorRepositorySymbol some flags to indicate it's external and have a custom URI. (You can also use makeEphemeral() to prevent objects from being save()'d.)
I like that a little better because:
- We don't need priority on sources.
- We can satisfy queries more generally (e.g., did you mean "function idx() in PHP or class idx in Python"?).
- We already have logic to show the user multiple results -- the existing "if there are no results and we're jumping, check for a PHP external" is pretty hacky.
- We can put these in typeaheads or whatever later on without them being weird.
So maybe ExternalSymbolSource becomes DiffusionExternalSymbolQuery and has methods similar to DiffusionSymbolQuery? (withLanguages(), withTypes(), etc.)? I think that's not much more complex but gives us a lot more flexibility.
src/applications/diffusion/symbol/ExternalSymbolsSource.php | ||
---|---|---|
3 ↗ | (On Diff #31430) | We should scope this as DiffusionExternalSymbolSource. |
That all sounds reasonable.
I was thinking to eventually have ExternalSymbolSource go in the "Use Sources From" field in repos, although that's probably a bit far, and only relevant if there's a large number of sources, and any install with enough of them would want to have a better indexing+search tool anyway.
src/applications/diffusion/symbol/ExternalSymbolsSource.php | ||
---|---|---|
6 ↗ | (On Diff #31430) | Can we make this explicit by using withName, withLanguage, withType and withContext instead? That would fail nicer if you typo one of the keys. |
src/applications/diffusion/symbol/PhpExternalSymbolsSource.php | ||
7 | Maybe this class should provide a getSupportedLanguages() method | |
15 | This won't work if the symbol is from a later version of PHP than the Phabricator installation is using. It would be better if we could read this from a file like php_compat_info.json. | |
22 | As above. | |
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php | ||
16 | Typo | |
17 | Does the "2" here mean Python 2? If so, do we need to allow for other versions? | |
22 | Typo. Also, maybe move this to a JSON file for consistency with PHP symbols. |
src/applications/diffusion/symbol/PhpExternalSymbolsSource.php | ||
---|---|---|
15 | I might punt on this for a day or two while we finalize the framework first. | |
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php | ||
17 | Yes. I was thinking to use priority to solve this, but with the new suggested design, I'll just make this class return both links (Or maybe one link in cases where the docs are the same). |
It also shouldn't be too hard to do the "Use Sources From" thing sooner rather than later if you want, and that might be a good approach for the Python2 vs Python3 issue. I can shoot you a skeleton for the Tokenizer/Datasource magic after this lands if you want to pursue it in the near-ish term.
src/applications/diffusion/controller/DiffusionSymbolController.php | ||
---|---|---|
70 | I wanted to explicitly have the indexed results first, but maybe the built-ins should go first, in case we have lots of the indexed ones? | |
src/applications/diffusion/symbol/DiffusionExternalSymbolsSource.php | ||
11–28 | I'll replace this with protected function buildExternalSymbol() { return id(new PhabricatorRepositorySymbol()) ->setIsExternal(true) ->makeEphemeral(); } | |
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php | ||
23 | Fixed the typo (I hope), but haven't moved the content to a file yet. |
mm... #inline-45560 on DiffusionExternalSymbolsSource.php didn't get ported to the next diff.
Some idle blather inline but I didn't catch anything really substantive. This looks great to me. Deal with however much of that makes sense and I think this is good to go.
src/applications/diffusion/controller/DiffusionSymbolController.php | ||
---|---|---|
70 | Putting indexed results first feels like it's probably the better behavior to me, too. If you have 100+ results the query and/or the way you're building or configuring your symbols probably aren't great anyway. | |
src/applications/diffusion/symbol/DiffusionExternalSymbolsSource.php | ||
4 | You could also just put these methods on DiffusionExternalSymbolQuery and have the sources subclass that, although I'm not really sure if that's cleaner/simpler. This design feels about equally good to me. | |
9 | Maybe "lookupSymbols" since we return a list of matches now. Or executeQuery()? | |
src/applications/diffusion/symbol/PhpExternalSymbolsSource.php | ||
3 | For consistency, prefer Diffusion... at the beginning of the classname. | |
7–9 | Maybe simpler with a convenience method on the Query, something like: $query->includesAnyOfTheseLanguages($list_of_languages)? Presumably all or almost all external sources will do this as their first check. | |
24 | I guess arguably the source should be pht()'d too since it's human-readable text, even though the translation is presumably always "PHP". | |
25 | pht() | |
26 | These should possibly be "php" and "py" or "python" (approximately, file extensions)? I guess it doesn't really matter now, but in theory whatever we set here should be some machine-readable constant representing the language, and we'd like to be able to go from "X.java" -> language constant -> look up symbols matching that language constant. The mapping from "X.java" to language constants is currently very crude, I think: we just take the file extension. But maybe we do slightly better than that today? Some day we should have a formal way to do this -- it's just complicated because a lot of the mapping rules we use in practice are implicit rules inside of Pygments (see T3626 for some discussion). We could put a "language constant" -> "human language name" mapping at the display level so things could read "PHP" and "Python" instead of "php" and "python" to clean the UI up. We also might need to eventually encode language variants, to deal with the python2 vs python3 case and the Zend/HHVM/Hack case and whatever else. | |
39 | pht() | |
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php | ||
15 | Naively, I'd expect these to be function, not builtin? Is this just a python-ism? Today, I think it doesn't matter. In the future, we might eventually care about having fewer distinct symbol types, maybe. |
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php | ||
---|---|---|
15 | I think I've invented builtin in the ctrl-click diff. The python lexer marks all of these as nb (Name.Builtin), and under the existing state we can't search for them. We do the Python lexing ourselves, so we can just say they are all nf and stop worrying about it. |
src/applications/diffusion/symbol/DiffusionExternalSymbolsSource.php | ||
---|---|---|
4 | I'd rather keep Query separate from the Source/Engine. | |
9 | executeQuery sounds good. | |
src/applications/diffusion/symbol/PhpExternalSymbolsSource.php | ||
15 | I've filed this as T8349 | |
26 | I agree we could do better with language detection, but for now just using the common file extension would probably be good. | |
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php | ||
15 | just support both for now :) |