Page MenuHomePhabricator

Framework for external symbol search
ClosedPublic

Authored by avivey on May 27 2015, 4:54 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Nov 17, 1:38 PM
Unknown Object (File)
Tue, Nov 15, 8:46 PM
Unknown Object (File)
Mon, Nov 14, 6:14 AM
Unknown Object (File)
Mon, Nov 14, 2:20 AM
Unknown Object (File)
Sun, Nov 13, 5:46 PM
Unknown Object (File)
Sun, Nov 13, 2:10 PM
Unknown Object (File)
Thu, Nov 10, 2:06 PM
Unknown Object (File)
Wed, Nov 9, 1:45 PM
Subscribers
Tokens
"Orange Medal" token, awarded by epriestley.

Details

Summary

Ref T7984. With this, an install can add an ExternalSymbolsSource to src/extensions, which will include whatever
source they have.

Test Plan

search for php and python builtins.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

avivey retitled this revision from to Framework for external symbol search.
avivey edited the test plan for this revision. (Show Details)
avivey added reviewers: epriestley, joshuaspence.
avivey updated this object.
avivey edited edge metadata.

fixed lint

epriestley edited edge metadata.

What do you think about generalizing this a little bit?

  • Instead of returning just a URI, return a list of PhabricatorRepositorySymbol objects.
  • Give PhabricatorRepositorySymbol some flags to indicate it's external and have a custom URI. (You can also use makeEphemeral() to prevent objects from being save()'d.)

I like that a little better because:

  • We don't need priority on sources.
  • We can satisfy queries more generally (e.g., did you mean "function idx() in PHP or class idx in Python"?).
  • We already have logic to show the user multiple results -- the existing "if there are no results and we're jumping, check for a PHP external" is pretty hacky.
  • We can put these in typeaheads or whatever later on without them being weird.

So maybe ExternalSymbolSource becomes DiffusionExternalSymbolQuery and has methods similar to DiffusionSymbolQuery? (withLanguages(), withTypes(), etc.)? I think that's not much more complex but gives us a lot more flexibility.

src/applications/diffusion/symbol/ExternalSymbolsSource.php
3 ↗(On Diff #31430)

We should scope this as DiffusionExternalSymbolSource.

This revision now requires changes to proceed.May 27 2015, 4:54 PM

That all sounds reasonable.

I was thinking to eventually have ExternalSymbolSource go in the "Use Sources From" field in repos, although that's probably a bit far, and only relevant if there's a large number of sources, and any install with enough of them would want to have a better indexing+search tool anyway.

joshuaspence edited edge metadata.
joshuaspence added inline comments.
src/applications/diffusion/symbol/ExternalSymbolsSource.php
6 ↗(On Diff #31430)

Can we make this explicit by using withName, withLanguage, withType and withContext instead? That would fail nicer if you typo one of the keys.

src/applications/diffusion/symbol/PhpExternalSymbolsSource.php
6 ↗(On Diff #31430)

Maybe this class should provide a getSupportedLanguages() method

14 ↗(On Diff #31430)

This won't work if the symbol is from a later version of PHP than the Phabricator installation is using. It would be better if we could read this from a file like php_compat_info.json.

21 ↗(On Diff #31430)

As above.

src/applications/diffusion/symbol/PythonExternalSymbolsSource.php
15 ↗(On Diff #31430)

Typo

16 ↗(On Diff #31430)

Does the "2" here mean Python 2? If so, do we need to allow for other versions?

21 ↗(On Diff #31430)

Typo. Also, maybe move this to a JSON file for consistency with PHP symbols.

What do you think about generalizing this a little bit?

  • Instead of returning just a URI, return a list of PhabricatorRepositorySymbol objects.
  • Give PhabricatorRepositorySymbol some flags to indicate it's external and have a custom URI. (You can also use makeEphemeral() to prevent objects from being save()'d.)

I like that a little better because:

  • We don't need priority on sources.
  • We can satisfy queries more generally (e.g., did you mean "function idx() in PHP or class idx in Python"?).
  • We already have logic to show the user multiple results -- the existing "if there are no results and we're jumping, check for a PHP external" is pretty hacky.
  • We can put these in typeaheads or whatever later on without them being weird.

So maybe ExternalSymbolSource becomes DiffusionExternalSymbolQuery and has methods similar to DiffusionSymbolQuery? (withLanguages(), withTypes(), etc.)? I think that's not much more complex but gives us a lot more flexibility.

+1 on all of this.

src/applications/diffusion/symbol/PhpExternalSymbolsSource.php
14 ↗(On Diff #31430)

I might punt on this for a day or two while we finalize the framework first.

src/applications/diffusion/symbol/PythonExternalSymbolsSource.php
16 ↗(On Diff #31430)

Yes. I was thinking to use priority to solve this, but with the new suggested design, I'll just make this class return both links (Or maybe one link in cases where the docs are the same).

It also shouldn't be too hard to do the "Use Sources From" thing sooner rather than later if you want, and that might be a good approach for the Python2 vs Python3 issue. I can shoot you a skeleton for the Tokenizer/Datasource magic after this lands if you want to pursue it in the near-ish term.

avivey edited edge metadata.

Make exploration query-like, return multiple results, always search all sources.

pasted_file (181×714 px, 20 KB)

pasted_file (428×852 px, 69 KB)

src/applications/diffusion/controller/DiffusionSymbolController.php
70

I wanted to explicitly have the indexed results first, but maybe the built-ins should go first, in case we have lots of the indexed ones?

src/applications/diffusion/symbol/DiffusionExternalSymbolsSource.php
11–28

I'll replace this with

protected function buildExternalSymbol() {
  return id(new PhabricatorRepositorySymbol())
      ->setIsExternal(true)
      ->makeEphemeral();
}
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php
22 ↗(On Diff #31479)

Fixed the typo (I hope), but haven't moved the content to a file yet.

(The first example is out-of-date - it will only return 1 result for python now).

Fix jump-for-externals.

avivey edited edge metadata.
  • fix jump and helper method
epriestley edited edge metadata.

Some idle blather inline but I didn't catch anything really substantive. This looks great to me. Deal with however much of that makes sense and I think this is good to go.

src/applications/diffusion/controller/DiffusionSymbolController.php
70

Putting indexed results first feels like it's probably the better behavior to me, too.

If you have 100+ results the query and/or the way you're building or configuring your symbols probably aren't great anyway.

src/applications/diffusion/symbol/DiffusionExternalSymbolsSource.php
4

You could also just put these methods on DiffusionExternalSymbolQuery and have the sources subclass that, although I'm not really sure if that's cleaner/simpler. This design feels about equally good to me.

9

Maybe "lookupSymbols" since we return a list of matches now. Or executeQuery()?

src/applications/diffusion/symbol/PhpExternalSymbolsSource.php
3 ↗(On Diff #31480)

For consistency, prefer Diffusion... at the beginning of the classname.

7–9 ↗(On Diff #31480)

Maybe simpler with a convenience method on the Query, something like: $query->includesAnyOfTheseLanguages($list_of_languages)? Presumably all or almost all external sources will do this as their first check.

24 ↗(On Diff #31480)

I guess arguably the source should be pht()'d too since it's human-readable text, even though the translation is presumably always "PHP".

25 ↗(On Diff #31480)

pht()

26 ↗(On Diff #31480)

These should possibly be "php" and "py" or "python" (approximately, file extensions)?

I guess it doesn't really matter now, but in theory whatever we set here should be some machine-readable constant representing the language, and we'd like to be able to go from "X.java" -> language constant -> look up symbols matching that language constant.

The mapping from "X.java" to language constants is currently very crude, I think: we just take the file extension. But maybe we do slightly better than that today? Some day we should have a formal way to do this -- it's just complicated because a lot of the mapping rules we use in practice are implicit rules inside of Pygments (see T3626 for some discussion).

We could put a "language constant" -> "human language name" mapping at the display level so things could read "PHP" and "Python" instead of "php" and "python" to clean the UI up.

We also might need to eventually encode language variants, to deal with the python2 vs python3 case and the Zend/HHVM/Hack case and whatever else.

39 ↗(On Diff #31480)

pht()

src/applications/diffusion/symbol/PythonExternalSymbolsSource.php
15 ↗(On Diff #31480)

Naively, I'd expect these to be function, not builtin? Is this just a python-ism?

Today, I think it doesn't matter. In the future, we might eventually care about having fewer distinct symbol types, maybe.

This revision now requires changes to proceed.May 28 2015, 2:53 PM
src/applications/diffusion/symbol/PythonExternalSymbolsSource.php
15 ↗(On Diff #31480)

I think I've invented builtin in the ctrl-click diff. The python lexer marks all of these as nb (Name.Builtin), and under the existing state we can't search for them. We do the Python lexing ourselves, so we can just say they are all nf and stop worrying about it.

avivey edited edge metadata.
avivey marked 8 inline comments as done.
  • rename classes, methods
  • matchesAnyLanguage, matchesAnyType
  • pht
  • language names
avivey added inline comments.
src/applications/diffusion/symbol/DiffusionExternalSymbolsSource.php
5

I'd rather keep Query separate from the Source/Engine.

10

executeQuery sounds good.

src/applications/diffusion/symbol/PhpExternalSymbolsSource.php
15 ↗(On Diff #31480)

I've filed this as T8349

26 ↗(On Diff #31480)

I agree we could do better with language detection, but for now just using the common file extension would probably be good.

src/applications/diffusion/symbol/PythonExternalSymbolsSource.php
15 ↗(On Diff #31480)

just support both for now :)

epriestley edited edge metadata.
This revision was automatically updated to reflect the committed changes.