This appears to be ~1.5-2.0x faster than pygments on my machine, which isn't as good as I was hoping for but will help. (This is ~330 ms for a 4400-line file I chose at random.)
Details
- Reviewers
  - epriestley
- Group Reviewers
  - Blessed Reviewers
- Commits
  - rPHU9b2f35480dc0: Implementation of PhutilLexer for Python
Loaded a Python paste and saw highlights that approximately (or maybe exactly) match the pygments version.
Diff Detail
- Repository
  - rPHU libphutil
- Lint
  - Lint Skipped
- Unit
  - Tests Skipped
Event Timeline
src/lexer/PhutilPythonFragmentLexer.php | ||
---|---|---|
2 | test comment, please ignore |
This stuff is pretty hard to review comprehensively, but we can tweak it and start adding unit tests or whatever if there are issues. It looks structurally correct to me, and I didn't catch anything suspicious looking.
Thanks for putting this together!
src/lexer/PhutilPythonFragmentLexer.php | ||
---|---|---|
219 | I was just looking over this due to a recent mention in a chat, and while I don't understand what's going on here, I can't help but wonder if this should be \\\\" instead of \\\\\'. |
I'm not entirely sure how to test the rule, but the upstream seems to use ":
If anyone can show me a Python file where this matters, I'm happy to fix it and put some test coverage on it.
This was my best guess:
print r"Tom \"Maverick\" Cruise"
print r'Tom \'Maverick\' Cruise'
...but neither string highlights specially and the runtime behavior just makes me confused about why Python has this feature:
$ python maverick.py
Tom \"Maverick\" Cruise
Tom \'Maverick\' Cruise
...what? Why?
Try taking out the r prefix. It's for "raw strings", in which backslashes are interpreted literally. This is mostly used for regexps (Python does not support /.../ regexp literals, which are how "raw" strings are implemented in most languages).
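To make the difference concrete, here is a small sketch using the same string as the test above. The variable names are only for illustration.

```python
import re

# Same text, written once as a raw literal and once as a normal literal.
raw = r"Tom \"Maverick\" Cruise"
normal = "Tom \"Maverick\" Cruise"

print(raw)     # Tom \"Maverick\" Cruise  (backslashes stay in the string)
print(normal)  # Tom "Maverick" Cruise    (backslashes are consumed as escapes)

# The usual reason to reach for a raw string: regexp patterns, where each
# backslash would otherwise have to be doubled.
version = re.compile(r"\d+\.\d+")
```

Note that even in the raw literal, the backslash still stops the following quote from ending the string; it just isn't removed from the result.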
Oh, my assumption was that the raw part was important because of the comment ("// included here for raw strings").
Non-raw strings seem to work correctly without changes (see inline about why):
src/lexer/PhutilPythonFragmentLexer.php | ||
---|---|---|
306 | Additional normal escaping rules get merged in here, and seem to handle things in normal strings. |
OK, as I say I have no idea what the line in question is doing. But I do think it's probably wrong. It may just be wrong in a way that doesn't matter.
Yeah, I think it's wrong but I'm not sure if the right version is to use a " or to remove it completely, since I can't find an input which it does anything for.
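One way to look for an input where the quote character matters is to model the rule in isolation. The sketch below is a toy scanner, not PhutilPythonFragmentLexer or pygments: `string_end`, `DQ_RULE` and `SQ_RULE` are hypothetical names, and it assumes the raw-string escape rule is the only rule that can consume a backslash-quote pair inside a double-quoted string.

```python
import re

# Hypothetical escape rules for the body of a double-quoted string.
DQ_RULE = r'\\\\|\\"'   # backslash-backslash, or backslash-doublequote
SQ_RULE = r"\\\\|\\'"   # backslash-backslash, or backslash-singlequote

def string_end(body, escape_rule):
    """Scan `body` (the text after an opening double quote) and return the
    index of the quote that closes the string, or -1 if it never closes."""
    rules = re.compile("|".join([
        "(?P<escape>" + escape_rule + ")",  # escape pair, tried first
        r'(?P<close>")',                    # closing quote
        r'(?P<text>[^\\"]+)',               # ordinary characters
        r"(?P<lone>\\)",                    # backslash no other rule claimed
    ]))
    pos = 0
    while pos < len(body):
        m = rules.match(body, pos)
        if m is None:
            return -1
        if m.lastgroup == "close":
            return pos
        pos = m.end()
    return -1

# Everything after the opening quote of: r"Tom \"Maverick\" Cruise"
body = r'Tom \"Maverick\" Cruise"'

print(string_end(body, DQ_RULE))  # 23, the real closing quote
print(string_end(body, SQ_RULE))  # 5, the string ends at the first \"
```

Under those assumptions the single-quote variant cuts the string short. If other rules also consume the pair, as the inline on line 306 suggests for normal strings, the difference would be masked, which would be consistent with not finding an input where it matters.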