See T9566 and D14273. It looks like there may be something in the realm of a 5-10x performance improvement for at least some subset of inputs available by upgrading from old Pygments (circa 1.4) to newer Pygments (circa 2.0.1). No particular urgency here, but we have reasonable tools for raising advisory setup issues now and this one should be pretty easy to detect. We already do other pygmentize checks anyway.
Related Objects
- Mentioned In
  - Z1336: General Chat
  - D14297: Switching to pygmentize -V
  - Blog Post: Development Notes (2015 Week 42)
  - D14273: Skip pygmentize for large source and too long lines
  - T9566: Timeouts when highlighting source with very long lines using pygmentize
- Mentioned Here
  - T9566: Timeouts when highlighting source with very long lines using pygmentize
  - D14273: Skip pygmentize for large source and too long lines
Event Timeline
Sure:
- Make these changes in PhabricatorPygmentSetupCheck.
- Instead of running pygmentize -h to test that Pygments works, run pygmentize -V. This will let us test that the binary works, but also let us check the version.
- After doing the existing "Does pygmentize work?" check, try to parse the version:
  - If we can't figure out what version it is, assume it's a future version of Pygments that changed the format of the version string and do nothing.
  - If the version is 2.0.0 or newer (you can use version_compare() to check), do nothing.
  - If the version is older than 2.0.0, recommend upgrading to a more recent version of Pygments to get support for more languages and improve performance. Advise the user that they can safely ignore this warning if they don't want to upgrade.
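The three cases above can be sketched roughly as follows. This is only an illustration in Python, not the actual change to PhabricatorPygmentSetupCheck (which would be PHP and could use version_compare() directly). The names should_warn and MINIMUM_RECOMMENDED are made up for the example, and it assumes the version has already been extracted as a string, or is None if it could not be found.

```python
import re

# Hypothetical threshold taken from the steps above: warn below 2.0.0.
MINIMUM_RECOMMENDED = (2, 0, 0)


def should_warn(version):
    """Return True only when the version is clearly older than 2.0.0.

    `version` is a string such as "1.4", or None when no version could
    be extracted from the pygmentize output.
    """
    # Unknown or oddly formatted version: assume a future format, do nothing.
    if version is None or not re.fullmatch(r"\d+(\.\d+)*", version, re.ASCII):
        return False

    parsed = tuple(int(part) for part in version.split("."))
    # Pad short versions so that "2.0" is treated the same as "2.0.0".
    padded = parsed + (0,) * (len(MINIMUM_RECOMMENDED) - len(parsed))
    return padded < MINIMUM_RECOMMENDED
```

With this sketch, "1.4" triggers the advisory warning, while "2.0.1", "2.0", and anything unparseable do not.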
The version string for 2.0.1 is:
$ pygmentize -V
Pygments version 2.0.1, (c) 2006-2014 by Georg Brandl.
The version string for 1.4 is:
$ pygmentize -V
Pygments version 1.4, (c) 2006-2008 by Georg Brandl.
You could try to verify that the same format holds for other versions, or walk through the line history in the Pygments upstream, or reasonably just assume that the format string was probably similar for all the versions we care about.
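For illustration, here is a minimal Python sketch of pulling the version out of that output. It assumes the "Pygments version X.Y.Z, ..." format seen in the two samples above, which has not been verified against other releases. The names VERSION_PATTERN and extract_version are hypothetical, and the real check would do the equivalent in PHP.

```python
import re

# Assumes the "Pygments version <digits and dots>" prefix seen in the
# 1.4 and 2.0.1 samples; other releases may format this differently.
VERSION_PATTERN = re.compile(r"Pygments version (\d+(?:\.\d+)*)")


def extract_version(output):
    """Return the version string from `pygmentize -V` output, or None."""
    match = VERSION_PATTERN.search(output)
    return match.group(1) if match else None


print(extract_version("Pygments version 2.0.1, (c) 2006-2014 by Georg Brandl."))
print(extract_version("Pygments version 1.4, (c) 2006-2008 by Georg Brandl."))
```

Returning None for anything that does not match keeps the "unknown format, do nothing" behavior described earlier.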
From D14273, @gd also got a substantial performance improvement for long input lines applied to the Pygments upstream:
Possibly we should wait until that makes it to a release and specifically recommend the release containing that fix ("Upgrade to 2.0.5 or newer...", or whatever the next release number is) since it's probably the highest-impact user-facing change between 1.4 and today for most users, albeit only for a subset of inputs.
Is it possible to have four messages (not installed, old v1, medium v2, recent v2)? Or is that too much complexity?
Anyway, I will start on this, and then you can put it on hold until Pygments releases the new version, if you want. Is that OK?