
Use a server for pygments?
Closed, Resolved (Public)

Description

I feel like I've talked about this before, but I can't find an issue referencing it so I'm creating a new one.

We've turned off pygments on our local phab install because it's too CPU intensive to use, mostly because it spawns a new process for every file it has to colorize.

While I haven't done the work to test it, I'm guessing that pygments would be performant (enough) if we could use it as a server. And today I ran across such a server:
https://bitbucket.org/hhsprings/pygmentize_simplehttpserver/
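To illustrate why a long-lived server could avoid the per-file process cost, here is a minimal sketch of such a service. It is not the linked project's code: the `?lang=` query parameter, the function names, and port 7878 are all assumptions made for illustration. Pygments is imported on first use and then stays loaded for every later request.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def highlight_html(source, lexer_name):
    """Highlight source as HTML. Pygments is imported on the first call and
    stays cached in the running process for all later calls."""
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name

    lexer = get_lexer_by_name(lexer_name, stripnl=False)
    return highlight(source, lexer, HtmlFormatter())


def render(path, body, highlighter=highlight_html):
    """Turn one request (path with optional ?lang=..., raw body bytes)
    into response bytes. The query format is hypothetical."""
    query = parse_qs(urlparse(path).query)
    lang = query.get("lang", ["text"])[0]
    return highlighter(body.decode("utf-8"), lang).encode("utf-8")


class HighlightHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = render(self.path, self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def main(port=7878):
    # Port 7878 is an arbitrary choice; bind to localhost only.
    HTTPServer(("127.0.0.1", port), HighlightHandler).serve_forever()
```

Calling `main()` starts the server. The sketch has no error handling, so an unknown lexer name simply fails the request.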

Would there be interest in adding support for this to phabricator? I'm thinking it would be a config option to use the server rather than the binary (and you'd perhaps configure it with the port to talk to), and if you use that config option it's your responsibility to make sure that the server is running on the phabricator machine.

While I don't know how phab uses pygments, I'm hoping it would be a pretty straightforward change that just replaces a system() call with an http request.
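The shape of the proposed change can be sketched in Python (Phabricator itself is PHP, so this only illustrates the idea). The config key `pygments.server-port`, the `?lang=` query parameter, and the function names are hypothetical. Without the key, one `pygmentize` process is spawned per file; with it, the same work becomes an HTTP request to a server on the local machine.

```python
import subprocess
import urllib.parse
import urllib.request


def build_request(source, lang, port):
    """Build the POST request for a local highlight server (hypothetical wire format)."""
    query = urllib.parse.urlencode({"lang": lang})
    url = "http://127.0.0.1:%d/?%s" % (port, query)
    return urllib.request.Request(url, data=source.encode("utf-8"), method="POST")


def highlight(source, lang, config):
    """Use the server if a port is configured, else fall back to the binary."""
    port = config.get("pygments.server-port")  # hypothetical option name
    if port is None:
        # Default path: a fresh pygmentize process for every file.
        result = subprocess.run(
            ["pygmentize", "-f", "html", "-l", lang],
            input=source.encode("utf-8"),
            stdout=subprocess.PIPE,
            check=True,
        )
        return result.stdout.decode("utf-8")
    # Opt-in path: the admin is responsible for keeping the server running.
    with urllib.request.urlopen(build_request(source, lang, port), timeout=5) as response:
        return response.read().decode("utf-8")
```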

Event Timeline

csilvers created this task. Sep 7 2017, 6:29 AM

(As a side note, the way we'd support this on our ubuntu phabricator machine is to write an upstart script to start the pygments server on startup -- it would also restart the server if it died -- and to run the server under pypy for even more performance happiness.)

The only way forward on Pygments performance with upstream support is porting language lexers to PHP. Writing, deploying, and supporting an additional service is an enormous amount of work for us and not a solution I'm interested in pursuing for this problem: it burdens us with a large ongoing support cost instead of very small mostly-one-time lexer porting costs.

Some possible ways forward:

  • If writing, deploying, and maintaining this server doesn't seem like much work to you (and it may reasonably not be), write this server yourself and replace the pygmentize binary on your system with a lightweight client written in a language with a fast startup time that just calls through to the server.
  • Or, run the server and instead of writing a client, modify PhutilPygmentsSyntaxHighlighter yourself to use HTTPSFuture instead of ExecFuture and maintain a small local patch (I think this is probably about 10-20 lines of changes).
  • You can purchase a support contract and request support for language lexers in PHP, and we'll port them for you.
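The first option above, a drop-in replacement for the `pygmentize` binary, might look roughly like the following. This is a sketch under assumptions: the caller is assumed to pass only the standard `-l`, `-f`, and `-O` flags and to send the source on stdin, and the server URL format and port 7878 are hypothetical. Python is used only for illustration; a compiled language would start faster, which is the point of the suggestion.

```python
import argparse
import sys
import urllib.parse
import urllib.request


def parse_args(argv):
    """Accept the subset of pygmentize flags this shim understands."""
    parser = argparse.ArgumentParser(prog="pygmentize")
    parser.add_argument("-l", dest="lexer", default="text")
    parser.add_argument("-f", dest="formatter", default="html")
    parser.add_argument("-O", dest="options", action="append", default=[])
    return parser.parse_args(argv)


def server_url(args, port=7878):
    """Encode the flags as query parameters (hypothetical server API)."""
    query = [("lang", args.lexer), ("formatter", args.formatter)]
    query += [("opt", option) for option in args.options]
    return "http://127.0.0.1:%d/?%s" % (port, urllib.parse.urlencode(query))


def main(argv=None):
    """Forward stdin to the server and copy its response to stdout."""
    args = parse_args(sys.argv[1:] if argv is None else argv)
    request = urllib.request.Request(
        server_url(args), data=sys.stdin.buffer.read(), method="POST"
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        sys.stdout.buffer.write(response.read())
    return 0
```

Installed on the PATH under the name `pygmentize` with a call to `main()` at the bottom, a shim like this would require no change on the Phabricator side.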

Writing, deploying, and supporting an additional service is an enormous amount of work for us and not a solution I'm interested in pursuing for this problem

Totally understood, which is why I framed the proposal the way I did: 1) the server is already written, 2) this alternative is only activated via a config option, and 3) the design of the config option makes it clear (I hope) that the pygments server is outside the phabricator world and thus the client is responsible for maintaining it. Of course, there still could be a support cost associated with it. I don't know if phabricator has -- or wants -- an idea of an "unsupported" configuration option.

replace the pygmentize binary on your system with a lightweight client written in a language with a fast startup time that just calls through to the server
modify PhutilPygmentsSyntaxHighlighter yourself to use HTTPSFuture instead of ExecFuture

An intermediate option is to modify the call to ExecFuture to call curl directly, instead of going through a pygmentize wrapper.

It's very possible we'll either modify PhutilPygmentsSyntaxHighlighter as you suggest, or port a few languages to PHP ourselves. It sounds like you'd be interested in us forwarding the patch to you guys in the latter case, but not the former?

Yeah -- we'll take patches in the latter case (porting lexers to PHP) but may not have time to review them for a while. We aren't interested in any version of options/support for using HTTP to talk to Pygments in the upstream.

I'd be open to properly modularizing PhutilDefaultSyntaxHighlighterEngine->getHighlightFuture() to use modern modular/extension patterns so a third-party library could provide an HTTPBasedPygmentsLexer or similar, but only in the context of a support contract.

epriestley closed this task as Resolved. Sep 7 2017, 8:43 PM
epriestley claimed this task.

Calling this resolved since there's no remaining upstream action. Feel free to send diffs for lexers if you build them.

If writing, deploying, and maintaining this server doesn't seem like much work to you (and it may reasonably not be), write this server yourself and replace the pygmentize binary on your system with a lightweight client written in a language with a fast startup time that just calls through to the server.

This is just FYI, but I implemented such a setup. It took about half a day, so not too bad.

The result is at https://github.com/Khan/pygments-server. The README has instructions for getting it to work with phabricator.

If you like, you could point people toward this, with an explicit mention that there is no Phacility support for it (they are welcome to file bugs/questions on the github issue page). Or not. Up to you.

Perhaps another approach that could be considered is doing the colorizing in the browser using something like highlight.js (https://highlightjs.org/). This, of course, puts the CPU burden on the end user instead of the server.