Port the Java fragment lexer to PHP
ClosedPublic

Authored by epriestley on Sep 4 2018, 9:01 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T13195. Ref T3130. Ports the lexer for Java code from Python to PHP.

This isn't 100% faithful to the behavior in Pygments, but it's pretty close. We can improve it as the need arises.
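The ported source is not reproduced on this page. As a rough illustration of the general technique (an ordered table of regex rules tried at the current offset, in the style of a Pygments regex lexer), here is a minimal sketch in Python. The rule table, token names, and the `lex` function are hypothetical and heavily reduced; they are not taken from PhutilJavaFragmentLexer or from Pygments.

```python
import re

# Hypothetical, heavily reduced rule table: (compiled regex, token type).
# Order matters: the first rule that matches at the current offset wins.
RULES = [
    (re.compile(r'\s+'), 'whitespace'),
    (re.compile(r'//[^\n]*'), 'comment'),
    (re.compile(r'/\*.*?\*/', re.S), 'comment'),
    # A backslash followed by any character is consumed as one unit,
    # so an escaped quote does not end the string.
    (re.compile(r'"(?:\\.|[^"\\\n])*"'), 'string'),
    (re.compile(r'\b(?:abstract|class|final|private|public|return|static|strictfp|void)\b'), 'keyword'),
    (re.compile(r'[A-Za-z_$][\w$]*'), 'name'),
    (re.compile(r'\d+'), 'number'),
]


def lex(source):
    """Return a list of (token_type, text) pairs that covers all of `source`."""
    tokens = []
    pos = 0
    while pos < len(source):
        for pattern, kind in RULES:
            match = pattern.match(source, pos)
            if match:
                tokens.append((kind, match.group(0)))
                pos = match.end()
                break
        else:
            # No rule matched: emit a single character so the lexer
            # always makes progress, even on input it does not understand.
            tokens.append(('other', source[pos]))
            pos += 1
    return tokens
```

Because unmatched characters fall through to a catch-all token, a lexer built this way can be imperfect on unusual input without failing, which fits the "pretty close, improve as needed" approach described above.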

Test Plan

Screen Shot 2018-09-04 at 1.55.08 PM.png (1×858 px, 114 KB)

Diff Detail

Repository
rPHU libphutil
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

  • Fix escaping of backslashes in string rules.

The major motivation here is to avoid the Python startup overhead when highlighting small snippets. Highlighting one line of Java by calling out to pygmentize takes a minimum of ~60ms on my machine, compared to <1ms with a PHP lexer.

From what I can tell, we're also slightly faster on very large files: locally, a 3K-line file I pulled out of ElasticSearch takes ~200ms with PHP vs. ~300ms with Python, both including full startup costs. The difference is starkest for many small files, such as highlighting a lot of little snippets when rendering inline comments into mail.
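For anyone who wants to check the startup-overhead claim on their own machine, here is a minimal timing sketch in Python. It does not call pygmentize or the PHP lexer: `spawn_interpreter` only starts a Python interpreter that exits immediately, as a lower bound on the cost of shelling out, and `in_process_work` is a trivial stand-in for in-process highlighting. All function names are hypothetical, and the numbers will differ from the ones quoted above.

```python
import subprocess
import sys
import time


def time_call(fn, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` calls of fn."""
    best = float('inf')
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def spawn_interpreter():
    # Stand-in for shelling out to a highlighter: start Python, do nothing, exit.
    subprocess.run([sys.executable, '-c', 'pass'], check=True)


def in_process_work():
    # Stand-in for highlighting a one-line snippet without leaving the process.
    'int x = 1;'.split()


if __name__ == '__main__':
    print(f'subprocess startup: {time_call(spawn_interpreter) * 1000:.2f} ms')
    print(f'in-process call:    {time_call(in_process_work) * 1000:.4f} ms')
```

The per-call process cost is fixed, so it dominates when there are many small snippets and matters much less for a single large file.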

Assuming most of this came from the Pygments source, can you link to that file from this revision for reference?

src/lexer/PhutilJavaFragmentLexer.php
Line 42

TIL about the strictfp declaration in Java.

Lines 74–121

I'm not even going to pretend to review the accuracy of these regexes as long as a test snippet looks reasonable.

This revision is now accepted and ready to land. Sep 5 2018, 10:39 PM
This revision was automatically updated to reflect the committed changes.