We don't detect the language of files named "Makefile" correctly. Pygments does. Its rule appears to be "if there is no dot in the filename, use the whole filename": the string "makefile" does not appear in the Pygments source, so I believe this isn't a special case.
However, our logic currently looks like this:
  $lang = detect_from_filename($filename);
  if ($lang === null) {
    $lang = detect_from_content($data);
  }
If we add the "use the whole filename" rule, we'll always get a language out of detect_from_filename(), so we'll never invoke the data-based language guesser. The logic needs to look like this instead:
  $lang = detect_from_filename($filename);
  if (!is_valid_language($lang)) {
    $data_lang = detect_from_content($data);
    if (is_valid_language($data_lang)) {
      $lang = $data_lang;
    }
  }
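As a rough, self-contained sketch of that control flow: the three helpers below are stubs written only for illustration (the tiny language list, the extension table, and the shebang check are all assumptions, not the real implementations), and the detect_language() wrapper is a hypothetical name.

```php
<?php

// Stub: a tiny hard-coded list standing in for the real supported-language list.
// Lowercasing before the lookup is an assumption made for this sketch.
function is_valid_language($lang) {
  static $known = array('php' => true, 'python' => true, 'makefile' => true);
  return $lang !== null && isset($known[strtolower($lang)]);
}

// Stub: look up the extension, otherwise fall back to the whole filename.
function detect_from_filename($filename) {
  $base = basename($filename);
  $extensions = array('php' => 'php', 'py' => 'python');
  $ext = strtolower(pathinfo($base, PATHINFO_EXTENSION));
  if (isset($extensions[$ext])) {
    return $extensions[$ext];
  }
  return $base;
}

// Stub: a very naive content sniffer that only knows two cases.
function detect_from_content($data) {
  if (preg_match('/^#!.*\bpython/', $data)) {
    return 'python';
  }
  if (strncmp($data, '<?php', 5) === 0) {
    return 'php';
  }
  return null;
}

// Hypothetical wrapper: trust the filename guess only if it names a known
// language; otherwise let the content guess override it when that is valid.
function detect_language($filename, $data) {
  $lang = detect_from_filename($filename);
  if (!is_valid_language($lang)) {
    $data_lang = detect_from_content($data);
    if (is_valid_language($data_lang)) {
      $lang = $data_lang;
    }
  }
  return $lang;
}
```

With these stubs, "Makefile" is accepted from the filename alone, while an extensionless script such as "run-tests" still reaches the content-based guesser because its filename is not a known language.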
Implementing is_valid_language() is a bit tricky. A language is valid if we support it explicitly, or if Pygments supports it. We can get a list of languages Pygments supports with pygmentize -L lexers, but we don't want to run or parse this every time. The list may also change between Pygments versions.
I think the simplest approach is:
- Just copy-paste the list out of modern Pygments into libphutil, and we'll update it once a year or whatever (as new languages are created).
- Map the lexers to human-readable language names.
- Add the custom lexers we support (Rainbow, Invisible).
We can use this list to implement T832, too.
So, specifically, this would boil down to:
- Add a public static getSupportedLanguages() to PhutilDefaultSyntaxHighlighterEngine.
- This returns a big hard-coded map of Pygments languages plus additional languages.
- Implement the modified check above ("if the filename language isn't valid...").
- Add a rule to PhutilDefaultSyntaxHighlighterEngine->getLanguageFromFilename() which uses the entire filename if nothing matches (e.g., return basename($filename); instead of return null; at the end).
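A minimal sketch of what these steps might look like. The class name ExampleSyntaxHighlighterEngine is a hypothetical stand-in for PhutilDefaultSyntaxHighlighterEngine, the three-entry Pygments map is heavily abbreviated, and the extension patterns are assumptions for illustration rather than the real rule set.

```php
<?php

// Hypothetical stand-in for PhutilDefaultSyntaxHighlighterEngine.
final class ExampleSyntaxHighlighterEngine {

  // Returns a map of lexer name => human-readable language name.
  public static function getSupportedLanguages() {
    // Abbreviated: the real map would be copied out of a modern Pygments.
    $pygments = array(
      'makefile' => 'Makefile',
      'php' => 'PHP',
      'python' => 'Python',
    );

    // Custom lexers supported in addition to Pygments.
    $custom = array(
      'rainbow' => 'Rainbow',
      'invisible' => 'Invisible',
    );

    return $pygments + $custom;
  }

  public function getLanguageFromFilename($filename) {
    // Assumed, simplified pattern rules; the real engine has its own map.
    $patterns = array(
      '/\.php$/' => 'php',
      '/\.py$/' => 'python',
    );

    foreach ($patterns as $pattern => $language) {
      if (preg_match($pattern, $filename)) {
        return $language;
      }
    }

    // New rule: use the entire filename instead of returning null.
    return basename($filename);
  }

}
```

Because getLanguageFromFilename() never returns null in this version, callers have to check the result against getSupportedLanguages() before trusting it.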
If you came here looking for a short answer:
You can update your configuration to do better:
→ →
You can add to or update it to map your favorite missing language (by filename).
The upstream is very reluctant to just update the existing hard-coded map with new values, so don't try to submit patches for it.