We don't correctly detect the language of files named "Makefile". Pygments does. Its rule appears to be "if there is no dot in the filename, use the whole filename": the string "makefile" does not appear in the Pygments source, so I believe this isn't a special case.
However, our logic currently looks like this:
  $lang = detect_from_filename($filename);
  if ($lang === null) {
    $lang = detect_from_content($data);
  }
If we add the "use the whole filename" rule, we'll always get a language out of `detect_from_filename()`, so we'll never invoke the data-based language guesser. The logic needs to look like this instead:
  $lang = detect_from_filename($filename);
  if (!is_valid_language($lang)) {
    $data_lang = detect_from_content($data);
    if (is_valid_language($data_lang)) {
      $lang = $data_lang;
    }
  }
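To make the control flow concrete, here is a minimal runnable sketch of the revised check. All three helpers are hypothetical stand-ins written for this example (a three-entry language list, an extension-or-whole-filename rule, and a toy shebang guesser); they are not the real libphutil implementations.

```php
<?php

// Hypothetical stand-ins for illustration only; not the real libphutil code.
function is_valid_language($lang) {
  // Tiny stand-in for the full supported-language list.
  static $valid = array('php' => true, 'python' => true, 'makefile' => true);
  return $lang !== null && isset($valid[strtolower($lang)]);
}

function detect_from_filename($filename) {
  $base = basename($filename);
  $dot = strrpos($base, '.');
  if ($dot === false) {
    // No dot: use the whole filename, as Pygments appears to.
    return strtolower($base);
  }
  // Simplification: treat the extension itself as the language name.
  return strtolower(substr($base, $dot + 1));
}

function detect_from_content($data) {
  // Toy guesser: only recognizes a Python shebang line.
  if (preg_match('@^#!.*\bpython@', $data)) {
    return 'python';
  }
  return null;
}

function detect_language($filename, $data) {
  $lang = detect_from_filename($filename);
  if (!is_valid_language($lang)) {
    $data_lang = detect_from_content($data);
    if (is_valid_language($data_lang)) {
      $lang = $data_lang;
    }
  }
  return $lang;
}

echo detect_language('Makefile', "all:\n\techo hi\n"), "\n";   // makefile
echo detect_language('deploy', "#!/usr/bin/env python\n"), "\n"; // python
```

Note that when neither guess is valid, `$lang` keeps the filename-based value, which matches the logic as written above.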
Implementing `is_valid_language()` is a bit tricky. A language is valid if we support it explicitly, or if Pygments supports it. We can get a list of languages Pygments supports with `pygmentize -L lexers`, but we don't want to run and parse this every time. The list may also change between Pygments versions.
I think the simplest approach is:
- Just copy-paste the list out of modern Pygments into libphutil, and we'll update it once a year or whatever (as new languages are created).
- Map the lexers to human-readable language names.
- Add the custom lexers we support (Rainbow, Invisible).
We can use this list to implement T832, too.
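As a rough illustration of how the list could be produced once before being pasted in, the sketch below parses text shaped like `pygmentize -L lexers` output into an alias-to-name map. The sample text is an assumption about that output's layout (an alias line starting with `*`, followed by an indented description line); check it against your installed Pygments version. The function name `parse_pygments_lexers()` and the keys used for the custom lexers are hypothetical.

```php
<?php

// Hypothetical helper: turn text shaped like `pygmentize -L lexers` output
// into a map of lexer alias => human-readable name.
function parse_pygments_lexers($text) {
  $map = array();
  $aliases = array();
  foreach (explode("\n", $text) as $line) {
    $matches = null;
    if (preg_match('/^\* (.+):\s*$/', $line, $matches)) {
      // Alias line, e.g. "* actionscript, as:".
      $aliases = array_map('trim', explode(',', $matches[1]));
    } else if ($aliases &&
      preg_match('/^\s+(.+?)(?:\s+\(filenames .*\))?\s*$/', $line, $matches)) {
      // Description line: keep the name, drop the "(filenames ...)" suffix.
      foreach ($aliases as $alias) {
        $map[$alias] = $matches[1];
      }
      $aliases = array();
    }
  }
  return $map;
}

// Assumed sample of the listing format; not captured from a real run.
$sample = <<<EOTEXT
Lexers:
~~~~~~~
* abap:
    ABAP (filenames *.abap, *.ABAP)
* actionscript, as:
    ActionScript (filenames *.as)
EOTEXT;

// Add the custom lexers on top; these keys are assumptions.
$languages = parse_pygments_lexers($sample) + array(
  'rainbow' => 'Rainbow',
  'invisible' => 'Invisible',
);

var_export($languages);
```

The result would be pasted into the source as a literal array, so nothing needs to run `pygmentize` at request time.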
So, specifically, this would boil down to:
- Add a `public static function getSupportedLanguages()` to `PhutilDefaultSyntaxHighlighterEngine`.
- This returns a big hard-coded map of Pygments languages plus additional languages.
- Implement the modified check above ("if the filename language isn't valid...").
- Add a rule to `PhutilDefaultSyntaxHighlighterEngine->getLanguageFromFilename()` which uses the entire filename if nothing matches (e.g., `return basename($filename);` instead of `return null;` at the end).
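The steps above might look roughly like the sketch below. It uses a hypothetical `ExampleSyntaxHighlighterEngine` rather than the real class, and `isValidLanguage()`, the extension table, and the map entries are all made up for illustration. One detail it surfaces: `basename()` preserves case, so if the map keys are lowercase (an assumption here), the validity check needs to lowercase the filename-derived value.

```php
<?php

// Hypothetical class for illustration; not the real libphutil engine.
final class ExampleSyntaxHighlighterEngine {

  public static function getSupportedLanguages() {
    // The proposal is a large hard-coded map (Pygments lexers plus custom
    // ones); only a few assumed entries are shown here.
    return array(
      'makefile' => 'Makefile',
      'php' => 'PHP',
      'rainbow' => 'Rainbow',
      'invisible' => 'Invisible',
    );
  }

  public static function isValidLanguage($lang) {
    if ($lang === null) {
      return false;
    }
    $supported = self::getSupportedLanguages();
    // Case-insensitive lookup, since filenames like "Makefile" are mixed case.
    return isset($supported[strtolower($lang)]);
  }

  public function getLanguageFromFilename($filename) {
    // Assumed extension rules; the real method has its own.
    static $extensions = array('php' => 'php', 'mk' => 'makefile');
    $base = basename($filename);
    $ext = strtolower(pathinfo($base, PATHINFO_EXTENSION));
    if (isset($extensions[$ext])) {
      return $extensions[$ext];
    }
    // New rule: fall back to the entire filename instead of null.
    return $base;
  }
}

$engine = new ExampleSyntaxHighlighterEngine();
$lang = $engine->getLanguageFromFilename('src/Makefile');
var_dump($lang); // string(8) "Makefile"
var_dump(ExampleSyntaxHighlighterEngine::isValidLanguage($lang)); // bool(true)
```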
-----------
= If you came here looking for a short answer:
You can update your configuration to do better:
{nav name=Config, icon=sliders > name=Syntax Highlighting, icon=code > name=syntax.filemap, icon=pencil-square-o}
You can add or update entries there to map your favorite missing language by filename.
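As I understand it, `syntax.filemap` maps regular expressions (matched against the filename) to language names; check the option's description in your install before relying on that. Under that assumption, the sketch below shows what an added entry for "Makefile" could look like and how such a map would be applied. The `language_from_filemap()` helper, the Makefile pattern, and the `make` language name are illustrative, not taken from the upstream code.

```php
<?php

// Assumed shape of a syntax.filemap value: regular expression => language.
// The "Makefile" entry and the "make" language name are illustrative.
$filemap = array(
  '@\.arcconfig$@' => 'json',
  '@(^|/)Makefile$@' => 'make',
);

// Hypothetical helper showing how such a map could be applied:
// the first pattern that matches the filename wins.
function language_from_filemap(array $filemap, $filename) {
  foreach ($filemap as $pattern => $language) {
    if (preg_match($pattern, $filename)) {
      return $language;
    }
  }
  return null;
}

echo language_from_filemap($filemap, 'src/Makefile'), "\n"; // make
var_dump(language_from_filemap($filemap, 'Makefile.bak'));  // NULL
```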
The upstream is very reluctant to just update the existing hard-coded map with new values, so don't try to submit patches for it.