Some file types (like Markdown and SVG) are text (or binary) files which can be rendered into some sort of document for presentation. For example, an SVG file is a text file but it can be rendered into an image.
In particular, we can render these files when:
- viewing a file in Diffusion;
- viewing a file in Files;
- viewing an attachment on any object, maybe with {Fxxx, ...} options;
- perhaps in Pholio, although this is maybe not v1; and
- we could automatically expose these to the inline {{{ ... }}} remarkup renderers.
Additionally, some files can be rendered into a visual diff format in a richer way than just showing two renderings side-by-side. We can potentially render a rich diff when:
- viewing a commit in Diffusion; or
- viewing a revision in Differential.
A good example of a rich diff is GitHub's image diffing feature, which has modes to overlay two images on top of one another:
https://help.github.com/articles/rendering-and-diffing-images/
Another good example is GitHub's STL model support:
- Rendering models: https://blog.github.com/2013-04-09-stl-file-viewing/
- Diffing models: https://help.github.com/articles/3d-file-viewer/
Even when we can't render a dedicated diff of an file, rendering the old file and the new file may still be better than rendering the raw text diff, although perhaps not always.
For example, one document type where a rendered old vs new may be worse than a text diff is Markdown: if a large section was deleted, added, or moved, the two documents won't line up at all and it may be very difficult to tell what was changed, while a plain text diff may actually be more clear.
A possible related concern is that some files might diff better in Diffusion/Differential using the prose diff algorithm than using the source diff algorithm.
The first step here is to add some sort of modular DocumentRenderingEngine (or something) which can elect to render a given document or a diff between two documents. Files, attachments, the lightbox, Diffusion file views, Diffusion commit diffs, and Differential revisions should all engage this engine, test for support, and provide the user some way to select between rendering options.
In particular, multiple engines should be able to support a given document and the user should be able to choose between rendering engines. One example where this might be useful is is formats like SVG: SVG can be rendered as an image, or it could be rendered by a generic XML renderer. A binary file format could be rendered as an actual document (say, an image or 3D model) or a hexdump -C style listing, or a decoded format-specific text variant.
For sophisticated rendered diffs, we will probably need to sacrifice inline comment support in most cases, at least for now. It's possible that we could eventually support different commenting modes on different renderings but this seems like a substantial undertaking. Comments can still work on the raw text version, we just can't reasonably map a comment on line 32 of an XML file into a point in 3D space when the file is rendered as a model. We could put comments on other views of the file above/below the rendering or just show an indicator that they exist.
Once the engine infrastructure works, we can add support for different file types.
A general concern here is that many file types are complicated to render and may not have secure rendering engines. SVG is a good example here: although browsers can all render SVG, rendering SVG safely is very hard (see T6445). LaTeX is similar (T6609) with major security concerns in the most popular rendering engines. We've previously rewritten figlet and cowsay with in-process/first-party code (see T7785) and removed Graphviz support because we were unable to secure it or port it (T9408).
Many rendering tools were written assuming that input documents are safe and were written by a trusted local user. Often, these tools can be exploited when that assumption is violated with a malicious input. (From T9408, a researcher found a buffer overflow accessible from the command line in figlet, a toy program for drawing big letter using keyboard symbols.)
Thus, upstreamable rendering implementations for some file types may be substantial undertakings where we either need to carefully audit a large external codebase or write a first-party version of the rendering pipeline. For specialized diffs, it's likely that we need first-party pipelines in most cases.
The existence of this engine and pipeline will open up opportunities for third-party code to provide insecure engines, but approaches based largely on faith that a 200,000 line external has no attack surface isn't upstreamable, so just keep it to yourself. 🙈 🙉 🙊
In the past, we've seen interest in:
- (T2849) Markdown
- (T2849) reStructuredText
- (T6609, PHI260, PHI442) LaTeX
- (T6445, PHI411) SVG
- (T4056) PDF
I don't currently plan to implement any of these since they're all complicated and have little or no mana behind them relative to implementation complexity.
These formats also seem conceptually reasonable, although not necessarily valuable:
- Remarkup (standalone is easy; diffs are hard).
- Normal images like PNG, GIF and JPEG (standalone is trivial but should probably be implemented via engine infrastructure; diffs aren't trivial but may be valuable).
- Binary as hexdump (standalone and diff are both easy but value is questionable).
- Prose diff, maybe.
- Raw binary (it would be nice if the fallback renderer that just says "we can't do anything with this file" was modular, not special cased).
- XML-as-DOM? JSON/YAML-as-DOM? We could render these formats in an expandable tree view and perform more-semantic diffing on them, but it's not clear that this is very useful relative to the amount of work it likely involves. Viewing these files as text is usually pretty good already, assuming they've been beautified by whatever wrote them. Maybe "Beautify JSON" is reasonable, but it's not clear this really solves a problem anyone has.
See PHI452. An install would like support for PCB schematic files (.sch); layer files (.brd, GDS, Gerber).
See PHI452. See PHI456. Multiple installs are interested in Jupyter notebooks.