Plans: Rich presentation and diff rendering pipelines for various file types
Open, NormalPublic
Actions

Assigned To

None

Authored By

	epriestley
	Mar 14 2018, 6:43 AM

Description

Some file types (like Markdown and SVG) are text (or binary) files which can be rendered into some sort of document for presentation. For example, an SVG file is a text file but it can be rendered into an image.

In particular, we can render these files when:

viewing a file in Diffusion;
viewing a file in Files;
viewing an attachment on any object, maybe with {Fxxx, ...} options;
perhaps in Pholio, although this is maybe not v1; and
we could automatically expose these to the inline {{{ ... }}} remarkup renderers.

Additionally, some files can be rendered into a visual diff format in a richer way than just showing two renderings side-by-side. We can potentially render a rich diff when:

viewing a commit in Diffusion; or
viewing a revision in Differential.

A good example of a rich diff is GitHub's image diffing feature, which has modes to overlay two images on top of one another:

https://help.github.com/articles/rendering-and-diffing-images/

Another good example is GitHub's STL model support:

Rendering models: https://blog.github.com/2013-04-09-stl-file-viewing/
Diffing models: https://help.github.com/articles/3d-file-viewer/

Even when we can't render a dedicated diff of an file, rendering the old file and the new file may still be better than rendering the raw text diff, although perhaps not always.

For example, one document type where a rendered old vs new may be worse than a text diff is Markdown: if a large section was deleted, added, or moved, the two documents won't line up at all and it may be very difficult to tell what was changed, while a plain text diff may actually be more clear.

A possible related concern is that some files might diff better in Diffusion/Differential using the prose diff algorithm than using the source diff algorithm.

The first step here is to add some sort of modular DocumentRenderingEngine (or something) which can elect to render a given document or a diff between two documents. Files, attachments, the lightbox, Diffusion file views, Diffusion commit diffs, and Differential revisions should all engage this engine, test for support, and provide the user some way to select between rendering options.

In particular, multiple engines should be able to support a given document and the user should be able to choose between rendering engines. One example where this might be useful is is formats like SVG: SVG can be rendered as an image, or it could be rendered by a generic XML renderer. A binary file format could be rendered as an actual document (say, an image or 3D model) or a hexdump -C style listing, or a decoded format-specific text variant.

For sophisticated rendered diffs, we will probably need to sacrifice inline comment support in most cases, at least for now. It's possible that we could eventually support different commenting modes on different renderings but this seems like a substantial undertaking. Comments can still work on the raw text version, we just can't reasonably map a comment on line 32 of an XML file into a point in 3D space when the file is rendered as a model. We could put comments on other views of the file above/below the rendering or just show an indicator that they exist.

Once the engine infrastructure works, we can add support for different file types.

A general concern here is that many file types are complicated to render and may not have secure rendering engines. SVG is a good example here: although browsers can all render SVG, rendering SVG safely is very hard (see T6445). LaTeX is similar (T6609) with major security concerns in the most popular rendering engines. We've previously rewritten figlet and cowsay with in-process/first-party code (see T7785) and removed Graphviz support because we were unable to secure it or port it (T9408).

Many rendering tools were written assuming that input documents are safe and were written by a trusted local user. Often, these tools can be exploited when that assumption is violated with a malicious input. (From T9408, a researcher found a buffer overflow accessible from the command line in figlet, a toy program for drawing big letter using keyboard symbols.)

Thus, upstreamable rendering implementations for some file types may be substantial undertakings where we either need to carefully audit a large external codebase or write a first-party version of the rendering pipeline. For specialized diffs, it's likely that we need first-party pipelines in most cases.

The existence of this engine and pipeline will open up opportunities for third-party code to provide insecure engines, but approaches based largely on faith that a 200,000 line external has no attack surface isn't upstreamable, so just keep it to yourself. 🙈 🙉 🙊

In the past, we've seen interest in:

(T2849) Markdown
(T2849) reStructuredText
(T6609, PHI260, PHI442) LaTeX
(T6445, PHI411) SVG
(T4056) PDF

I don't currently plan to implement any of these since they're all complicated and have little or no mana behind them relative to implementation complexity.

These formats also seem conceptually reasonable, although not necessarily valuable:

Remarkup (standalone is easy; diffs are hard).
Normal images like PNG, GIF and JPEG (standalone is trivial but should probably be implemented via engine infrastructure; diffs aren't trivial but may be valuable).
Binary as hexdump (standalone and diff are both easy but value is questionable).
Prose diff, maybe.
Raw binary (it would be nice if the fallback renderer that just says "we can't do anything with this file" was modular, not special cased).
XML-as-DOM? JSON/YAML-as-DOM? We could render these formats in an expandable tree view and perform more-semantic diffing on them, but it's not clear that this is very useful relative to the amount of work it likely involves. Viewing these files as text is usually pretty good already, assuming they've been beautified by whatever wrote them. Maybe "Beautify JSON" is reasonable, but it's not clear this really solves a problem anyone has.

See PHI452. An install would like support for PCB schematic files (.sch); layer files (.brd, GDS, Gerber).

See PHI452. See PHI456. Multiple installs are interested in Jupyter notebooks.

Revisions and Commits

rP Phabricator
		D19457	rPa894c9993512 Add "max-width: 100%;" to stop large images from overflowing the new rendering…
		D19414	rPafc3099ee785 Add a view option to disable blame in Diffusion and fix some view transition…
		D19378	rP665529ab60a1 Restore coverage reporting to Diffusion browse UI
		D19377	rP21bb0215dbc7 Remove obsoleted "diffusion-browse-file" behavior for coverage
		D19350	rP37a03402bc44 When following a link to a particular line ("/example.txt$12"), scroll to that…
		D19349	rP5b3a351852a7 Use pseudoelements, not Zero Width Space, to implement copy/paste behavior in…
		D19348	rPc5c53e277a28 Make line selection in source code views less fragile and more consistent
		D19346	rP55619e89642b Restore an explicit white background color to files in Paste
		D19314	rP472bc3d90a59 Colorize lines in blame under DocumentEngine, to show relative age of changes
		D19313	rPcf75d63b49a7 When lines 12, 13, 14, etc all blame to the same change, only show it once
		D19312	rPeb80f0a2d95d When you swap between document rendering engines, populate or redraw blame if…
		D19311	rPeca7dc25f20b Use javelin_tag(), not phutil_tag(), to render revision blame tooltips properly
		D19310	rP11664277b33d Make DocumentEngine source line linking behavior better when blame is shown
		D19309	rP09c6d42b9599 Mostly make blame work with DocumentEngine
		D19307	rP90a614778c07 Make repository symbol references work with DocumentEngine
		D19306	rP0363febeb2c6 Disable default syntax highlighting for large files in DocumentEngine
		D19305	rP6dea2ba3b37f Fix DocumentEngine line behaviors in Diffusion
		D19302	rP1fde4a945061 Move Diffusion browse rendering to DocumentEngine, breaking almost all features
		D19301	rP245132a0b22e Pull file Document Engine rendering out of "Files" application controllers
		D19300	rP7d4e25614d00 Remove the ability to disable blame in Diffusion
		D19274	rP7189cb7ba816 Support text encoding and syntax highlighting options in document rendering
		D19273	rPccbc8a430f62 Make Jupyter notebooks use the fast builtin Python highlighter
	Audited	D19260	rPc5b244bfd04f Render directly embedded image data represented as a string in Jupyter notebooks
	Audited	D19259	rPb7d3101e7cbf Minor document rendering fixes: dropdown for synchronous files, URI…
		D19258	rP38999e25ac59 Support logged-out access to the document rendering endpoint
		D19256	rPbba1b185f834 Improve minor client behaviors for document rendering
		D19254	rPd2727d24da6e Add an abstract "Text" document engine and a "Source" document engine
		D19253	rPcbf3d3c37154 Add a very rough, proof-of-concept Jupyter notebook document engine
		D19252	rPfb4ce851c4ae Add a PDF document "rendering" engine
		D19251	rP8b658706a8c4 Add a basic Remarkup document rendering engine
		D19239	rP4aafce686202 Add filesize limits for document rendering engines and support partial/complete…
		D19238	rPf646153f4dbe Add an async driver for document rendering and a crude "Hexdump" document engine
		D19237	rP01f22a8d06c9 Roughly modularize document rendering in Files

Related Objects
Search...

		Status	Assigned	Task
		Open	None	T13105 Plans: Rich presentation and diff rendering pipelines for various file types
		Open	None	T13126 Plans: Diffusion cleanup for document engine

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

mydeveloperday awarded a token.Mar 22 2018, 9:59 AM

epriestley added a revision: D19251: Add a basic Remarkup document rendering engine.Mar 23 2018, 11:04 AM

I really like this, Remarkup idea, I think there are many text based files that people add to reviews which would be easily transformed to something more readable by being transfer directly to Remarkup, even for example something as simple as CSV,

but also potentially something much more complex like MIF (Framemaker) or even the HTML that might be part of a HtmlHelp file. so I could see using this rendering engine as a basis for other possible engines

I've personally already customized an engine to transform a proprietary graph format into an SVG diagram with some success, I'm interested to see how easy it will be for the render engine to be used in Diffusion and Differential

Super excited about this plan..

epriestley mentioned this in T13112: Safari, PDFs, and Content-Security-Policy interact oddly.Mar 23 2018, 11:47 AM

epriestley added a revision: D19252: Add a PDF document "rendering" engine.Mar 23 2018, 12:10 PM

epriestley added a revision: D19253: Add a very rough, proof-of-concept Jupyter notebook document engine.Mar 23 2018, 2:06 PM

epriestley added a commit: rP8b658706a8c4: Add a basic Remarkup document rendering engine.Mar 23 2018, 2:07 PM

epriestley added a commit: rPfb4ce851c4ae: Add a PDF document "rendering" engine.Mar 23 2018, 2:14 PM

epriestley added a commit: rPcbf3d3c37154: Add a very rough, proof-of-concept Jupyter notebook document engine.

epriestley added a revision: D19254: Add an abstract "Text" document engine and a "Source" document engine.Mar 23 2018, 2:41 PM

epriestley added a commit: rPd2727d24da6e: Add an abstract "Text" document engine and a "Source" document engine.Mar 23 2018, 7:28 PM

epriestley added a revision: D19256: Improve minor client behaviors for document rendering.Mar 23 2018, 8:49 PM

epriestley added a commit: rPbba1b185f834: Improve minor client behaviors for document rendering.Mar 23 2018, 9:09 PM

epriestley mentioned this in 2018 Week 12 (Late March).Mar 23 2018, 9:49 PM

epriestley added a revision: D19258: Support logged-out access to the document rendering endpoint.Mar 26 2018, 2:16 PM

epriestley added a revision: D19259: Minor document rendering fixes: dropdown for synchronous files, URI normalization for default renderers.Mar 26 2018, 2:20 PM

NMeral added a subscriber: NMeral.Mar 26 2018, 2:22 PM

NMeral removed a subscriber: NMeral.

epriestley added a revision: D19260: Render directly embedded image data represented as a string in Jupyter notebooks.Mar 26 2018, 2:23 PM

Changeset rPfb4ce851c4ae: Add a PDF document "rendering" engine allows to view PDFs from the Files application. However, {Fxxx, ...} references to a PDF file in Phriction still only allow to download the file or open a lightbox with only an icon (again with a download link).

Are there plans to add PDF rendering similar to Files also to Phriction ? This would be really helpful.

epriestley added a commit: rP38999e25ac59: Support logged-out access to the document rendering endpoint.Mar 28 2018, 10:01 PM

epriestley added a commit: rPb7d3101e7cbf: Minor document rendering fixes: dropdown for synchronous files, URI….Mar 28 2018, 10:07 PM

epriestley added a commit: rPc5b244bfd04f: Render directly embedded image data represented as a string in Jupyter notebooks.

epriestley added a revision: D19273: Make Jupyter notebooks use the fast builtin Python highlighter.Mar 30 2018, 4:48 PM

epriestley added a revision: D19274: Support text encoding and syntax highlighting options in document rendering.Mar 30 2018, 6:15 PM

epriestley added a commit: rPccbc8a430f62: Make Jupyter notebooks use the fast builtin Python highlighter.Mar 30 2018, 6:26 PM

epriestley added a commit: rP7189cb7ba816: Support text encoding and syntax highlighting options in document rendering.Mar 30 2018, 6:29 PM

epriestley added a revision: D19300: Remove the ability to disable blame in Diffusion.Apr 5 2018, 2:33 PM

epriestley added a revision: D19301: Pull file Document Engine rendering out of "Files" application controllers.Apr 5 2018, 4:17 PM

Junk I'm about to break in Diffusion (but plan to restore):

Line numbers are a mess: they don't read off DiffusionRequest properly and the URI doesn't update correctly.
Disable syntax highlighting for very large files (should move to document engine)
Symbol linking
Mobile?
Blame
Also blame for large files?
Coverage
Lint messages
Many dangling/leftover behaviors and maybe styles?

If a file is non-UTF8, we should try the repository's "Encoding" as a default (or maybe just remove this feature).

epriestley mentioned this in T3788: Improve encoding management in Diffusion.Apr 5 2018, 5:51 PM

epriestley mentioned this in T8407: Control diffusion's syntax highlighting.

epriestley mentioned this in T2442: Show images in Diffusion diffs.Apr 5 2018, 5:54 PM

epriestley added a revision: D19302: Move Diffusion browse rendering to DocumentEngine, breaking almost all features.Apr 5 2018, 6:40 PM

Compatibility:

The secret $1-2,7-9 syntax for highlighting multiple noncontiguous blocks of lines in Diffusion is no longer supported.

epriestley added a revision: D19305: Fix DocumentEngine line behaviors in Diffusion.Apr 8 2018, 1:33 PM

epriestley mentioned this in T7895: Make HIGHLIGHT_BYTE_LIMIT configurable.Apr 8 2018, 3:05 PM

epriestley added a revision: D19306: Disable default syntax highlighting for large files in DocumentEngine.Apr 8 2018, 3:13 PM

epriestley added a revision: D19307: Make repository symbol references work with DocumentEngine.Apr 8 2018, 4:34 PM

epriestley added a revision: D19309: Mostly make blame work with DocumentEngine.Apr 8 2018, 7:53 PM

epriestley added a commit: rP7d4e25614d00: Remove the ability to disable blame in Diffusion.Apr 9 2018, 11:45 AM

epriestley added a commit: rP245132a0b22e: Pull file Document Engine rendering out of "Files" application controllers.

epriestley added a commit: rP1fde4a945061: Move Diffusion browse rendering to DocumentEngine, breaking almost all features.

epriestley added a commit: rP6dea2ba3b37f: Fix DocumentEngine line behaviors in Diffusion.

epriestley added a commit: rP0363febeb2c6: Disable default syntax highlighting for large files in DocumentEngine.

epriestley added a commit: rP90a614778c07: Make repository symbol references work with DocumentEngine.Apr 9 2018, 11:47 AM

epriestley added a commit: rP09c6d42b9599: Mostly make blame work with DocumentEngine.

Still needs work:

Degrading and error messages when blaming large files.
"Skip Past This Commit" doesn't retain line numbers.
When we know the commit created the file, disable the "skip past this commit" stuff.
Symbol highlighting has some glitchiness in Safari, e.g. typehints.
Coverage.
Lint messages.
General cleanup of old behaviors?

Now OK?

Tooltips on revisions links are being built wrong.
Swapping away from source and back doesn't populate blame out of cache properly.
When lines 15, 16, 17... all have the same blame, they should only show the links on the first line.
Line number coloration for age (older = darker; newer = brighter).
CSS is a little funky, e.g. the "/" in "commit / revision" is off-baseline.
The "line-linker" behavior for mapping rows to line numbers is a little sketchy.
Line-linker behavior isn't jumping to the linked lines, but Diffusion previously did.

epriestley added a revision: D19310: Make DocumentEngine source line linking behavior better when blame is shown.Apr 9 2018, 12:16 PM

epriestley added a revision: D19311: Use javelin_tag(), not phutil_tag(), to render revision blame tooltips properly.Apr 9 2018, 12:20 PM

epriestley added a revision: D19312: When you swap between document rendering engines, populate or redraw blame if appropriate.Apr 9 2018, 12:26 PM

epriestley added a revision: D19313: When lines 12, 13, 14, etc all blame to the same change, only show it once.Apr 9 2018, 12:40 PM

epriestley added a revision: D19314: Colorize lines in blame under DocumentEngine, to show relative age of changes.Apr 9 2018, 1:09 PM

epriestley added a commit: rP11664277b33d: Make DocumentEngine source line linking behavior better when blame is shown.Apr 9 2018, 1:09 PM

epriestley added a commit: rPeca7dc25f20b: Use javelin_tag(), not phutil_tag(), to render revision blame tooltips properly.

epriestley added a commit: rPeb80f0a2d95d: When you swap between document rendering engines, populate or redraw blame if….

epriestley added a commit: rPcf75d63b49a7: When lines 12, 13, 14, etc all blame to the same change, only show it once.

epriestley added a commit: rP472bc3d90a59: Colorize lines in blame under DocumentEngine, to show relative age of changes.Apr 9 2018, 1:11 PM

pablocarrillo added a subscriber: pablocarrillo.Apr 10 2018, 3:42 PM

Source view rendering in Paste has a missing background color.

epriestley added a revision: D19346: Restore an explicit white background color to files in Paste.Apr 11 2018, 8:45 PM

epriestley added a revision: D19348: Make line selection in source code views less fragile and more consistent.Apr 11 2018, 9:15 PM

epriestley added a revision: D19349: Use pseudoelements, not Zero Width Space, to implement copy/paste behavior in Paste/Diffusion.Apr 11 2018, 9:30 PM

epriestley added a revision: D19350: When following a link to a particular line ("/example.txt$12"), scroll to that line.Apr 11 2018, 9:47 PM

epriestley added a commit: rP55619e89642b: Restore an explicit white background color to files in Paste.Apr 12 2018, 12:21 AM

epriestley added a commit: rPc5c53e277a28: Make line selection in source code views less fragile and more consistent.Apr 12 2018, 12:27 AM

epriestley added a commit: rP5b3a351852a7: Use pseudoelements, not Zero Width Space, to implement copy/paste behavior in….

epriestley added a commit: rP37a03402bc44: When following a link to a particular line ("/example.txt$12"), scroll to that….

epriestley mentioned this in D19367: Fix the legacy "25, 50, 100, unlimited" Harbormaster log links to respect generation selection.Apr 13 2018, 1:50 PM

epriestley mentioned this in rPca49fffc1bad: Fix the legacy "25, 50, 100, unlimited" Harbormaster log links to respect….Apr 13 2018, 6:55 PM

epriestley mentioned this in 2018 Week 15 (Mid April).Apr 13 2018, 11:22 PM

This particular page here is fataling:

https://secure.phabricator.com/source/services/blame/master/README;140004446f6cea1d5668331b97dfcdcb2e26dc98

From:

https://secure.phabricator.com/source/services/browse/master/README

epriestley mentioned this in T13125: Plans: Code coverage reporting.Apr 16 2018, 12:47 PM

Coverage.

See T13125 for a more detailed breakout of coverage plans.

epriestley mentioned this in T13124: Plans: 2018 Week 16 Bonus Content.Apr 16 2018, 12:52 PM

epriestley added a revision: D19377: Remove obsoleted "diffusion-browse-file" behavior for coverage.Apr 17 2018, 12:54 PM

epriestley added a revision: D19378: Restore coverage reporting to Diffusion browse UI.Apr 17 2018, 1:46 PM

epriestley added a commit: rP21bb0215dbc7: Remove obsoleted "diffusion-browse-file" behavior for coverage.Apr 17 2018, 9:51 PM

epriestley added a commit: rP665529ab60a1: Restore coverage reporting to Diffusion browse UI.

epriestley mentioned this in T13126: Plans: Diffusion cleanup for document engine.Apr 17 2018, 10:02 PM

epriestley added a subtask: T13126: Plans: Diffusion cleanup for document engine.

From PHI604, for completeness, on the newer behavior of "Hide Blame":

I'd like to avoid restoring a per-user toggle preference for the default [show/hide blame] behavior. One reason for this is that it meant that links produce materially different pages for different people: if you turned off the old preference, you may see a page without blame. You send someone a link to it, but they have the preference on and see a page with blame (or vice versa). For example, your innocent link might be perceived as finger-pointing when your recipient sees blame, although you didn't. Or your link may be confusing to your recipient when you discuss blame/history information but they can't see it. The new behavior means that your recipient will see the same page with the same information you do.

If the "Hide Blame" option isn't enough, we could look at other options here (maybe per-repository or per-install "disable blame by default" modes, although I really hate adding options) but I'd like to avoid returning to a per-user setting here because I want links to look the same to different viewers.

epriestley added a revision: D19414: Add a view option to disable blame in Diffusion and fix some view transition bugs.Apr 28 2018, 1:55 PM

epriestley added a commit: rPafc3099ee785: Add a view option to disable blame in Diffusion and fix some view transition….Apr 30 2018, 10:32 PM

epriestley added a revision: D19457: Add "max-width: 100%;" to stop large images from overflowing the new rendering engine UI.Jun 1 2018, 8:14 PM

epriestley added a commit: rPa894c9993512: Add "max-width: 100%;" to stop large images from overflowing the new rendering….Jun 1 2018, 9:53 PM

• GoogleLegacy added a subscriber: • GoogleLegacy.Sep 10 2018, 8:43 AM

• yingshu added a subscriber: • yingshu.Jan 18 2019, 12:12 PM

Pawka added a subscriber: Pawka.Feb 6 2019, 3:03 PM

epriestley mentioned this in D20188: Use `<td class="n" data-n="3">` instead of `<th>3</th>` for line numbers.Feb 16 2019, 7:37 PM

epriestley mentioned this in rP98fe8fae4a90: Use `<td class="n" data-n="3">` instead of `<th>3</th>` for line numbers.Feb 19 2019, 11:07 PM

epriestley mentioned this in T13425: Support diffs between abstract block lists in the UI.Sep 25 2019, 4:19 PM

epriestley moved this task from Backlog to Next on the Differential board.Mar 15 2021, 5:11 PM

cspeckmim added a subscriber: cspeckmim.Mar 19 2021, 2:20 AM