
Implement alternate markup engines (Markdown, reStructuredText, ...)
Closed, Wontfix (Public)

Assigned To
Authored By
epriestley
Mar 29 2013, 1:32 AM
Tokens
"Heartbreak" token, awarded by niedzielski. "Dislike" token, awarded by tomekj2ee. "Like" token, awarded by ponsfrilus.

Description

Umbrella task for requests that we implement some alternative to Remarkup, like Markdown or reStructuredText.

Why Remarkup is Really Good

Why does Phabricator use its own custom markup language ("Remarkup") instead of an existing language (like Markdown)?

  • Remarkup has a multi-stage, batched rendering pipeline.
  • Remarkup supports multiple rendering targets.
  • Remarkup is extensible.
  • Remarkup is very similar to Markdown.
  • No one implements Markdown anyway.
  • Some Markdown syntaxes are bad for discussing software.
  • Remarkup is dramatically more powerful than Markdown.

Multi-stage rendering: Remarkup has a rendering pipeline which allows us to parse and finalize Remarkup documents in separate batched stages, and to use caches while keeping dynamic parts (e.g., object visibility) dynamic. Other markup engines generally have a single-stage, unbatched pipeline. A single-stage pipeline has far poorer performance than Remarkup's pipeline.
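To make the batching idea concrete, here is a minimal sketch in Python (not Remarkup's actual PHP implementation). It assumes a single hypothetical rule, `@username` mentions, and a hypothetical `load_users` callback standing in for whatever lookup resolves them. Stage one parses each document into tokens and could be cached; stage two resolves every mention across all documents with one batched lookup.

```python
import re

MENTION = re.compile(r"@(\w+)")  # hypothetical rule: @username mentions


def parse(text):
    """Stage 1 (cacheable): split text into literal chunks and mention tokens."""
    parts = []
    pos = 0
    for match in MENTION.finditer(text):
        parts.append(("text", text[pos:match.start()]))
        parts.append(("mention", match.group(1)))
        pos = match.end()
    parts.append(("text", text[pos:]))
    return parts


def render_all(documents, load_users):
    """Stage 2 (dynamic): resolve all mentions with a single batched lookup.

    load_users is a hypothetical callback taking a set of names and
    returning a dict of name -> rendered markup for the names it knows.
    """
    parsed = [parse(doc) for doc in documents]
    names = {value for doc in parsed for kind, value in doc if kind == "mention"}
    users = load_users(names)  # one call, however many documents there are
    rendered = []
    for doc in parsed:
        rendered.append("".join(
            value if kind == "text" else users.get(value, "@" + value)
            for kind, value in doc
        ))
    return rendered
```

A single-stage renderer would instead call the lookup once per mention or once per document, which is where the performance gap comes from when a page shows many comments.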

Multiple rendering targets: Remarkup supports rendering to HTML and to plain text (for email). Some other engines do not.
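As a rough illustration of what a second rendering target involves, the sketch below renders one hypothetical link token either as HTML or as plain text suitable for email. The function name and the plain-text convention are assumptions for the example, not Remarkup's behavior.

```python
from html import escape


def render_link(label, url, target):
    """Render a (label, url) link token for the given target."""
    if target == "html":
        # Escape both the attribute value and the visible label.
        return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(label))
    if target == "text":
        # Plain text has no markup, so show the URL inline.
        return "{} <{}>".format(label, url)
    raise ValueError("unknown rendering target: {}".format(target))
```

Every rule needs an implementation per target, which is part of why supporting several targets across several markup engines multiplies the work.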

Extensible: Remarkup is rule-based with a flexible grammar. New rules can be implemented in PHP in a few minutes. Popular Markdown parsers like Sundown are hand-rolled C.

Similar to Markdown: Remarkup is already very similar to Markdown.

No one implements Markdown: Sometimes, the argument is made that GitHub (or some other service) implements Markdown so we should too. But GitHub doesn't implement Markdown -- it implements "GitHub flavored Markdown", which is about as similar to Markdown as Remarkup is. Even beyond this, part of Markdown is inline HTML, which obviously no one implements because it's completely insecure.

Markdown and Software: Markdown includes a rule for _emphasis_, which conflicts with source code symbols. Markdown includes rules around [...] and (...) which conflict with array and call notation in source code, e.g. $a['callback']($param); is rendered as $a'callback';.
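The underscore conflict is easy to reproduce with a deliberately naive emphasis rule. The sketch below is not any particular Markdown parser (many modern flavors avoid intraword underscore emphasis); it only shows how an unguarded rule mangles identifiers, and how one possible word-boundary guard avoids that. Both function names are made up for the example.

```python
import re


def naive_emphasis(text):
    """Treat any _..._ pair as emphasis, with no regard for context."""
    return re.sub(r"_(.+?)_", r"<em>\1</em>", text)


def guarded_emphasis(text):
    """Only treat _..._ as emphasis when the underscores are not inside a word."""
    return re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"<em>\1</em>", text)


print(naive_emphasis("call my_helper_function() now"))
print(guarded_emphasis("call my_helper_function() now"))
print(guarded_emphasis("this is _really_ slow"))
```

The naive rule turns `my_helper_function` into `my<em>helper</em>function`, while the guarded rule leaves it alone and still emphasizes `_really_`.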

More Powerful: Remarkup has many rules which Markdown does not, like: code blocks with syntax highlighting, object references, object embeds, mentions, macros, memes, tables, and YouTube videos. These features are extremely powerful and a core part of Phabricator's value.

Why Implementing Other Engines is Hard

We can't just dump in some other engine in 20 minutes, either.

Security: Many markup language specifications and/or implementations are not secure for user-provided text on the web. We need to select an implementation and audit it before we can use it. This is time consuming because markup parsers are complicated and getting them right is difficult. We have a higher bar for security than many applications do.

Features: Other markup engines don't have features like code blocks, mentions, text rendering targets, and all the other stuff mentioned above. Implementing these features, especially in more than one markup engine, is prohibitively complex. Many markup engines are not very flexible or extensible.

Performance: Naive performance of other parsers will be worse (and, in some cases, dramatically worse) than Remarkup. Getting them up to par (e.g. integrated with the cache pipeline) is potentially a large amount of work.

Portability: Generally, we can't convert between markup formats because they have different feature sets. This creates UI problems where we potentially need to let the user switch between markup engines, and storage concerns where we potentially need to track which markup engine a block of text uses.

The Way Forward

New Rules: We're happy to implement new Remarkup rules, provided they do not conflict with other rules or with common syntax in discussing programs and source code. These can usually be implemented very quickly. If you are dissatisfied with Remarkup because it does not support specific rules, implementing rules is far more actionable than swapping engines.

Diffusion: We will eventually implement display rules in Diffusion so that .rst, .md, etc., documents are marked up correctly in non-blame views. This is blocked mostly by security. This is a low priority.

Diviner: We will eventually implement alternate engines in Diviner, but this is a very low priority. We may encounter limits or difficulties with some Diviner capabilities and alternate markup engines.

Phriction: I'm not sure what we'll do here. The specific case that makes this difficult is installs with existing documents in another wiki which they'd like to import.

Comments, Summaries, etc.: We will probably never implement an alternate engine for most uses. Alternate engines are crippled in this role because they do not have the pipeline or context-sensitive and application-aware rules which make Remarkup powerful.

Conversion Between Markup Languages: It is unlikely we'll build this. If we do, it will probably be a one-way Markdown -> Remarkup converter. You can get things into Markdown from other engines using something like pandoc.


Event Timeline

epriestley triaged this task as Wishlist priority. (Mar 29 2013, 1:32 AM)
epriestley added a project: Remarkup.
epriestley added a subscriber: epriestley.
epriestley edited this Maniphest Task.
epriestley changed the visibility from "All Users" to "Public (No Login Required)". (Oct 4 2013, 3:54 PM)

T7854 is a request to support the Markdown image format:

![Alt text](/path/to/img.jpg)

![Alt text](/path/to/img.jpg "Optional title")

@epriestley to be honest, I was particularly interested in this kind of syntax in repositories, not actually in the wiki.
My Git documentation uses Markdown and refers to local images (stored in the repository).

When you write ![alt text](path/to/readme.txt) and the resource is not an image we have to handle that sensibly.

When you write ![alt text](path/to/evil.svg), we have to handle that safely -- not execute arbitrary Javascript.

When you write ![alt text](../../../ws/admin/) and are proxying the notification channel websocket via nginx, we have to reject that instead of exposing sensitive information.

I understand the request. This is blocked on T4190.
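The three cases above can be sketched as a single resolution check. This is a minimal illustration under stated assumptions, not Phabricator's behavior: `resolve_repo_image` and the extension allowlist are hypothetical, and a file extension alone does not prove the content is a safe image, so a real implementation would also need to inspect the file itself.

```python
import posixpath

# Assumption: a small allowlist of raster formats. SVG is deliberately
# excluded because it can carry script.
SAFE_IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


def resolve_repo_image(doc_dir, ref):
    """Return a normalized repository-relative path, or None if ref is rejected.

    doc_dir is the repository-relative directory of the document that
    contains the image reference.
    """
    # Only relative, in-repository references are considered here.
    if "://" in ref or ref.startswith("/"):
        return None
    path = posixpath.normpath(posixpath.join(doc_dir, ref))
    # After normalization, a leading ".." means the reference escapes the repository root.
    if path == ".." or path.startswith("../"):
        return None
    # Non-image resources (readme.txt) and unsafe formats (evil.svg) are refused.
    extension = posixpath.splitext(path)[1].lower()
    if extension not in SAFE_IMAGE_EXTENSIONS:
        return None
    return path
```

With this sketch, `../../../ws/admin/` referenced from a `docs` directory normalizes to a path above the repository root and is rejected before anything is fetched.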

OK, that makes sense. Thanks for the clarification.

Another behavioral difference which came up recently is this:

# Header
## Subheader

We render this as:

  1. Header
    1. Subheader

...but in Markdown it is:

Header

Subheader

We can likely resolve this in the majority of cases by making the numbered-list heuristic more permissive: specifically, require # lists to have multiple items at a single level before interpreting them as lists.
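One way to read the proposed heuristic, as a sketch only (the function name and return values are made up, and this is not the Remarkup implementation): count the `#` lines at each depth within a block, and treat the block as a numbered list only if some depth has at least two items.

```python
import re

HASH_LINE = re.compile(r"^(#+)\s+\S")


def classify_hash_block(lines):
    """Return "list" or "headers" for a block of lines that start with #."""
    depth_counts = {}
    for line in lines:
        match = HASH_LINE.match(line)
        if match:
            depth = len(match.group(1))
            depth_counts[depth] = depth_counts.get(depth, 0) + 1
    # A list needs multiple items at a single level; otherwise assume
    # the author meant Markdown-style headers.
    if any(count >= 2 for count in depth_counts.values()):
        return "list"
    return "headers"
```

Under this reading, `# Header` followed by `## Subheader` is classified as headers, while `# one` followed by `# two` is still a numbered list. A one-item list would be misread as a header, which is the trade-off of making the heuristic more permissive.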

Any way to convert existing documentation from Markdown to Remarkup? Thanks.

@moy.easwaran and anyone else interested, I just went through the process of migrating a wiki from GitHub to Phriction. Here's the code I wrote to do it: https://github.com/aaron11496/github-wiki-to-phriction

A custom Pandoc writer can take you most of the way there, but for things like embedded images you would need something that can upload files.

I think that if Phabricator is going to insist on its own markup language, then y'all should play to win, and not limp in with a "me too" format. Evan's writeup in this task is compelling, but being buried in Phabricator in a four-digit ticket number is not really playing to win. @aaron11496 has some promising-looking work on a Remarkup conversion plugin for pandoc, but it's clearly only a start.

Markdown seems to be winning as a de facto standard. That's not to say Phabricator needs to switch to it, but Phabricator shouldn't stubbornly try to lead without followers on this front.

@robla Every open task here will be dealt with by the upstream; that is our intention. People are welcome to shift the upstream's priorities via Paid Prioritization or wait until the upstream naturally prioritizes it (see Planning). We'd love to be able to address more concerns than we currently can, but we're a very small team.

@aaron11496 Community Resources would be a great place to add your code as well.

epriestley claimed this task.

See followup in T13105.