Allow Diviner to import documentation
Open, Needs Triage, Public

Description

It would be great if Diviner could import documentation from other systems, as JSON or XML. This would allow Diviner to support a wider range of languages.

Event Timeline

joshuaspence updated the task description.
joshuaspence raised the priority of this task from to Needs Triage.
joshuaspence added a subscriber: joshuaspence.

What about the ability to publish arbitrary HTML into Diviner? This is not ideal, but it means that any documentation system can be integrated.

My guess is that we can satisfy this by parsing a small, safe subset of HTML (bold, italics, tt, links).

For HTML documentation that you're aware of, does it contain complex HTML structures (layouts, images, CSS/JavaScript)? Everything I've ever seen has simple formatting and then maybe some tables. I think we should be able to parse that safely.
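To make the "small, safe subset" idea concrete, here is a rough sketch in Python (not Diviner code, and not a vetted sanitizer). It assumes the allowlist is exactly the tags named above (b, i, tt, a), keeps only the href attribute on links, and treats http, https, and relative links as safe; those choices and all of the names are assumptions for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist: the tags named in the thread (bold, italics, tt, links).
ALLOWED_TAGS = {"b", "i", "tt", "a"}
# Elements whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}
# Assumed set of acceptable link schemes ("" means a relative link).
SAFE_SCHEMES = {"", "http", "https"}


class SubsetSanitizer(HTMLParser):
    """Re-emit allowlisted tags, escape all text, and drop everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []
        self.skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Strip whitespace and control characters that browsers ignore in URLs.
            href = "".join(ch for ch in href if ch > " ")
            if urlparse(href).scheme.lower() not in SAFE_SCHEMES:
                href = "#"
            # Only href survives; every other attribute is discarded.
            self.out.append('<a href="%s">' % escape(href, quote=True))
        else:
            self.out.append("<%s>" % tag)
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        if tag in self.open_tags:
            # Close any tags left open inside this one so the output nests properly.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append("</%s>" % top)
                if top == tag:
                    break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))

    def result(self):
        self.close()
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def sanitize(html_text):
    parser = SubsetSanitizer()
    parser.feed(html_text)
    return parser.result()
```

The point of rebuilding the output from an allowlist, rather than stripping known-bad constructs, is that anything the parser does not recognize is dropped by default. Tables would need more tags on the list and some structural checks.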

We could support arbitrary HTML with some bin/diviner allow-arbitrary-html-in-book example, and then anyone with edit permission could publish arbitrary HTML into it. However, this seems generally ultra-bad and I'd like to avoid it if we possibly can.

Yeah, agreed on avoiding it if possible. I just got asked about CSS documentation, which I know nothing about, but there are existing tools which spit out HTML. I will look into what would be involved in parsing that HTML.

Ah, interesting. The cases I'm familiar with are Javadoc, like this lovely example from Oracle:

/**
 * Graphics is the abstract base class for all graphics contexts
 * which allow an application to draw onto components realized on
 * various devices or onto off-screen images.
 * A Graphics object encapsulates the state information needed
 * for the various rendering operations that Java supports.  This
 * state information includes:
 * <ul>
 * <li>The Component to draw on
 * <li>A translation origin for rendering and clipping coordinates
 * <li>The current clip
 * <li>The current color
 * <li>The current font
 * <li>The current logical pixel operation function (XOR or Paint)
 * <li>The current XOR alternation color
 *     (see <a href="#setXORMode">setXORMode</a>)
 * </ul>
 * <p>
 * Coordinates are infinitely thin and lie between the pixels of the
 * output device.
 * Operations which draw the outline of a figure operate by traversing
 * along the infinitely thin path with a pixel-sized pen that hangs
 * down and to the right of the anchor point on the path.
 * Operations which fill a figure operate by filling the interior
 * of the infinitely thin path.
 * Operations which render horizontal text render the ascending
 * portion of the characters entirely above the baseline coordinate.
 * <p>
 * Some important points to consider are that drawing a figure that
 * covers a given rectangle will occupy one extra row of pixels on
 * the right and bottom edges compared to filling a figure that is
 * bounded by that same rectangle.
 * Also, drawing a horizontal line along the same y coordinate as
 * the baseline of a line of text will draw the line entirely below
 * the text except for any descenders.
 * Both of these properties are due to the pen hanging down and to
 * the right from the path that it traverses.
 * <p>
 * All coordinates which appear as arguments to the methods of this
 * Graphics object are considered relative to the translation origin
 * of this Graphics object prior to the invocation of the method.
 * All rendering operations modify only pixels which lie within the
 * area bounded by both the current clip of the graphics context
 * and the extents of the Component used to create the Graphics object.
 * 
 * @author      Sami Shaio
 * @author      Arthur van Hoff
 * @version     %I%, %G%
 * @since       1.0
 */

This has long struck me as a bizarre decision, but I think there's a lot of that "HTML" in Java-world, and, at least conceptually, it shouldn't be too complicated to write a parser for it since it's more like "10 specific tags from HTML plus the href attribute on links", not exactly HTML5 + JS + CSS.
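As a rough sketch of the first step such a parser might take, the Python below strips the comment markers, separates the description from block tags like @author, and reports which HTML tags the description uses. The allowlist is an assumption: the "10 specific tags" are not enumerated here, so ASSUMED_TAGS is a guess, and the function names are made up for illustration.

```python
import re

# Assumed allowlist; the thread only says "10 specific tags" without naming them.
ASSUMED_TAGS = {"a", "b", "i", "tt", "code", "pre", "p", "ul", "ol", "li"}


def strip_comment_markers(comment):
    """Remove the /** and */ delimiters and the leading ' * ' gutter."""
    body = comment.strip()
    body = re.sub(r"^/\*\*", "", body)
    body = re.sub(r"\*/$", "", body)
    lines = [re.sub(r"^\s*\* ?", "", line) for line in body.splitlines()]
    return "\n".join(lines).strip()


def split_javadoc(comment):
    """Return (description, block_tags); block_tags is a list of (name, value)."""
    description, block_tags = [], []
    for line in strip_comment_markers(comment).splitlines():
        match = re.match(r"@(\w+)\s*(.*)", line)
        if match:
            block_tags.append((match.group(1), match.group(2).strip()))
        elif block_tags and line.strip():
            # Continuation line belonging to the previous block tag.
            name, value = block_tags[-1]
            block_tags[-1] = (name, (value + " " + line.strip()).strip())
        elif not block_tags:
            description.append(line)
    return "\n".join(description).strip(), block_tags


def html_tags_used(description):
    """Collect the names of HTML tags appearing in the description."""
    return {name.lower() for name in re.findall(r"</?([A-Za-z][A-Za-z0-9]*)", description)}


def unexpected_tags(description):
    """Tags that fall outside the assumed allowlist."""
    return html_tags_used(description) - ASSUMED_TAGS
```

Running unexpected_tags over a corpus of real doc comments would be one way to find out how small the subset can actually be before committing to a tag list.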

Ah yeah, I'm not really talking about HTML within the comments themselves (although I'm sure we will have to deal with that as well). Specifically, this is the tool that I have been referred to: https://github.com/kss-node/kss-node

That looks like it could be pretty complicated to do safely.

We could possibly let you designate a "random documentation HTML stuff" subdomain and then iframe the examples in, although that would take a lot of setup and I think getting the iframe sizes right might be complicated.
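A minimal sketch of the iframe idea, in Python for illustration only. It assumes a separately configured HTTPS origin for untrusted examples (docs-sandbox.example.com below is a placeholder) and sidesteps the sizing problem by using a fixed, caller-supplied height. The function name and parameters are hypothetical; only the iframe sandbox attribute itself is standard HTML.

```python
from html import escape
from urllib.parse import quote, urlsplit


def sandboxed_iframe(sandbox_origin, example_path, height_px=300):
    """Build an iframe tag that loads an example from a separate origin.

    sandbox_origin is assumed to be a dedicated domain that serves only
    untrusted documentation HTML, never the main application.
    """
    parts = urlsplit(sandbox_origin)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("sandbox origin must be an https:// origin")
    src = "%s://%s/%s" % (parts.scheme, parts.netloc, quote(example_path.lstrip("/")))
    # sandbox="allow-scripts" without allow-same-origin gives the frame an
    # opaque origin, so its scripts cannot read the parent page or its cookies.
    return '<iframe src="{}" sandbox="allow-scripts" width="100%" height="{}"></iframe>'.format(
        escape(src, quote=True), int(height_px)
    )


tag = sandboxed_iframe("https://docs-sandbox.example.com", "/examples/buttons.html")
```

Getting the height right automatically would need cooperation from the framed page (for example, reporting its size to the parent), which is part of the setup cost mentioned above.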

Yeah. I will probably end up just writing my own Node.js script to parse and interpret the CSS documentation.