Page MenuHomePhabricator

Display some invisible/nonprintable characters in diffs by default
ClosedPublic

Authored by epriestley on Feb 17 2019, 10:14 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Apr 21, 3:40 PM
Unknown Object (File)
Sat, Apr 20, 4:36 PM
Unknown Object (File)
Wed, Apr 17, 2:29 PM
Unknown Object (File)
Thu, Apr 11, 6:56 AM
Unknown Object (File)
Sun, Mar 31, 11:28 AM
Unknown Object (File)
Fri, Mar 29, 12:01 AM
Unknown Object (File)
Mar 16 2024, 10:55 AM
Unknown Object (File)
Mar 13 2024, 12:01 PM
Subscribers
None

Details

Summary

Ref T12822. Ref T2495. This is the good version of D20193.

Currently, we display various nonprintable characters (ZWS, nonbreaking space, various control characters) as themselves, so they're generally invisible.

In T12822, one user reports that all their engineers frequently type ZWS characters into source somehow? I don't really believe this (??), and this should be fixed in lint.

That said, the only real reason not to show these weird characters in a special way was that it would break copy/paste: if we render ZWS as "πŸ‘", and a user copy-pastes the line including the ZWS, they'll get a sheep.

At least, they would have, until D20191. Now that this whole thing is end-to-end Javascript magic, we can copy whatever we want.

In particular, we can render any character X as <span data-copy-text="Y">X</span>, and then copy "Y" instead of "X" when the user copies the node. Limitations:

  • If users select only "X", they'll get "X" on their clipboard. This seems fine. If you're selecting our ZWS marker *only*, you probably want to copy it?
  • If "X" is more than one character long, users will get the full "Y" if they select any part of "X". At least here, this only matters when "X" is several spaces and "Y" is a tab. This also seems fine.
  • We have to be kind of careful because this approach involves editing an HTML blob directly. However, we already do that elsewhere and this isn't really too hard to get right.

With those tools in hand:

  • Replace "\t" (raw text / what gets copied) with the number of spaces to the next tab stop for display.
  • Replace ZWS and NBSP (raw text) with a special marker for display.
  • Replace control characters 0x00-0x19 and 0x7F, except for "\t", "\r", and "\n", with the special unicode "control character pictures" reserved for this purpose.
Test Plan
  • Generated and viewed a file like this one:

Screen Shot 2019-02-17 at 1.42.03 PM.png (1Γ—855 px, 81 KB)

  • Copied text out of it, got authentic raw original source text instead of displayed text.

Diff Detail

Repository
rP Phabricator
Branch
sorcery6
Lint
Lint Passed
SeverityLocationCodeMessage
Advicesrc/applications/differential/parser/DifferentialChangesetParser.php:1620XHP16TODO Comment
Unit
Tests Passed
Build Status
Buildable 22050
Build 30126: Run Core Tests
Build 30125: arc lint + arc unit

Event Timeline

avivey retitled this revision from Display some invisible/nonprintable characters by in diffs by default to Display some invisible/nonprintable characters in diffs by default.Feb 18 2019, 12:08 AM
amckinley added inline comments.
src/applications/differential/parser/DifferentialChangesetParser.php
193

Not that it matters, but this went from 11 to 14 just because of your current stacked diffs condition, right?

src/applications/differential/render/DifferentialChangesetRenderer.php
365–368

Is it intentional that the new whitespace indentation highlighting code triggers here?

This revision is now accepted and ready to land.Feb 19 2019, 11:11 PM
src/applications/differential/parser/DifferentialChangesetParser.php
193

Right, 12 and 13 are elsewhere but not dependencies of this change.

src/applications/differential/render/DifferentialChangesetRenderer.php
365–368

No, D20195 should fix this.

This revision was automatically updated to reflect the committed changes.