Page MenuHomePhabricator

Support Excel as a data export format
ClosedPublic

Authored by epriestley on Jan 29 2018, 3:22 PM.
Tags
None
Referenced Files
F14047505: D18955.id.diff
Thu, Nov 14, 4:06 AM
F14003286: D18955.id45479.diff
Sat, Oct 26, 4:23 AM
F14003108: D18955.diff
Sat, Oct 26, 2:21 AM
F13963820: D18955.id.diff
Oct 15 2024, 6:04 PM
Unknown Object (File)
Oct 13 2024, 2:44 AM
Unknown Object (File)
Oct 10 2024, 12:31 AM
Unknown Object (File)
Sep 30 2024, 4:52 AM
Unknown Object (File)
Sep 21 2024, 6:25 AM
Subscribers
None

Details

Summary

Depends on D18954. Ref T13049. This brings over the existing Maniphest Excel export pipeline in a generic way.

The <Type>ExportField classes know directly that PHPExcel exists, which is a little sketchy, but writing an Excel indirection layer sounds like a lot of work and I don't anticipate us changing Excel backends anytime soon, so trying to abstract this feels YAGNI.

This doesn't bring over the install instructions for PHPExcel or the detection of whether or not it exists. I'll bring that over in a future change.

Test Plan

Exported users as Excel, opened them up, got a sensible-looking Excel sheet.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

amckinley added inline comments.
src/infrastructure/export/PhabricatorExcelExportFormat.php
29–31

This is off-topic, but what does this annotation do?

56

Excel doesn't do something sensible by default, like "set column width to the maximum width present in the data"?

129–130

Are we guaranteed to always have a header? I guess since the base class provides id and PHID, even janky extensions should have at least those fields available.

src/infrastructure/export/PhabricatorExportField.php
47

Magic constant?

src/infrastructure/export/PhabricatorIDExportField.php
11

More magic?

This revision is now accepted and ready to land.Jan 29 2018, 9:30 PM

Yeah, all the widths are sort of inherently magic.

I think they're in some kind of "Excel Unit" and it depends on font choices anyway and I'm looking at these sheets in Numbers rather than Excel, so who knows.

If we don't set anything, the sheet opens up with every column very narrow so almost all data is hidden, so my general goal is just to give the columns a bit more breathing room so the sheet is a bit easier to work with. Users can always resize columns, the default behavior is just particularly unfriendly.

Here's the first sheet I exported before tweaking stuff, in Numbers:

Screen Shot 2018-01-29 at 1.46.19 PM.png (1×1 px, 299 KB)

Here's where stuff ended up:

Screen Shot 2018-01-29 at 1.47.16 PM.png (1×1 px, 509 KB)

A totally ideal approach which actually measured column widths and fit columns to data would produce a different result than that, but I think we're like 80% of the way there, and I'm not sure we have a practical way to do font metrics and fitting.

Some Googling suggests there's stuff we could maybe try, but that it's all guesses/estimates and may imply a large performance cost, e.g., https://stackoverflow.com/questions/16761897/phpexcel-auto-size-column-width

It's also probably bad to autosize some columns, like the task "Description" column, which might be "MMMMM..." and run the rest of the sheet into the shadow realm off the right hand side of the page. But maybe we could "autosize, with some maximum" or something.

It generally feels like we're probably past the point of diminishing returns to me since I think this behavior isn't bad, but maybe I'm overestimating the peril of trying the autosize stuff.

src/infrastructure/export/PhabricatorExcelExportFormat.php
29–31

When you use a symbol that doesn't exist, arc lint warns you that you used Excarpteen and probably misspelled the class, forgot to update a library map, etc.

This annotation tells lint "don't worry, I know what I'm doing".

56

Without setting widths ourselves, the sheet opens up with every column about one inch wide.

This might be a PHPExcel issue (maybe it writes some explicit default?). I don't recall if I ever tried to dig into it.

129–130

Yeah, we're always guaranteed to have at least one field (since SearchEngine won't export if we don't) and every field has a label and can generate a header, so we should always have a header.

src/infrastructure/export/PhabricatorExportField.php
47

This is "roughly a reasonable width for most types of data that we export". I can put it in a constant or something, but nothing else should ever refer to it and it's still inherently kind of magical.

src/infrastructure/export/PhabricatorIDExportField.php
11

This is just "a reasonable size that the maximum ID number any install is likely to have will fit into".

src/infrastructure/export/PhabricatorIntExportField.php
22

And this one is "most integers are probably, like, smaller than about a million".

src/infrastructure/export/PhabricatorPHIDExportField.php
7

"The default upstream implementation of generateNewPHID generates 30-character PHIDs and they fit okay into this."

This revision was automatically updated to reflect the committed changes.