Page MenuHomePhabricator

Support Excel as a data export format

Authored by epriestley on Jan 29 2018, 3:22 PM.
Referenced Files
F13320704: D18955.diff
Fri, Jun 14, 5:53 AM
F13309650: D18955.diff
Mon, Jun 10, 10:26 AM
F13307277: D18955.id45479.diff
Sun, Jun 9, 4:12 PM
F13275849: D18955.diff
Fri, May 31, 5:15 AM
F13196427: D18955.diff
May 12 2024, 11:15 PM
F13184156: D18955.id45479.diff
May 10 2024, 3:52 PM
Unknown Object (File)
May 6 2024, 4:31 AM
Unknown Object (File)
May 3 2024, 4:07 AM



Depends on D18954. Ref T13049. This brings over the existing Maniphest Excel export pipeline in a generic way.

The <Type>ExportField classes know directly that PHPExcel exists, which is a little sketchy, but writing an Excel indirection layer sounds like a lot of work and I don't anticipate us changing Excel backends anytime soon, so trying to abstract this feels YAGNI.

This doesn't bring over the install instructions for PHPExcel or the detection of whether or not it exists. I'll bring that over in a future change.

Test Plan

Exported users as Excel, opened them up, got a sensible-looking Excel sheet.

Diff Detail

rP Phabricator
Lint Not Applicable
Tests Not Applicable

Event Timeline

amckinley added inline comments.

This is off-topic, but what does this annotation do?


Excel doesn't do something sensible by default, like "set column width to the maximum width present in the data"?


Are we guaranteed to always have a header? I guess since the base class provides id and PHID, even janky extensions should have at least those fields available.


Magic constant?


More magic?

This revision is now accepted and ready to land.Jan 29 2018, 9:30 PM

Yeah, all the widths are sort of inherently magic.

I think they're in some kind of "Excel Unit" and it depends on font choices anyway and I'm looking at these sheets in Numbers rather than Excel, so who knows.

If we don't set anything, the sheet opens up with every column very narrow so almost all data is hidden, so my general goal is just to give the columns a bit more breathing room so the sheet is a bit easier to work with. Users can always resize columns, the default behavior is just particularly unfriendly.

Here's the first sheet I exported before tweaking stuff, in Numbers:

Screen Shot 2018-01-29 at 1.46.19 PM.png (1×1 px, 299 KB)

Here's where stuff ended up:

Screen Shot 2018-01-29 at 1.47.16 PM.png (1×1 px, 509 KB)

A totally ideal approach which actually measured column widths and fit columns to data would produce a different result than that, but I think we're like 80% of the way there, and I'm not sure we have a practical way to do font metrics and fitting.

Some Googling suggests there's stuff we could maybe try, but that it's all guesses/estimates and may imply a large performance cost, e.g.,

It's also probably bad to autosize some columns, like the task "Description" column, which might be "MMMMM..." and run the rest of the sheet into the shadow realm off the right hand side of the page. But maybe we could "autosize, with some maximum" or something.

It generally feels like we're probably past the point of diminishing returns to me since I think this behavior isn't bad, but maybe I'm overestimating the peril of trying the autosize stuff.


When you use a symbol that doesn't exist, arc lint warns you that you used Excarpteen and probably misspelled the class, forgot to update a library map, etc.

This annotation tells lint "don't worry, I know what I'm doing".


Without setting widths ourselves, the sheet opens up with every column about one inch wide.

This might be a PHPExcel issue (maybe it writes some explicit default?). I don't recall if I ever tried to dig into it.


Yeah, we're always guaranteed to have at least one field (since SearchEngine won't export if we don't) and every field has a label and can generate a header, so we should always have a header.


This is "roughly a reasonable width for most types of data that we export". I can put it in a constant or something, but nothing else should ever refer to it and it's still inherently kind of magical.


This is just "a reasonable size that the maximum ID number any install is likely to have will fit into".


And this one is "most integers are probably, like, smaller than about a million".


"The default upstream implementation of generateNewPHID generates 30-character PHIDs and they fit okay into this."

This revision was automatically updated to reflect the committed changes.