Page MenuHomePhabricator

Calendar: ICS Import Errata
Closed, ResolvedPublic

Assigned To
Authored By
epriestley
Oct 30 2016, 5:56 PM
Referenced Files
F2064636: admin (2).ics
Dec 2 2016, 11:53 AM
F1909937: large.jpg
Nov 8 2016, 1:51 AM
F1909933: Best-Thread-Ever.gif
Nov 8 2016, 1:50 AM
F1909914: here-it-goes.jpg
Nov 8 2016, 1:40 AM
F1909916: out.xhprof
Nov 8 2016, 1:40 AM
F1909353: Screen Shot 2016-11-07 at 10.56.40 AM.png
Nov 7 2016, 7:00 PM
F1905629: IMG_0549.jpg
Nov 5 2016, 5:32 PM
F1905622: IMG_0525.jpg
Nov 5 2016, 5:32 PM
Tokens
"Baby Tequila" token, awarded by chad.

Description

Calendar is still a prototype application. Normally, we don't accept bug reports for prototype applications. However, as an exception, we will now accept bug reports related to ICS imports in this task.

For details on using ICS imports, see Calendar User Guide: Importing Events. At the time of writing (Oct 30, 2016) the import features are only available in master.

ICS is a complex format which we attempt to parse relatively faithfully. We probably get some stuff wrong. If you encounter an error with importing an .ics file, you can note it here and we'll try to fix it. See also T10747 for earlier context.

Helpful information when reporting an import issue:

  • The .ics file which causes the error -- but note that .ics files contain sensitive information about meetings and events, including attendee email addresses. If possible, create a similar event with dummy/public data that reproduces the issue and give us that instead.
  • Any error messages produced in the import log.
  • General context on what you were trying to do, what you expected to happen, what the event looks like in the original application, etc.

NOT SUPPORTED

This is a list of features which we are explicitly declining to support at this time. You should receive a useful error in the import log if you attempt to import events which use these features:

  • Recurring events with frequency SECONDLY, MINUTELY or HOURLY. These frequencies are silly for normal calendaring applications and computationally expensive to support
  • Exporting events, then importing them into the same copy of Phabricator again.
  • Events before 1970 or after 2037. (We'll add support for these within the next 20 years.)

Event Timeline

Mentioned in chat but just to keep it logged. Importing an ics file with the following RRULE (RRULE:FREQ=WEEKLY;BYDAY=TH;INTERVAL=2) works as expected opening in Microsoft Outlook, but gives the following error in Phabricator upon import "RRULE INTERVAL "2" is invalid: interval must be an integer."

As @epriestley said in the general chat "We're missing an (int) cast in PhutilRecurrenceRule::newFromRRULE() on INTERVAL and probably COUNT".

Mentioned in chat but just to keep it logged. Importing an ics file with the following RRULE (RRULE:FREQ=WEEKLY;BYDAY=TH;INTERVAL=2) works as expected opening in Microsoft Outlook, but gives the following error in Phabricator upon import "RRULE INTERVAL "2" is invalid: interval must be an integer."

As @epriestley said in the general chat "We're missing an (int) cast in PhutilRecurrenceRule::newFromRRULE() on INTERVAL and probably COUNT".

I think it was fixed with 6d6308803540.

@rabahmeradi So it was, shortly afer reporting! Thanks.

(Moved from T11816#199040)

Running bd3233d3ab0c0f8f48df7a46b57e9f70bf836ec8


I successfully modified the policies of Calendar!


I updated my test instance and went back to the ICS File Import that I setup last night. From there I clicked on Reload ImportReload Events. Clicking "Reload Events" didn't seem to have any effect (no visual feedback). It's possible that it was taking a really long time (my ICS file is ~3.3mb). I checked the logs and it looks like things were happening:

[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message: [2016-11-04 20:36:40] ERROR 2: sha1() expects parameter 1 to be string, object given at [/usr/local/phacility/phabricator/src/infrastructure/util/PhabricatorHash.php:60]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message: arcanist(head=master, ref.master=fad85844314b), phabricator(head=master, ref.master=bd3233d3ab0c, custom=3), phutil(head=master, ref.master=e409df2720c2)"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #0 sha1(PhutilCalendarUserNode, boolean) called at [<phabricator>/src/infrastructure/util/PhabricatorHash.php:60]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #1 PhabricatorHash::digestForIndex(PhutilCalendarUserNode) called at [<phabricator>/src/applications/calendar/query/PhabricatorCalendarExternalInviteeQuery.php:53]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #2 PhabricatorCalendarExternalInviteeQuery::buildWhereClauseParts(AphrontMySQLiDatabaseConnection) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:273]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #3 PhabricatorCursorPagedPolicyAwareQuery::buildWhereClause(AphrontMySQLiDatabaseConnection) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:96]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #4 PhabricatorCursorPagedPolicyAwareQuery::loadStandardPageRows(PhabricatorCalendarExternalInvitee) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:82]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #5 PhabricatorCursorPagedPolicyAwareQuery::loadStandardPage(PhabricatorCalendarExternalInvitee) called at [<phabricator>/src/applications/calendar/query/PhabricatorCalendarExternalInviteeQuery.php:30]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #6 PhabricatorCalendarExternalInviteeQuery::loadPage() called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:236]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #7 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/applications/calendar/import/PhabricatorCalendarImportEngine.php:263]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #8 PhabricatorCalendarImportEngine::importEventDocument(PhabricatorUser, PhabricatorCalendarImport, PhutilCalendarRootNode) called at [<phabricator>/src/applications/calendar/import/PhabricatorCalendarICSImportEngine.php:42]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #9 PhabricatorCalendarICSImportEngine::importICSData(PhabricatorUser, PhabricatorCalendarImport, string) called at [<phabricator>/src/applications/calendar/import/PhabricatorCalendarICSFileImportEngine.php:86]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #10 PhabricatorCalendarICSFileImportEngine::importEventsFromSource(PhabricatorUser, PhabricatorCalendarImport) called at [<phabricator>/src/applications/calendar/editor/PhabricatorCalendarImportEditor.php:57]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #11 PhabricatorCalendarImportEditor::applyFinalEffects(PhabricatorCalendarImport, array) called at [<phabricator>/src/applications/transactions/editor/PhabricatorApplicationTransactionEditor.php:969]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #12 PhabricatorApplicationTransactionEditor::applyTransactions(PhabricatorCalendarImport, array) called at [<phabricator>/src/applications/calendar/controller/PhabricatorCalendarImportReloadController.php:37]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #13 PhabricatorCalendarImportReloadController::handleRequest(AphrontRequest) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:257]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #14 AphrontApplicationConfiguration::processRequest(AphrontRequest, PhutilDeferredLog, AphrontPHPHTTPSink, MultimeterControl) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:169]"
[04-Nov-2016 20:36:40] WARNING: [pool www] child 11015 said into stderr: "NOTICE: PHP message:   #15 AphrontApplicationConfiguration::runHTTPRequest(AphrontPHPHTTPSink) called at [<phabricator>/webroot/index.php:17]"

I tried instead creating a new Import and uploading again. This time I got a timeout:

>>> UNRECOVERABLE FATAL ERROR <<<

Maximum execution time of 30 seconds exceeded

/usr/local/phacility/phabricator/src/applications/people/storage/PhabricatorUser.php:431


┻━┻ ︵ ¯\_(ツ)_/¯ ︵ ┻━┻

In the php-fpm logs there are a bunch of these (seems to be related but slightly different stacktrace):

[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message: [2016-11-04 20:29:14] ERROR 2: sha1() expects parameter 1 to be string, object given at [/usr/local/phacility/phabricator/src/infrastructure/util/PhabricatorHash.php:60]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message: arcanist(head=master, ref.master=fad85844314b), phabricator(head=master, ref.master=bd3233d3ab0c, custom=3), phutil(head=master, ref.master=e409df2720c2)"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #0 sha1(PhutilCalendarUserNode, boolean) called at [<phabricator>/src/infrastructure/util/PhabricatorHash.php:60]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #1 PhabricatorHash::digestForIndex(PhutilCalendarUserNode) called at [<phabricator>/src/applications/calendar/query/PhabricatorCalendarExternalInviteeQuery.php:53]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #2 PhabricatorCalendarExternalInviteeQuery::buildWhereClauseParts(AphrontMySQLiDatabaseConnection) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:273]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #3 PhabricatorCursorPagedPolicyAwareQuery::buildWhereClause(AphrontMySQLiDatabaseConnection) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:96]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #4 PhabricatorCursorPagedPolicyAwareQuery::loadStandardPageRows(PhabricatorCalendarExternalInvitee) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorCursorPagedPolicyAwareQuery.php:82]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #5 PhabricatorCursorPagedPolicyAwareQuery::loadStandardPage(PhabricatorCalendarExternalInvitee) called at [<phabricator>/src/applications/calendar/query/PhabricatorCalendarExternalInviteeQuery.php:30]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #6 PhabricatorCalendarExternalInviteeQuery::loadPage() called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:236]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #7 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/applications/calendar/import/PhabricatorCalendarImportEngine.php:263]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #8 PhabricatorCalendarImportEngine::importEventDocument(PhabricatorUser, PhabricatorCalendarImport, PhutilCalendarRootNode) called at [<phabricator>/src/applications/calendar/import/PhabricatorCalendarICSImportEngine.php:42]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #9 PhabricatorCalendarICSImportEngine::importICSData(PhabricatorUser, PhabricatorCalendarImport, string) called at [<phabricator>/src/applications/calendar/import/PhabricatorCalendarICSFileImportEngine.php:86]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #10 PhabricatorCalendarICSFileImportEngine::importEventsFromSource(PhabricatorUser, PhabricatorCalendarImport) called at [<phabricator>/src/applications/calendar/editor/PhabricatorCalendarImportEditor.php:57]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #11 PhabricatorCalendarImportEditor::applyFinalEffects(PhabricatorCalendarImport, array) called at [<phabricator>/src/applications/transactions/editor/PhabricatorApplicationTransactionEditor.php:969]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #12 PhabricatorApplicationTransactionEditor::applyTransactions(PhabricatorCalendarImport, array) called at [<phabricator>/src/applications/transactions/editengine/PhabricatorEditEngine.php:1001]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #13 PhabricatorEditEngine::buildEditResponse(PhabricatorCalendarImport) called at [<phabricator>/src/applications/transactions/editengine/PhabricatorEditEngine.php:896]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #14 PhabricatorEditEngine::buildResponse() called at [<phabricator>/src/applications/calendar/controller/PhabricatorCalendarImportEditController.php:27]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #15 PhabricatorCalendarImportEditController::handleRequest(AphrontRequest) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:257]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #16 AphrontApplicationConfiguration::processRequest(AphrontRequest, PhutilDeferredLog, AphrontPHPHTTPSink, MultimeterControl) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:169]"
[04-Nov-2016 20:29:14] WARNING: [pool www] child 11018 said into stderr: "NOTICE: PHP message:   #17 AphrontApplicationConfiguration::runHTTPRequest(AphrontPHPHTTPSink) called at [<phabricator>/webroot/index.php:17]"

I did figure out how to export a single recent recurring event from Zimbra and it successfully imported into Phabricator! It looks like it even listed the invitee who doesn't have a Phab account, which is awesome. I tested two recurring events, one created internally at Zimbra and one created externally (Outlook probably) I was invited to. Both recurring events were created properly (one event already passed, one yet to come).

It looks like on the one event the reserved room resource (handled by Zimbra) is listed as "Private User 1". The relevant ICS bits I found were:

LOCATION:meetingroom@company.com
ATTENDEE;CUTYPE=RESOURCE;ROLE=NON-PARTICIPANT;PARTSTAT=ACCEPTED:mailto:meeti
 ngroom@company.com

This is probably expected, though maybe it could be listed as a non-participant/resource/location instead of user?

Actually I just tried another meeting and the meeting room showed up as a user with a name. Relevant ICS:

LOCATION:"Meeting Room" <meetingroom@company.com>
ATTENDEE;CN=Meeting Room;CUTYPE=RESOURCE;ROLE=NON-PARTICIPANT;PARTSTAT=ACCEP
 TED:mailto:meetingroom@company.com

Additionally it doesn't look like optional attendees were imported/displayed. The ICS looks like:

ATTENDEE;CN=OptionalUser;CUTYPE=INDIVIDUAL;ROLE=OPT-PARTICIPANT;PARTSTAT=NEE
 DS-ACTION;RSVP=TRUE:mailto:optional@company.com

Additionally in the Import itself for all events I imported successfully there's some "Ignored Node" which I'm not sure what it would be referring to:

Screen Shot 2016-11-04 at 8.46.18 PM.png (48×960 px, 14 KB)


I tried to import via URL instead of file and got an error that looks related to networking (and probably configuration on our end which I'll have to get IT to look at). I just thought I'd include it as I've not come across it before.

URI "sekret" is not a valid fetchable resource. The domain "sekret" resolves to the address "x.x.x.x", which is blacklisted for outbound requests.

Thanks!

  • D16804 should resolve the sha1() issue.
  • That may also resolve the timeout: the sha1() issue made us take a much slower code path. But I might also need to optimize the import and make it fall back to occurring in the background for large .ics files. Let me know if you're still seeing timeouts after the sha1() fix.
  • When a user is only an email address, we currently show "Private User 1" instead of the address. We could just show the address, or could show some "totally impossible to guess" address like me.......om@mycompany.com, but I wanted to err on the side of caution about disclosing email addresses, at least to start with. We're fairly conservative about this in other parts of the product and it's hard to put the cat back in the bag if we consider this information open. In Phabricator, it's easier to import an .ics file and expose the data to all users than it is in more private-by-default / personally-focused applications with defaults aimed more toward individual users, which also gives me some pause. But this may get in the way of usability too much, so I could see changing this behavior. If a user has a display name ("Meeting Room" <meetingroom@company.com>), we should show them as "Meeting Room", we just avoid showing meetingroom@company.com if that's the only name we have.
  • I'll look at the OPT-PARTICIPANT thing, I would expect them to import but just not have an "optional" marker right now. To clarify, they didn't import at all, right?
  • The "Ignored Node" stuff is likely fine, things like VALARM sections inside VEVENT sections. At some point I'll make the error message more helpful ("Ignored a VALARM section; Phabricator does not send reminders for imported events.") but probably OK to ignore those warnings for now.
  • URI stuff is because security.outbound-blacklist defaults to blacklisting all private/internal IP space, to prevent attackers from discovering and interacting with private services on your network by using features like image proxying, macros, calendar import, etc. If you want to fetch data from a host in private IP space, you'll need to remove its address from the blacklist (or expose an alternate interface for it in public IP space).

I wasn't immediately able to reproduce the "OPT-PARTICIPANT" issue -- here's the file I used, trying to mimic your test case:

Here's the event I got out of it, showing "OptionalUser":

Screen Shot 2016-11-05 at 9.42.19 AM.png (529×731 px, 63 KB)

Did I misunderstand, and you just meant that some kind of "this user is optional" marker was expected, but missing?

If a user has a display name ("Meeting Room" <meetingroom@company.com>), we should show them as "Meeting Room", we just avoid showing meetingroom@company.com if that's the only name we have.

I think this is good default behavior and what I was expecting regarding not showing email address. My earlier comment was more about whether it could be shown as a Resource instead of an attendee (due to CUTYPE=RESOURCE;ROLE=NON-PARTICIPANT), or possibly the Location as the email matches what the LOCATION field is set as -- instead of "Private User" invitee. This sounds more like a new feature request and not a bug/issue - should I make a separate task?

Also thank you for the information about the blocking of internal IP space. I wasn't sure if it was a Phabricator configuration or system configuration that would need looked at.

Did I misunderstand, and you just meant that some kind of "this user is optional" marker was expected, but missing?

I wasn't expecting a marker or anything, the users didn't seem to import at all. I will need to investigate the event more to find out what's going on. It was one of the oldest ongoing meetings I have scheduled that's been modified several times.

My calendar is 7 years of gunk that I'd figure I could throw at Phabricator for a decent test case. For amusement here are just a few screenshots I've taken over the years when things have behaved strangely~

IMG_0525.jpg (1×640 px, 63 KB)

IMG_0549.jpg (1×640 px, 84 KB)

Haha, nice.

T11399 has some discussion of "Location" for events -- we could possibly implement a little piece of that now to split out resources from normal attendees, but my initial inclination is to wait until we do a real implementation of locations.

I tried reloading the import again and still hit the 30 second timeout. There's no indication in the web page UI that anything happens but I checked the logs. Is this just expected for a large import?

I would expect that we could parse a 3MB .ics file in less than 30 seconds, but obviously there's nothing to prevent you from uploading a 30MB or 300MB or 3GB .ics file, and at some point we'll take longer than 30 seconds to process it.

I'll change the web workflows to queue ICS files over some reasonably conservative limit for background processing. We already queue ICS files fetched via URI, so this should be fairly straightforward.

We can probably improve import performance too, but I assume your 3MB file is full of tons and tons of private stuff, and I'm hesitant to try to just guess what might be slow.


When stuff takes more than 30 seconds:

PHP has a builtin time limit (max_execution_time) which defaults to 30 seconds. If that limit is configured and we hit it, execution will cease.

Exactly what you see in the browser depends on the type of request you've made and various other configuration PHP/webserver configuration. We have a limited ability to respond when termination stops because of this time limit.

The most common result is a white page with the text "Maximum execution time of 30 seconds exceeded in...". However, if the page that timed out was an AJAX request (for example, because you dragged and dropped the file into the Calendar UI, or submitted a form which uses the AJAX "workflow" mechanism internally) that response will be sent to the browser over AJAX, and it will just fail in your Javascript error log.

So that behavior is expected if:

  • you have max_execution_time configured; and
  • the page reached the execution time limit; and
  • various other PHP configuration is configured in a normal way; and
  • you submitted an AJAX request:
    • dragging and dropping the ICS file into the Calendar UI is AJAX.
    • browsing to an import, clicking "Reload Import" with your left mouse button, and submitting the dialog from the same page is AJAX.
    • browsing to CalendarImportsImport Events and following that flow is NOT AJAX (today; it may become AJAX in the future to let PHUIFormFileControl use a fancier transfer mode).
    • command-clicking on the "Reload Import" button, or right-clicking the "Reload Import" button and selecting "Open in New Tab" or "Open in New Window", then clicking the button from the standalone page is NOT AJAX.
    • doing any of this with the persistent chat pane open may become AJAX in the near-ish future (T7912).
    • doing anything at all may become AJAX in the more distant future (Quicksand).

If the request was AJAX, the expected behavior is generally a browser-dependent error message in your browser/Javascript error log.

If the request was not AJAX, the expected behavior is generally a white page with a plain text "Maximum execution time..." error message.

However, both of these behaviors may potentially be modified by other PHP or webserver configuration.

Hi,

I was trying to import the french holidays from google ICS file : https://calendar.google.com/calendar/ical/fr.french%23holiday%40group.v.calendar.google.com/public/basic.ics

For some reason, every event imported was shown as a 3 day event whereas all event are only 1 day.

I'm on master, updated this morning.

Thank you for clarifying the behavior of the timeout. It wasn't clear to me whether I would wait for feedback after clicking the Reload button or begin refreshing the page. After the change to queuing of import the events were imported. I'm doing all this testing on a test instance of Phabricator so I would be able to add log/debug statements and re-import the calendar again to track down what might be slow. The import pulled in ~1700 events in ~3 minutes.

I haven't yet taken a deep dive into the optional invitees issue on the recurring event. The event started early in the year and at some point mid-year I removed them from future events. I checked that the event in Phabricator doesn't list them for past events imported. Maybe it's not related to being optional but the fact that the attendee list changed for a recurring event?

From importing my calendar there were a handful of warnings, not sure if they're meaningful though as they seem to uncover issues with my calendar more than issues with Phabricator:

2455	Calendar Import 1: ImportFile		Queued	Queued for background import: data size (3 MB) exceeds limit for immediate processing (512 KB).	Mon, Nov 7, 9:25 AM
2456	Calendar Import 1: ImportFile		Import Triggered	Started background processing.	Mon, Nov 7, 9:25 AM
2457	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-guess") while parsing ICS data (near line 1143): TZID "(GMT-05.00) Auto-Detected" is unknown, guessing "America/Atikokan" based on pattern "GMT-05.00".	Mon, Nov 7, 9:25 AM
2458	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-guess") while parsing ICS data (near line 1144): TZID "(GMT-05.00) Auto-Detected" is unknown, guessing "America/Atikokan" based on pattern "GMT-05.00".	Mon, Nov 7, 9:25 AM
2459	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 3542): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2460	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 3543): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2461	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 27284): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2462	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 27285): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2463	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 30123): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2464	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 30124): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2465	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 46822): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2466	Calendar Import 1: ImportFile		ICS Parser Warning	Warning ("warn-tzid-ignored") while parsing ICS data (near line 46823): TZID "Etc/GMT" is unknown, using UTC instead.	Mon, Nov 7, 9:25 AM
2467	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2468	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2469	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2470	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2471	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2472	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2473	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2474	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2475	Calendar Import 1: ImportFile		Ignored Node	Ignored unsupported "raw" node present in source.	Mon, Nov 7, 9:25 AM
2476	Calendar Import 1: ImportFile		Out of Range	Ignored an event with an out-of-range date. Only dates between 1970 and 2037 are supported.	Mon, Nov 7, 9:25 AM
2477	Calendar Import 1: ImportFile		Orphan	Found orphaned child event ("040000008200E00074C5B7101A82E00800000000C0129B110DBFD0010000000000000000100000009D72DD5AC5C56449BB92CD68DB465F98/1445616000") without a parent event ("040000008200E00074C5B7101A82E00800000000C0129B110DBFD0010000000000000000100000009D72DD5AC5C56449BB92CD68DB465F98/").	Mon, Nov 7, 9:25 AM
2478	Calendar Import 1: ImportFile		Orphan	Found orphaned child event ("1e4e32bf-dbd6-45d7-902e-e6fe4ae60bad/1369753200") without a parent event ("1e4e32bf-dbd6-45d7-902e-e6fe4ae60bad/").	Mon, Nov 7, 9:25 AM
2479	Calendar Import 1: ImportFile		Orphan	Found orphaned child event ("1e4e32bf-dbd6-45d7-902e-e6fe4ae60bad/1372172400") without a parent event ("1e4e32bf-dbd6-45d7-902e-e6fe4ae60bad/").	Mon, Nov 7, 9:25 AM
2480	Calendar Import 1: ImportFile		Orphan	Found orphaned child event ("55156e23-5df1-453c-b1a2-895fd247d245/1406545200") without a parent event ("55156e23-5df1-453c-b1a2-895fd247d245/").	Mon, Nov 7, 9:25 AM
2481	Calendar Import 1: ImportFile		Orphan	Found orphaned child event ("6247f0f4-d9f5-4d22-b90c-2f7b056e5926/1376300700") without a parent event ("6247f0f4-d9f5-4d22-b90c-2f7b056e5926/").	Mon, Nov 7, 9:25 AM

I was trying to import the french holidays from google ICS file...

Thanks! That looks like a bug, I'll take a look.


This timezone looks like something we should be able to guess, although using UTC instead is probably correct-enough (I think GMT and UTC differ only in trivial/hypotehtical/nitpicky ways?).

Etc/GMT

The 1970/2037 is probably a legitimate crazy event in 2050 or something, although I don't have an easy way to verify that. We've lost track of which line we were parsing by the time we reach that check. I could probably make the error message have some slightly better hint (like the event name).

I'd guess the orphaned children are real problems with the data given that there are only ~5 in 1,700 events, but it's possible we have some sort of bug there. You could search for events with:

UID: 1e4e32bf-dbd6-45d7-902e-e6fe4ae60bad

From the log, I expect there to be exactly two of them, and for both to have "RECURRENCE-ID" fields. If that's the case, I believe the data is at fault and we're handling it about the best we can. If you find something else, we may have a bug.

I checked for that UID and there are exactly two events, both with "RECURRENCE-ID":

RECURRENCE-ID;TZID="America/New_York":20130528T150000

So this definitely sounds like a bug with how the event was created. I checked the other two events which also both have "RECURRENCE-ID" and no other events with those IDs. I know we had previously found a bug with Zimbra in scheduling a recurring event which spans past the new year to be problematic, and it's likely those events are part of that (all around ~2013).

Looks like the majority of everything else came through though I haven't dug into many event details.

Ah, cool. I think that's a Zimbra issue, then.

I don't think it's illegal to put just an instance of event (UID + RECURRENCE-ID) in an .ics file without the parent event (UID only), but we can't really do anything with it without knowing about the parent, since a lot of information is usually inherited from the parent (particularly, the event duration). If this turns out to be really common we could try to brute force our way through these, but that would be pretty messy and it seems like our behavior is reasonable for now until we learn that Outlook generates these routinely or whatever.

At HEAD, I think the all-day issues with the French holidays above are fixed. You may need to reload the import for the change to take effect, but things seem OK for me locally now:

Screen Shot 2016-11-07 at 10.56.40 AM.png (318×391 px, 11 KB)

Let me know if you're still seeing issues -- the all-day stuff is a bit fiddly at the display layer.


@cspeckmim, if you're feeling particularly ambitious, you can now generate a profile for the import. This may be more than you want to deal with, but basically:

  • Install XHProf (possible documentation).
  • Run bin/calendar reload <id> --xprofile out.xhprof, where <id> is the import ID. This will run about ~2x slower than normal since the profiler is active, so ~6 minutes if it normally takes ~3 minutes.
  • Upload out.xhprof to this server (this is anonymous profiling data).

That may point me in the right direction of improving things. Could also be totally useless.

Here's the XHProf link:

https://secure.phabricator.com/xhprof/profile/PHID-FILE-wi77dncmpf3fbi5jgvl5/

It's more reasonable looking than I'd imagined, we're only spending 14 seconds parsing the ICS file:

https://secure.phabricator.com/xhprof/profile/PHID-FILE-wi77dncmpf3fbi5jgvl5/?symbol=PhutilICSParser::parseICSData

There's a bunch of stuff in the actual parsing that can probably be made cheaper, but even if that 14 seconds drops to 0 we aren't getting much out of it.

About 70% of the time is spent applying transactions; this is probably 30 seconds we can speed up pretty cheaply:

https://secure.phabricator.com/xhprof/profile/PHID-FILE-wi77dncmpf3fbi5jgvl5/?symbol=PhabricatorMarkupEngine::newDifferentialMarkupEngine

This could be batched and made cheaper but I'm not sure it's really worth it:

https://secure.phabricator.com/xhprof/profile/PHID-FILE-wi77dncmpf3fbi5jgvl5/?symbol=PhabricatorWorker::scheduleTask

There's a lot of other stuff that could be faster but I don't see anything super obviously silly going on.

ovo

Thanks for looking into the profile. The queuing I think fixes the issue with speed of importing. The next thing I was going to try was auto-import of the .ics file from a hosted location on hourly basis, and I'm assuming there's no proper way to only query for the newest things (at least though the API in Zimbra I've found) - so Phabricator would have to process the file every request. With queuing it shouldn't matter.

Attached is a single VEVENT that causes "Datetime has no timezone or viewer timezone." in the calendar view after importing, the import itself runs fine.

Thanks, that's extremely helpful. I think D16974 + D16975 should fix the bug (and an adjacent bug I hit while testing the first fix).

ICS issues seem to have calmed down so I'm going to call this resolved; future issues can just be filed as separate tasks.