Upgrading: Calendar Storage Migrations
Closed, ResolvedPublic

Description

We've recently made some changes to the internal storage representation of Calendar events. These changes primarily accommodate a broader range of date/time specifications and recurrence rules so we can represent (almost) any event which can be represented by an .ics file, which will let us import most ICS files in the future (see T10747).

(The features we do not plan to support are mostly for product reasons: for example, ICS files can represent events which occur once per second, but there is no real use case for these events in most calendaring software and other calendaring applications generally do not let you define them.)

If everything goes well, these changes should have no visible impact. They have been tested reasonably thoroughly, but the migrations are complex (especially the migration in D16664) and calendar data is also complex. It's possible that I missed some cases and that the migrations will incorrectly move/reschedule/corrupt existing events or cause other bugs.

Let us know if you notice anything suspicious: although Calendar is still a prototype, we don't expect to mangle any data more than strictly necessary. The migrations leave the old data behind for now and it is likely that we can fix any issues which are discovered in the relatively near future in a followup migration.

If your environment facilitates it and you have Calendar data in use, you may want to save a backup first or try these migrations against test data.

These should be the last major changes we need to make to Calendar event storage before we unprototype the application.

That's some pretty awesome support for an unsupported prototype. Kudos!

epriestley moved this task from Backlog to Unprototype (v1) on the Calendar board.Oct 10 2016, 4:13 PM
20after4 added a comment.EditedOct 12 2016, 8:41 AM

After running the migrations, some instances of recurring events end up with null for the utcInstanceEpoch column. I'm not sure how this happened, however, as some of the instances have a value that matches utcInitialEpoch and I'm not entirely sure what the purpose of utcInstanceEpoch would be. If it matches InitialEpoch then it seems kinda useless, as does null...

Here is a dump of relevant columns from my test instance:

idphidinstanceOfEventPHIDutcInitialEpochutcInstanceEpoch
1PHID-CEVT-4iysr4yyvhvm7yfm6fss<null>1466791200<null>
2PHID-CEVT-hywjhsqonixmxrtlvx3kPHID-CEVT-4iysr4yyvhvm7yfm6fss1466964000<null>
3PHID-CEVT-glm5maaudtgwbrpa6rkpPHID-CEVT-4iysr4yyvhvm7yfm6fss1467050400<null>
4PHID-CEVT-msyc6goau2w24alvwbmkPHID-CEVT-4iysr4yyvhvm7yfm6fss1467309600<null>
5PHID-CEVT-efxna7uuqup7s7nj72s6PHID-CEVT-4iysr4yyvhvm7yfm6fss1469642400<null>
6PHID-CEVT-mhwdztl5k4g7r6jgkkxtPHID-CEVT-4iysr4yyvhvm7yfm6fss1468519200<null>
7PHID-CEVT-cbmfn5coeblydmbrr6gyPHID-CEVT-4iysr4yyvhvm7yfm6fss1468346400<null>
8PHID-CEVT-aazsevmcn7mhm2lealdrPHID-CEVT-4iysr4yyvhvm7yfm6fss1468260000<null>
9PHID-CEVT-6p4g3m5pkhjks7vtf5axPHID-CEVT-4iysr4yyvhvm7yfm6fss1468173600<null>
10PHID-CEVT-eko7kceoq66jigtiyfdmPHID-CEVT-4iysr4yyvhvm7yfm6fss1467828000<null>
11PHID-CEVT-io2pq7ar3vksbacrsav6PHID-CEVT-4iysr4yyvhvm7yfm6fss1467914400<null>
12PHID-CEVT-5kxlqu4mlgy6twwumptqPHID-CEVT-4iysr4yyvhvm7yfm6fss1468000800<null>
13PHID-CEVT-tgud7hywmdefkjcrlzgbPHID-CEVT-4iysr4yyvhvm7yfm6fss1468087200<null>
14PHID-CEVT-k5xubunsioe74te7enujPHID-CEVT-4iysr4yyvhvm7yfm6fss1467568800<null>
15PHID-CEVT-aquxfkeall4mz7l2qmxvPHID-CEVT-4iysr4yyvhvm7yfm6fss1467655200<null>
16PHID-CEVT-amau4cwhlqvhtdzkqewhPHID-CEVT-4iysr4yyvhvm7yfm6fss1467741600<null>
17PHID-CEVT-e5xpbmh35mq6z4hqmq5vPHID-CEVT-4iysr4yyvhvm7yfm6fss1468605600<null>
18PHID-CEVT-az42bwe3tvrhbncg7rvyPHID-CEVT-4iysr4yyvhvm7yfm6fss1469124000<null>
19PHID-CEVT-rxgegoklzft4x6rruaz5PHID-CEVT-4iysr4yyvhvm7yfm6fss1469296800<null>
20PHID-CEVT-za4cy565pcqkicwcydj6PHID-CEVT-4iysr4yyvhvm7yfm6fss1469210400<null>
21PHID-CEVT-rdc4atofqksbcmww2ge2PHID-CEVT-4iysr4yyvhvm7yfm6fss1468951200<null>
22PHID-CEVT-wbggzsy2mswvqryc2wba<null>1474956000<null>
23PHID-CEVT-ouwzm4ugxat6n6gcdtxgPHID-CEVT-wbggzsy2mswvqryc2wba14750424001475042400
24PHID-CEVT-wjfj3yl2f4nbjfob6smiPHID-CEVT-wbggzsy2mswvqryc2wba1475128800<null>
25PHID-CEVT-7l4e5f47rhwtxvuxllnmPHID-CEVT-wbggzsy2mswvqryc2wba1475215200<null>
26PHID-CEVT-vo2hgrpfyllsx5t7nbckPHID-CEVT-wbggzsy2mswvqryc2wba1475301600<null>
27PHID-CEVT-oc6ciitk4o3koexyk6mqPHID-CEVT-wbggzsy2mswvqryc2wba14762520001476252000
28PHID-CEVT-meuxhcoloea3rwdagvzfPHID-CEVT-wbggzsy2mswvqryc2wba14761656001476165600
29PHID-CEVT-3lioposdhvdysgq3uc7nPHID-CEVT-wbggzsy2mswvqryc2wba14760792001476079200
30PHID-CEVT-4nfr2yr3s4h754xmloctPHID-CEVT-wbggzsy2mswvqryc2wba14757336001475733600
31PHID-CEVT-l6z3gi6vdzbngocsoxt2PHID-CEVT-wbggzsy2mswvqryc2wba14758200001475820000
32PHID-CEVT-z4tlx5ux22mpwobv4p5kPHID-CEVT-wbggzsy2mswvqryc2wba14759064001475906400
33PHID-CEVT-6urdv2g67kwy2flfdzi3PHID-CEVT-wbggzsy2mswvqryc2wba14759928001475992800
34PHID-CEVT-gs2obvrkd375oqhq5ehfPHID-CEVT-wbggzsy2mswvqryc2wba14753880001475388000
35PHID-CEVT-aq3t7ypvukxgplqv25vcPHID-CEVT-wbggzsy2mswvqryc2wba14754744001475474400
36PHID-CEVT-dnhobold4esyuge4dtckPHID-CEVT-wbggzsy2mswvqryc2wba14755608001475560800
37PHID-CEVT-jw5uffwqhxplfmya45qbPHID-CEVT-wbggzsy2mswvqryc2wba14756472001475647200

FWIW, the only problem this seems to be causing is with ics export which fails at

PhutilCalendarAbsoluteDateTime::newFromEpoch

because it's called with a null argument from:

phabricator/src/applications/calendar/storage/PhabricatorCalendarEvent.php:617

Would it be safe for me to write a migration that sets utcInstanceEpoch equal to utcInitialEpoch on every row in the calendar_event table?

epriestley added a comment.EditedOct 12 2016, 4:07 PM

It's safe to do that (set utcInstanceEpoch = utcInitialEpoch for all events with a parentPHID where utcInstanceEpoch is currently NULL), with one minor caveat.

The utcInstanceEpoch and utcInitialEpoch will differ only if the event is a recurring event which has been rescheduled. In this case, utcInitialEpoch will hold the rescheduled time (in UTC) while utcInstanceEpoch will hold the original time (in UTC).

So doing that update might get instance epochs wrong for recurring events which have been rescheduled. However, this shouldn't really matter because utcInstanceEpoch is only really used to deduplicate events imported from external sources, and Phabricator-originated events should normally not be re-imported to the same install. If any of the values are wrong, I think it would only cause a problem if someone downloaded the ICS file for the event, then uploaded it again. We may simply disallow this (decline to import events which we already have the authoritative copy of) since it's probably a big mess of mistakes and user error and nonsense anyway.

Broader context is that .ics files define events with two fields:

UID: <some unique string>
RECURRENCE-ID: 20161112T120000Z

For recurring events, all events have the same UID, and the different instances are differentiated by RECURRENCE-ID. I believe .ics does this (instead of identifying them with a sequence number, like "instance #3", as we generally do in the UI) because it also allows arbitrary additional instances to be added (via RDATE), so users can say "this event repeats every week, and also four times this Thursday" and suddenly you have 4 events between "instance 6" and "instance 7". I plan to accommodate this but keep instance numbers in the UI where possible, since they're a little easier for humans to deal with and the RDATE stuff should be rare.

My best guess about these events is that they might have been created by clicking "Next >" while there was still that bug that let you keep clicking it and ignored "Repeat Until"? If so, the new code would refuse to generate an instance epoch for the invalid sequence indexes, and we could end up with null. Not sure if that's actually what happened, but it's my best guess at a plausible explanation.

epriestley closed this task as Resolved.Oct 19 2016, 3:25 PM

I'm going to close this since things seem to have at least mostly worked and installs have had a couple of weeks to yell at us if we blew anything major up, but let us know if anyone runs into further issues.