Page MenuHomePhabricator

Don't clog feed with old commits when importing a repo
Closed, ResolvedPublic

Description

When importing a repository, each of its commits gets put onto the top of my feed. I think they should either be omitted from the feed or they should use the time of commit for positioning (instead of time of import).

Event Timeline

sshannin raised the priority of this task from to Needs Triage.
sshannin updated the task description. (Show Details)
sshannin added a project: Diffusion.
sshannin updated the task description. (Show Details)
sshannin added a subscriber: sshannin.

Maybe turn off feed notification during import?

Generating feed activity for historical commits seems odd to me in general and I don't recall it but maybe that is why.

epriestley triaged this task as Normal priority.Feb 18 2015, 7:33 PM
epriestley added a project: Phacility.
epriestley moved this task from Backlog to v1 Open Beta on the Phacility board.

We don't do a good job with this workflow right now:

  • Create a new empty repository.
  • Push a ton of changes to it.

In particular:

  • We'll send some notifications even with Publish/Notify off.
  • The default behavior for this workflow should be to just import these commits and not go crazy with them, similar to when you import an external repository for the first time.

This likely breaks out as a rule like "don't really mark a repository as imported until it has some commits in it or we get a push where the oldest commit is less than 24 hours old".

I still think that timestamping the feed events in the phab DB based on the timestamp of the commits in the repo is ideal here (although I don't know if it's feasible).

  • This eliminates the initial problem here (flooding feed with old stuff)
  • It actually works nicely for new instances. When you import your active existing repo, the feed shows your recent commits properly.

Basically this just makes the feed agnostic to when you actually hit the import and displays the repo data as if it had always been tied to your phab instance.

I think the value of backdating feed is very small, and the current behavior is inconsistent anyway (not all commits get published). We could possibly consider backdating someday, but we'd probably want other use cases to support building the capability first.

To surface the details of D11827 here for future users, once that lands we'll begin using a heuristic to detect whether a push is an import of an existing repository or the addition of new commits. At the time of writing, the rule identifies a push as an import if:

  • the repository is empty; and
  • the push includes 8 or more commits.

This limit is arbitrary but seems like it may be reasonable.

We do not currently look at commit dates (e.g., are any commits older than 24 hours old?) or authorship (e.g., were the commits made by multiple different authors)? These might be good signals to look at if this rule turns out to have false negatives (i.e., if users sometimes push more than 7 commits to empty repositories and expect them to publish) or false positives (i.e., if users sometimes "import" existing repositories with 7 or fewer commits and are dissastisfied with them failing to publish), but they're more complicated to examine and the actual thing we care about may really be the number of commits, since this only really creates a problem if it spams people / feed / etc and hopefully no one will get too upset about a maximum of 7 notifications, even if the heuristic got the "wrong" result.