Daemon skipping commit
Closed, Invalid (Public)

Description

I just updated my installation, bouncing all the daemons and aphlict as part of the process.

When it came back up, a daemon noticed that it had somehow missed an old commit, so it pulled it, inserted it into the feed, triggered Herald, etc.

We noticed because we got an email for an old commit. I went back and looked at the feed and that commit was indeed missing earlier.

This is an SVN repo if that's relevant. I have a dump from directly before the update if that's useful.

Event Timeline

sshannin raised the priority of this task to Needs Triage.
sshannin updated the task description.
sshannin added a project: Daemons.
sshannin added a subscriber: sshannin.

Looking at the daemon logs, there were various MySQL failures throughout the day (too many connections), but it seems like it was able to recover from those.

What's the main issue here?

Nothing here sounds wrong, but maybe I'm missing some details. Unfortunately, it doesn't sound easily reproducible either, off the top of my head.

Sorry, there was a lot of background/setup story there.

The main issue is that a commit got dropped and there was no indication until I happened to bounce the whole stack 4 hours later.

I think the core issue is that if your daemons are failing (MySQL issues?), we should somehow tell you. Picking up where they left off when kicked is what I already expect them to do. Maybe @epriestley has better ideas than me though.

Yeah, that's the main aspect of it.

Also though, the fact that the old commit got skipped (but newer ones made it in) seemed a bit off as well.

epriestley claimed this task.

There isn't enough information to reproduce this issue.

We don't inform administrators about individual daemon failures because they are common, usually not actionable, and sometimes very frequent. The daemon console and daemon logs report status.

Commits are processed in arbitrary order to improve performance and allow a repository to remain functional if there's an issue processing one commit. With in-order processing, the repository would be frozen by a single failure.
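To illustrate the tradeoff described above, here is a minimal Python sketch that contrasts in-order processing with independent, retryable per-commit tasks. It is not Phabricator's actual implementation; `process_in_order`, `process_any_order`, `make_flaky_handler`, and the `max_attempts` parameter are hypothetical names made up for this example.

```python
from collections import deque


def process_in_order(commits, handler):
    """Stop at the first failure: every later commit stays unprocessed."""
    done = []
    for commit in commits:
        try:
            handler(commit)
        except Exception:
            break  # one bad commit freezes everything behind it
        done.append(commit)
    return done


def process_any_order(commits, handler, max_attempts=3):
    """Treat each commit as an independent task that can be retried later."""
    queue = deque((commit, 1) for commit in commits)
    done, failed = [], []
    while queue:
        commit, attempt = queue.popleft()
        try:
            handler(commit)
            done.append(commit)
        except Exception:
            if attempt < max_attempts:
                # Requeue at the back; newer commits proceed in the meantime.
                queue.append((commit, attempt + 1))
            else:
                failed.append(commit)
    return done, failed


def make_flaky_handler(bad_commit, failures):
    """Build a handler that fails `failures` times for one commit (simulated)."""
    remaining = {"n": failures}

    def handler(commit):
        if commit == bad_commit and remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("simulated database error")

    return handler


if __name__ == "__main__":
    commits = ["r1", "r2", "r3"]
    print(process_in_order(commits, make_flaky_handler("r2", 1)))
    print(process_any_order(commits, make_flaky_handler("r2", 1)))
```

In this sketch, a transient failure on `r2` leaves `r3` unprocessed under in-order handling, while the independent-task version completes `r3` first and picks up `r2` on a later attempt. That is the same shape as an older commit appearing after newer ones.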

Understood. I have the daemon logs and db dumps if you think they would be helpful.

I don't think the out-of-order processing is a problem. I do think that a commit delayed by 4 hours without any warning seems a bit suspect though (this is not a high-traffic repo). It's not clear to me that the commit would ever have been picked up if I hadn't bounced the daemons for other reasons.