We don't know how to reproduce this, so we can't move forward.
Mar 18 2017
The setsid() change appears to have resolved the issue, in the sense that daemons now start cleanly every time without receiving any inexplicable signals.
The MySQL stuff seems to be in the old daemons (them trying to do logging while MySQL restarts), not the new daemons, and not the cause of the issue.
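As a sketch of the mechanism (in Python rather than Phabricator's PHP, and with a hypothetical helper name), detaching a child into its own session with setsid() means signals delivered to the launcher's session, such as a SIGHUP when the deploy shell exits, no longer reach the daemon:

```python
import os


def start_detached(target):
    """Fork a child, detach it into its own session with setsid(),
    then run target() in the child.

    Hypothetical sketch of the fix described above, not the actual
    overseer code. After os.setsid() the child is a session leader
    in a new session with no controlling terminal, so signals aimed
    at the parent's old session stop arriving.
    """
    pid = os.fork()
    if pid > 0:
        return pid  # parent: child pid of the now-detached process
    os.setsid()      # child: new session, no controlling tty
    try:
        target()
    finally:
        os._exit(0)  # never return into the parent's code path
```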
Mar 16 2017
Do this man
Mar 13 2017
Mar 8 2017
In our case these are commits from landed Differential diffs. Permissions for the web user are there and correct, but I'm not sure I understand all the details around the daemon vs. the web user, and the daemon is doing the work for this git commit parser.
mkdir: cannot create directory ‘/var/storage/3f/49’: Permission denied
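As a quick diagnostic (a hypothetical helper, not part of Phabricator), one could probe whether the user the daemons run as can actually write to the storage root; the default path below is taken from the error message above and may differ per install:

```python
import os
import pwd


def check_storage_writable(path="/var/storage"):
    """Report whether the current user can create entries under path.

    Local-disk file storage creates sharded subdirectories (the
    '3f/49' in the error above), so the daemon user needs write and
    execute permission on the storage root.
    """
    user = pwd.getpwuid(os.geteuid()).pw_name
    if os.access(path, os.W_OK | os.X_OK):
        return "user '%s' can write to %s" % (user, path)
    return "user '%s' cannot write to %s" % (user, path)
```

Running this as the daemon user (rather than the web user) would distinguish a daemon-side permission problem from a web-side one.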
I haven't been able to reproduce this locally by artificially making services unavailable or connections fail. Debugging this in production will require momentary service interruptions on admin until I can narrow things down so I'm going to wait until off-peak to poke at it.
Mar 7 2017
Occasionally, even after running the re-parse command, the basics of the commit will import but the content itself will not. A phd restart will almost immediately allow the commit to then import.
More info: the commit contained binary files (executables actually).
We now have one that's failing even when forcing a reparse:
Is there a way to retrieve the logs from a failing task? Is there any value in providing the logs for a successful task? I'm not really sure what else to say here besides that it's a consistent yet seemingly random issue on our repositories. When we run the trace, it somehow "bumps" the task and things succeed, so it would be difficult to provide detail on an error.
There isn't enough information here for us to reproduce this, or even begin.
Mar 5 2017
In terms of application/code stability, it would be good to handle this properly.
Specifically, I suspect it won't affect repo because db restarts separately, so the database won't be unavailable when daemons on repo restart.
It looks like this might be an issue with MySQL not being available yet when the daemons start. The deployment script restarts MySQL, then immediately restarts the daemons, and a bunch of this stuff is ending up in the log in the 1-2 seconds after the restart:
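One way a deployment script could avoid this race (a hypothetical helper, not the actual deployment code) is to poll the database port between restarting MySQL and restarting the daemons, so the daemons never come up against a database that is still starting; host and port here are assumptions:

```python
import socket
import time


def wait_for_mysql(host="127.0.0.1", port=3306, timeout=30.0, interval=0.5):
    """Poll a TCP port until it accepts connections or timeout expires.

    Returns True once a connection succeeds, False if the deadline
    passes first. Called between 'restart mysql' and 'restart daemons',
    this closes the 1-2 second window described above.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Having the daemons themselves retry their initial connection with backoff would be the more robust fix, since it also covers restarts that the deploy script doesn't know about.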
Feb 28 2017
I've deployed everything so I believe this is now resolved. I'll keep an eye on it going forward.
The pool appears to be scaling up and down properly now. I'm going to deploy the fix to the secure and repo tiers since it may affect all instances.
I'm picking D17433 now and sending it to admin.
I'm picking those to stable and sending them to production now.
Feb 24 2017
Hibernating daemons currently show as "Waiting" in the Daemon console, but I'm not going to worry about that for now.
Feb 22 2017
No time like the present.
It is expected that taskmasters will exit and restart after an unexpected failure, but this should not trigger setup warnings. I'll see if I can reproduce this.
Feb 21 2017
I don't believe we've seen this pop up again. I can't be certain, because right now all the log files are empty, but presumably that means the errors went away.
T11708 has almost nothing to do with this, but the fix for this will rewrite the code that's running into issues and probably moot them.
I'm just going to merge this into T12115, which isn't really related, but will rewrite this code and probably "fix" this, since a reproduction case seems elusive.
This could probably be built with RRULEs now, but we don't currently have use cases / plans around a general-purpose cron-like tool.
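For reference, RFC 5545 recurrence rules (RRULEs) can express cron-like schedules declaratively; a hypothetical "every Monday at 09:00" rule would look like:

```
RRULE:FREQ=WEEKLY;BYDAY=MO;BYHOUR=9;BYMINUTE=0
```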
Use cases seem to be fairly well covered now.
This is 2+ years old and autoscaling probably covers it now. T5401 is probably a more tangible attack on this.
This got a little more work when clustering was written but it's essentially a log of two-year-old ghost sightings at this point and not actionable.
I guess this one can live for now since that's reproducible/actionable.
Probably a dupe of T11708? I'm just going to kill this one since it's old as dirt.
I think it is likely that this was resolved by D17123. If it wasn't, we don't have a reproduction case anyway so we can't move forward. I'm going to call this one dead until more information turns up.
We were never able to reproduce this convincingly and as far as we know the reporting install no longer uses Phabricator, so I'm going to close this out.
I believe I have the first part of this (restructuring the code into a more sensible Overseer > Pool > Daemon sort of thing) working, but it could use more testing. I'm going to see if we have anything else in Daemons that I can fix while I'm here to help me kick the tires a bit.
Feb 20 2017
Feb 14 2017
closed per user request
@epriestley Please close this task (I can't).
Feb 2 2017
This is from a million years ago and I now think we should only respect commit messages. If "Fixes Txxx" works or ever worked in comments, I'd say we should actually undo that.
Jan 31 2017
@epriestley That's fair, I'll see what I can do.
Because of the complexity of building a reproduction case and high chance that this is a wild goose chase, we'll move forward with this after a community member confirms it reproduces for them. See T12134 for some discussion. See T12129 for a similar recent report which was a time-consuming wild goose chase.
Jan 30 2017
I upgraded last Friday to 2604c5af55f654d36f8db2f080b96486c4572216, so far this exception has not popped up again. I will check again later in the week.
Jan 17 2017
Jan 12 2017
I'm just going to merge this into T9640, it isn't meaningfully different from an implementation perspective and is the major compatibility issue.
I haven't explained it correctly. What I mean is that this task is in "Needs Triage" now, but once you know roughly how much work it will need, you can set its priority (low, high, whatever).
I can't think of a reason to prioritize this. Specifically, it seems like since Phabricator for most (all?) companies is a business critical piece of software, you'd always choose to run it on the most reliable / stable version of PHP. Is there a reason we should consider that not to be the case?
Now that PHP 7.1 is released, you could assess what will need to change and assign a priority to this task.