Get rid of static.
Mar 22 2017
Address review feedback that I hadn't gotten to yet.
Note: I'm not sure why Harbormaster is failing.
- Cleaned up the Elasticsearch query and added comments describing the purpose of each clause
- Fixed a couple of bugs found in further testing
Ok I've reworked this quite a bit and I may have messed up somewhere in the process.
Mar 21 2017
In D17384#209987, @epriestley wrote:
Just to make sure I haven't missed anything:
- We currently write health checks but never read them, right? So there's no effect (other than the UI "Status" changing) when a service fails health checks? That seems fine for now, I just want to make sure I didn't miss a health check read somewhere.
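For illustration only, a health-check read would mean consulting that stored status when picking a host, along the lines of this hypothetical sketch (isHealthy() is an assumed method; nothing like this runs today):

  // Hypothetical sketch, not current behavior: consult health state
  // when selecting a host, instead of only showing it in the UI.
  function selectHealthyHost(array $hosts) {
    foreach ($hosts as $host) {
      if ($host->isHealthy()) { // isHealthy() is assumed, not a real API
        return $host;
      }
    }
    // No host passed its checks; fall back to the first one.
    return head($hosts);
  }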
Mar 20 2017
Just to make sure I haven't missed anything:
Mar 17 2017
I'm going to put D17497 + D17498 into this release even though they don't directly tackle the issue here and I think they're slightly risky changes (mostly because git ls-remote may have odd behaviors in some cases, and we don't currently use it in other workflows). But they may help with T12296 and general cluster load issues, and the followup changes will generally be more complicated to reason about (more locking/concurrency stuff), so I think getting these in earlier spreads risk out somewhat even though they're something to watch out for in this release.
Mar 16 2017
- Move the stats definitions into the engine so the status UI remains engine agnostic.
- Fix a bug where role => false was being treated like role => true in the UI
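To illustrate that bug class with a hedged sketch (the actual UI code may differ), checking only that the role key exists treats an explicit false like true:

  // Hedged sketch of the pitfall: presence of the key vs. its value.
  $roles = array('index' => false);

  if (isset($roles['index'])) {
    // Wrong: this branch runs even though the role is explicitly false.
  }

  if (!empty($roles['index'])) {
    // Right: an explicit role => false is treated as disabled.
  }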
I'm pleased to report that this has been live on Wikimedia's Phabricator for about a week without any incidents whatsoever. Additionally, we are in the process of migrating from Elasticsearch 2.x to 5.x, and the ability to write to multiple clusters is working out nicely for the transition.
Mar 15 2017
Probably bump the version unconditionally now
Mar 14 2017
(For example, if the goal was to very aggressively optimize for minimizing network traffic, we could read the entire repository history to figure out which refs were ancestors of other refs first? Then we could drop those and only ask for descendant refs. But this seems crazy, since it's saying that ~20 bytes of network traffic is more costly than like one hundred million disk I/O operations?)
It has to compare "What do I have" vs. "What do you have".
Do git ls-remote (which, curiously, seems to be significantly faster than git fetch even when git fetch is a no-op).
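As a hedged sketch (assuming libphutil's execx() and phutil_split_lines(); the real daemon code surely differs), comparing refs before deciding to fetch might look like:

  // Hedged sketch: list remote refs cheaply with `git ls-remote`,
  // then skip the fetch when nothing changed. $remote_uri is assumed.
  list($stdout) = execx('git ls-remote %s', $remote_uri);

  $remote_refs = array();
  foreach (phutil_split_lines($stdout, false) as $line) {
    list($hash, $ref) = preg_split('/\s+/', $line, 2);
    $remote_refs[$ref] = $hash;
  }
  // If $remote_refs matches the refs we already have locally, the
  // fetch is a no-op and we can skip it entirely.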
I think T12393 and this probably have a similar set of root causes.
Mar 13 2017
Mar 10 2017
- Added index stats to the status UI
- Separated MySQL status from Elasticsearch status, showing a different set of columns appropriate to each cluster type.
Mar 9 2017
- Removed unused health-record code from the PhabricatorSearchCluster class
- Added getDisplayName() back to PhabricatorSearchCluster because it's still needed.
Addressed latest round of feedback.
A few more minor things.
Mar 8 2017
// Keep offset + limit within Elasticsearch's 10,000-result window.
$limit = 10000 - $offset;
- Updated PhabricatorExtraConfigSetupCheck
- Capped result windows at 10,000 results
- Removed unused methods
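For context, Elasticsearch rejects queries where from + size exceeds index.max_result_window (10,000 by default), which is presumably what the clamp above guards against; a hedged sketch, with $requested as a hypothetical input:

  // Hedged sketch: keep from + size inside the default 10,000-result
  // window; $requested and $offset are hypothetical inputs here.
  $limit = max(0, min($requested, 10000 - $offset));
  if (!$limit) {
    // Past the window: return nothing rather than let the cluster
    // reject the query outright.
    return array();
  }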
This is now live on https://phab-01.wmflabs.org for testing. Everything seems to be working well, including the health monitoring.
In D17384#209367, @epriestley wrote:
This is shaping up nicely; a couple of other minor inlines.
This is shaping up nicely; a couple of other minor inlines.
Mar 7 2017
Fix unit test case.
Getting closer...
Mar 4 2017
In D17384#208882, @epriestley wrote:
- Using the same objects as both Host and Service feels confusing to me. I think this would probably be clearer as separate Service and Host classes? Like PhabricatorMySQLSearchClusterService extends PhabricatorSearchClusterService and PhabricatorMySQLSearchClusterHost extends PhabricatorSearchClusterHost or similar. Particularly because setHostRefs() seems like it's getting called with a raw dictionary in one case and a list of objects in another? And then there's weird magic around getHostRefs() for the MySQL case?
I'll split out the changes to the engine if I can figure out how to do that... Update coming soon.
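A minimal sketch of the split suggested above, using the class names from the quote (the bodies are assumed, not the real implementation):

  // Hedged sketch: separate Service and Host hierarchies so each role
  // has its own type, and setHostRefs() always takes typed objects.
  abstract class PhabricatorSearchClusterService {
    private $hosts = array();

    public function setHostRefs(array $hosts) {
      assert_instances_of($hosts, 'PhabricatorSearchClusterHost');
      $this->hosts = $hosts;
      return $this;
    }

    public function getHostRefs() {
      return $this->hosts;
    }
  }

  abstract class PhabricatorSearchClusterHost {}

  final class PhabricatorMySQLSearchClusterService
    extends PhabricatorSearchClusterService {}

  final class PhabricatorMySQLSearchClusterHost
    extends PhabricatorSearchClusterHost {}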
Mar 3 2017
General stuff:
@epriestley: OK, I believe this addresses all of your feedback; other than documentation, it should be very close to finished.
Address @epriestley's feedback about the tooltip and string concatenation
- Fixed up the CLI workflows for search init and search index
- Misc other cleanup
Fix Elasticsearch setup checks
Feb 20 2017
Dec 13 2016
Dec 5 2016
This has been running cleanly in production for roughly two weeks, and appears stable. secure001 stopped writing Files data at F1943597 and we're now at F2078289 on secure003. We saw a couple of minor setup issues (mostly: exception messages not being tailored enough) but no fundamental issues.
Nov 23 2016
Nov 22 2016
This seems to be working now. I'm going to let it sit in production for a while and see if any issues crop up before considering it resolved, but it seems like everything is working smoothly.
I configured 003 to replicate to 004:
Okay, we're headed back into readonly mode shortly to set up replication. I'm going to verify D16916 along the way so there may be some "partitions disagree about life" errors.
Here's what I've done so far:
I'm partitioning secure.phabricator.com now. Things will drop into read-only mode for a bit.
- T11908 is a followup for executing queries for multiple applications on a single connection. I believe the pathway for that is straightforward and fairly short, but that no install would really reap substantial benefits from it today, so I don't expect to pursue it for some time.
- I believe everything else is now complete, so I'll put this in production as soon as everything here lands and we can see what catches on fire.
Nov 21 2016
We have at least one tricky issue remaining: when applying storage upgrades, we currently apply them like this:
Nov 19 2016
General state of the world here:
Nov 16 2016
Nov 13 2016
Nov 12 2016
Nov 8 2016
It's just very important to thoroughly explore the applications.
Sep 29 2016
Sep 27 2016
This went out a while ago and I confirmed the fix in production.
Sep 23 2016
Sep 21 2016
I've banged on this a reasonable amount locally without issues, and the originating instance reports that these patches seem to have calmed things down in production, so it seems like this pretty much just worked.
Sep 20 2016
Sep 8 2016
From reviewing the code, it appears that this should be handled correctly already.
Sep 5 2016
Sep 2 2016
I believe this should be fixed in HEAD of master. It should promote to stable in about 24 hours.