20after4 (Mukunda Modell)
Release Engineer @ Wikimedia Foundation

User Details

User Since
Nov 28 2011, 9:35 AM (278 w, 2 d)
Availability
Available

Recent Activity

Yesterday

20after4 committed rP654f0f6043f8: Make messages translatable and more sensible. (authored by 20after4).
Make messages translatable and more sensible.
Tue, Mar 28, 11:17 PM
20after4 closed D17578: Make messages translatable and more sensible. by committing rP654f0f6043f8: Make messages translatable and more sensible..
Tue, Mar 28, 11:17 PM
20after4 created D17578: Make messages translatable and more sensible..
Tue, Mar 28, 11:14 PM
20after4 accepted rP8879118b696f: Fix a mid-air collision around SearchService roles.
Tue, Mar 28, 10:13 PM
20after4 added a task to D17575: Provide some guidance about elasticsearch in cluster docs: T12450: New Search Configuration Errata.
Tue, Mar 28, 10:07 PM
20after4 added a revision to T12450: New Search Configuration Errata: D17575: Provide some guidance about elasticsearch in cluster docs.
Tue, Mar 28, 10:07 PM · Search
20after4 created D17575: Provide some guidance about elasticsearch in cluster docs.
Tue, Mar 28, 10:06 PM
20after4 accepted D17574: Re-run config validation from `bin/search`.
Tue, Mar 28, 9:46 PM
20after4 committed rP699228c73b74: Address some New Search Configuration Errata (authored by 20after4).
Address some New Search Configuration Errata
Tue, Mar 28, 8:19 PM
20after4 added a commit to T12450: New Search Configuration Errata: rP699228c73b74: Address some New Search Configuration Errata.
Tue, Mar 28, 8:19 PM · Search
20after4 closed D17564: Address some New Search Configuration Errata by committing rP699228c73b74: Address some New Search Configuration Errata.
Tue, Mar 28, 8:19 PM
20after4 added a comment to D17564: Address some New Search Configuration Errata.

Just as a general workflow suggestion, I'd encourage you to do this as a bunch of small changes instead of one big "fix everything" change

Tue, Mar 28, 8:16 PM
20after4 added a comment to D17572: Make `bin/search init` messaging a little more consistent.

Seems legit.

Tue, Mar 28, 8:00 PM
20after4 accepted D17572: Make `bin/search init` messaging a little more consistent.
Tue, Mar 28, 8:00 PM
20after4 accepted D17573: Remove PhabricatorSearchEngineTestCase.
Tue, Mar 28, 7:59 PM
20after4 accepted D17571: Fix isReadable() and isWritable() in SearchService.
Tue, Mar 28, 7:58 PM
20after4 committed rP9e2f263bb49c: Add repositories to fulltext search index. (authored by 20after4).
Add repositories to fulltext search index.
Tue, Mar 28, 7:58 AM
20after4 closed D17300: Add repositories to fulltext search index. by committing rP9e2f263bb49c: Add repositories to fulltext search index..
Tue, Mar 28, 7:58 AM · Diffusion, Search
20after4 updated the diff for D17300: Add repositories to fulltext search index..

push to staging for harbormaster

Tue, Mar 28, 7:56 AM · Diffusion, Search
20after4 updated the diff for D17300: Add repositories to fulltext search index..

fix '\n'

Tue, Mar 28, 7:55 AM · Diffusion, Search
20after4 updated the summary of D17564: Address some New Search Configuration Errata.
Tue, Mar 28, 7:41 AM
20after4 updated the diff for D17564: Address some New Search Configuration Errata.

Better formatting of setup warning messages.

Tue, Mar 28, 7:41 AM
20after4 requested review of D17564: Address some New Search Configuration Errata.

This fixes the stemmer and tokenizer to do a better job of matching words.separated.by.punctuation, as well as other issues found by @epriestley.

Tue, Mar 28, 1:42 AM
20after4 updated the summary of D17564: Address some New Search Configuration Errata.
Tue, Mar 28, 1:40 AM
20after4 edited the description of T12450: New Search Configuration Errata.
Tue, Mar 28, 1:38 AM · Search
20after4 added a comment to T12450: New Search Configuration Errata.

I've updated D17564: Address some New Search Configuration Errata to address the tokenization and word stemming issues.

Tue, Mar 28, 1:35 AM · Search
20after4 updated the diff for D17564: Address some New Search Configuration Errata.
  • Fixed the stemmer. user matches users and vice versa.
  • Added a different tokenizer so that this.is.a.test tokenizes to the following:
  • this.is.a.test
    • this
    • is
    • a
    • test
Tue, Mar 28, 1:34 AM
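
The tokenization described in the diff update above (the whole dotted word plus each of its parts) can be illustrated with a minimal sketch. This is not the Elasticsearch analyzer configuration from D17564; the `tokenize` function is a hypothetical stand-in that only mimics the described output.

```python
import re


def tokenize(text):
    """Emit each whitespace-delimited word, followed by its
    punctuation-separated parts when they differ from the word itself.

    Hypothetical illustration only; not the analyzer used in D17564.
    """
    tokens = []
    for word in text.split():
        tokens.append(word)
        # Split on any run of non-word characters and drop empty pieces.
        parts = [part for part in re.split(r"[^\w]+", word) if part]
        if parts != [word]:
            tokens.extend(parts)
    return tokens


print(tokenize("this.is.a.test"))
```

With this approach a search for a single part (such as "test") and a search for the full dotted word can both match the same document.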

Mon, Mar 27

20after4 updated the diff for D17564: Address some New Search Configuration Errata.

trying once more...

Mon, Mar 27, 2:49 PM
20after4 updated the diff for D17564: Address some New Search Configuration Errata.

Try to make harbormaster happy by setting repository.callsign globally in ~/.arcrc

Mon, Mar 27, 2:48 PM
20after4 created D17564: Address some New Search Configuration Errata.
Mon, Mar 27, 2:41 PM
20after4 added a revision to T12450: New Search Configuration Errata: D17564: Address some New Search Configuration Errata.
Mon, Mar 27, 2:41 PM · Search
20after4 added a comment to T10640: Allow application queries to be promoted as global search modes.

Maniphest advanced search is somewhat buried, indeed. I think one easy solution to this would be to add "Task search" to the main phab menu (using the new custom menus feature)... In fact, I think I will do that now at https://phabricator.wikimedia.org

Mon, Mar 27, 10:21 AM · Search, Feature Request
20after4 awarded D17563: Cleaner fullscreen / preview states for Remarkup bar a Love token.
Mon, Mar 27, 10:14 AM
20after4 added a comment to T12450: New Search Configuration Errata.

  • Searching for f*a*c*t*o*r*y*s*u*r*p*l*u*s*z*z*q*q*z*z*q*q produces nonsensical results (many results, when I would expect no results: the results do not contain that sequence of letters in order).
  • Searching for user fails to find task Grant users tokens when a mention is created, suggesting that stemming is not working.
  • Searching for users finds that task, but fails to find a task containing "per user per month" in a comment, also suggesting that stemming is not working.
  • Searching for maniphest fails to find task maniphest.query elephant, suggesting that tokenization in ElasticSearch is not as good as the MySQL tokenization for these words (see D17330).
Mon, Mar 27, 9:45 AM · Search
20after4 added a comment to T12450: New Search Configuration Errata.

f*a*c*t*o*r*y*s*u*r*p*l*u*s*z*z*q*q*z*z*q*q returns the same results as
f a c t o r y s u r p l u s z z q q z z q q so it appears to be treating those as individual single-letter tokens. strange.

Mon, Mar 27, 9:37 AM · Search
20after4 added a comment to T12443: Applying fulltext limits first causes missing results.

I think it would make a lot of sense to construct the two queries separately (and in parallel) with a short timeout, then handle the timeout gracefully allowing the user to refine their query further. This would avoid the denial of service situation which happened to Wikimedia more than once due to users repeatedly executing really expensive searches until mysql fell over from the load.

Mon, Mar 27, 8:43 AM · Restricted Project, Search, Bug Report
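
The idea in the comment above (run the two queries in parallel under a short timeout, then report the timeout instead of failing) might look roughly like the sketch below. The function name `run_with_timeout` and the query names are hypothetical, and this is not Phabricator code. Note that Python threads cannot be killed, so a real implementation would also need to cancel the slow query on the database side.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(queries, timeout_seconds):
    """Run each named query callable in parallel.

    Returns (results, timed_out): results maps names to return values for
    queries that finished in time; timed_out lists the names that did not.
    Hypothetical sketch; the worker threads keep running after a timeout.
    """
    results = {}
    timed_out = []
    pool = ThreadPoolExecutor(max_workers=max(1, len(queries)))
    futures = {name: pool.submit(fn) for name, fn in queries.items()}
    deadline = time.monotonic() + timeout_seconds
    for name, future in futures.items():
        # All queries share one deadline rather than getting a timeout each.
        remaining = max(0.0, deadline - time.monotonic())
        try:
            results[name] = future.result(timeout=remaining)
        except FutureTimeout:
            timed_out.append(name)
    pool.shutdown(wait=False)
    return results, timed_out
```

A caller could show whatever is in `results` and, if `timed_out` is not empty, ask the user to refine the query rather than retrying the expensive search.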

Sun, Mar 26

20after4 added a comment to T12450: New Search Configuration Errata.

I ran into a lot of confusion because the versioned object indexes are not namespaced per-service. Basically, if you insert version 95 of a document into Elastic, the indexer thinks that version 95 doesn't need to go into MySQL, even though it does. So when you run bin/search index ..., you may get only a subset of the updates you actually need. The object index versions need to change to become engine-aware so they are stored per-service, not globally, and/or the whole mechanism needs to include a hash of cluster.search or just be turned off. Until this is fixed, it can be worked around by using --force everywhere.

bin/search index might reasonably provide summary output about this ("392 documents were not indexed because they haven't changed, use --force to update them.").

Sun, Mar 26, 11:08 PM · Search
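
The per-service version tracking suggested in the comment above can be sketched as follows. The class `IndexVersions` and its method names are hypothetical and do not correspond to Phabricator's actual indexer; the sketch only shows why keying versions by service avoids the skipped-update problem.

```python
class IndexVersions:
    """Track the last indexed version of each object per search service.

    Hypothetical sketch: keying by (service, object_id) instead of by
    object_id alone means writing version 95 to one service does not
    hide the fact that another service still needs it.
    """

    def __init__(self):
        self._seen = {}  # (service, object_id) -> last indexed version

    def needs_index(self, service, object_id, version, force=False):
        # force=True mirrors the --force workaround: always reindex.
        return force or self._seen.get((service, object_id)) != version

    def mark_indexed(self, service, object_id, version):
        self._seen[(service, object_id)] = version


versions = IndexVersions()
versions.mark_indexed("elasticsearch", "T12450", 95)
print(versions.needs_index("elasticsearch", "T12450", 95))
print(versions.needs_index("mysql", "T12450", 95))
```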
20after4 added a comment to T12450: New Search Configuration Errata.

  • Searching for f*a*c*t*o*r*y*s*u*r*p*l*u*s*z*z*q*q*z*z*q*q produces nonsensical results (many results, when I would expect no results: the results do not contain that sequence of letters in order).
  • Searching for user fails to find task Grant users tokens when a mention is created, suggesting that stemming is not working.
  • Searching for users finds that task, but fails to find a task containing "per user per month" in a comment, also suggesting that stemming is not working.
  • Searching for maniphest fails to find task maniphest.query elephant, suggesting that tokenization in ElasticSearch is not as good as the MySQL tokenization for these words (see D17330).
  • Searching for users -blue returns a huge number of results: significantly more than users. Expected behavior: fewer results, omitting those results matching blue.
  • Searching for users blue returns more results than users or blue. Expected behavior: fewer results, because only results which match "users" AND "blue" are returned. The result set includes completely irrelevant results.
Sun, Mar 26, 10:59 PM · Search
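
The expected behavior in the last two bullets above (every plain term is required, a leading "-" excludes) can be stated as a small reference matcher. This is a hypothetical sketch for reasoning about result counts, not the query either engine runs, and it deliberately ignores stemming.

```python
def matches(query, document):
    """Return True if the document satisfies the query.

    Plain terms are ANDed together; a term prefixed with "-" must be absent.
    Hypothetical reference semantics only: exact word matching, no stemming.
    """
    words = set(document.lower().split())
    for term in query.lower().split():
        if term.startswith("-"):
            if term[1:] in words:
                return False
        elif term not in words:
            return False
    return True


docs = ["grant users tokens", "users like blue", "blue sky"]
for query in ("users", "users blue", "users -blue"):
    print(query, [doc for doc in docs if matches(query, doc)])
```

Under these semantics both "users blue" and "users -blue" can only return a subset of what "users" returns, which is the property the bullets say was violated.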
20after4 added a comment to T12450: New Search Configuration Errata.

@epriestley: Thanks for the detailed feedback... I'll get to work ;)

Sun, Mar 26, 10:56 PM · Search
20after4 added a comment to T12450: New Search Configuration Errata.
  • Has T8602 been resolved?

I can not reproduce it on wikimedia's install.

Sun, Mar 26, 12:42 PM · Search
20after4 added a comment to T12450: New Search Configuration Errata.
  • Write an "Upgrading: ..." guidance task with narrow instructions for installs that are upgrading.

TODO

  • Do we need to add an indexing activity (T11932) for installs with ElasticSearch?

Yes, I think so

  • We should more clearly detail exactly which versions of ElasticSearch are supported (for example, is ElasticSearch <2 no longer supported?). From T9893 it seems like we may only have supported ElasticSearch <2 before, so are the two regions of support totally nonoverlapping and all ElasticSearch users will need to upgrade?
Sun, Mar 26, 12:31 PM · Search
20after4 added a comment to T12450: New Search Configuration Errata.

I haven't been testing with elasticsearch < 2.0 so this might break backwards compatibility. It wouldn't be difficult to fix any compatibility issues though, with a tiny bit of testing.

Sun, Mar 26, 12:27 PM · Search
20after4 added a comment to T6552: Implement partial / wildcard searching (Elasticsearch).

With the elasticsearch 'simple_query_string' query parser it only works if you use *pricot, for example, outside of quoted phrases.

Sun, Mar 26, 12:23 PM · Elasticsearch, Search
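
For reference, a request body using the Elasticsearch `simple_query_string` query mentioned in the comment above could be built as in this sketch. The field names `title` and `body` and the helper `build_query` are assumptions for illustration; they are not taken from Phabricator's index mapping.

```python
import json


def build_query(user_query, fields=("title", "body")):
    """Build a search request body around a simple_query_string query.

    The field names are hypothetical placeholders. The wildcard term is
    passed through unquoted, as the comment describes.
    """
    return {
        "query": {
            "simple_query_string": {
                "query": user_query,
                "fields": list(fields),
                "default_operator": "and",
            }
        }
    }


print(json.dumps(build_query("*pricot"), indent=2))
```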
20after4 added inline comments to D17300: Add repositories to fulltext search index..
Sun, Mar 26, 12:22 PM · Diffusion, Search
20after4 added a comment to T5282: Provide documentation on setting up ElasticSearch.

Note there will finally be a little bit of documentation once this install rebuilds diviner docs: The url should be https://secure.phabricator.com/book/phabricator/article/cluster_search/ (eventually)

Sun, Mar 26, 12:13 PM · Elasticsearch, Documentation
20after4 closed T6552: Implement partial / wildcard searching (Elasticsearch) as "Resolved".

This should work just fine with the index mapping and query generation in rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config

Sun, Mar 26, 12:09 PM · Elasticsearch, Search
20after4 added a commit to T6552: Implement partial / wildcard searching (Elasticsearch): rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config.
Sun, Mar 26, 12:08 PM · Elasticsearch, Search
20after4 added a task to rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config: T6552: Implement partial / wildcard searching (Elasticsearch).
Sun, Mar 26, 12:08 PM
20after4 closed T9779: ./bin/search init error with elasticsearch 2.0, a subtask of T9893: Support ElasticSearch 2.0 - 5.1, as "Resolved".
Sun, Mar 26, 12:06 PM · Elasticsearch, Search
20after4 closed T9779: ./bin/search init error with elasticsearch 2.0 as "Resolved".
Sun, Mar 26, 12:06 PM · Elasticsearch
20after4 added a commit to T9779: ./bin/search init error with elasticsearch 2.0: rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config.
Sun, Mar 26, 12:06 PM · Elasticsearch
20after4 added a task to rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config: T9779: ./bin/search init error with elasticsearch 2.0.
Sun, Mar 26, 12:06 PM
20after4 added a revision to T9893: Support ElasticSearch 2.0 - 5.1: D17384: Support multiple fulltext search clusters with 'cluster.search' config.
Sun, Mar 26, 12:05 PM · Elasticsearch, Search
20after4 added a task to D17384: Support multiple fulltext search clusters with 'cluster.search' config: T9893: Support ElasticSearch 2.0 - 5.1.
Sun, Mar 26, 12:05 PM · Wikimedia, Clusters, Elasticsearch
20after4 added a commit to T9893: Support ElasticSearch 2.0 - 5.1: rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config.
Sun, Mar 26, 12:04 PM · Elasticsearch, Search
20after4 added a task to rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config: T9893: Support ElasticSearch 2.0 - 5.1.
Sun, Mar 26, 12:04 PM
20after4 renamed T9893: Support ElasticSearch 2.0 - 5.1 from "Support ElasticSearch 2.0" to "Support ElasticSearch 2.0 - 5.1".
Sun, Mar 26, 12:04 PM · Elasticsearch, Search
20after4 closed D15843: Prevent sending a header with unsafe characters [\r\n\0].
Sun, Mar 26, 12:03 PM
20after4 added inline comments to D17300: Add repositories to fulltext search index..
Sun, Mar 26, 11:53 AM · Diffusion, Search
20after4 updated the diff for D17300: Add repositories to fulltext search index..

resubmit with arc diff --config repository.callsign=P

Sun, Mar 26, 8:44 AM · Diffusion, Search
20after4 updated the diff for D17300: Add repositories to fulltext search index..

Addressed epriestley's feedback.

Sun, Mar 26, 8:42 AM · Diffusion, Search
20after4 updated the test plan for D17300: Add repositories to fulltext search index..
Sun, Mar 26, 8:28 AM · Diffusion, Search
20after4 committed rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config (authored by 20after4).
Support multiple fulltext search clusters with 'cluster.search' config
Sun, Mar 26, 8:16 AM
20after4 closed D17384: Support multiple fulltext search clusters with 'cluster.search' config by committing rPe41c25de5050: Support multiple fulltext search clusters with 'cluster.search' config.
Sun, Mar 26, 8:16 AM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.

try to get harbormaster to build (push to staging?)

Sun, Mar 26, 8:13 AM · Wikimedia, Clusters, Elasticsearch

Sat, Mar 25

20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.
  • actually, actually utilize the health monitoring...
Sat, Mar 25, 10:29 PM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.
  • Improved the status monitoring UI in config/cluster/search/
  • Actually utilize the health monitoring cache to avoid connecting to downed servers.
Sat, Mar 25, 10:27 PM · Wikimedia, Clusters, Elasticsearch
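
One way a health monitoring cache can keep clients away from downed servers is sketched below. The class `HealthCache`, its time-to-live parameter, and the host names are all hypothetical; the actual mechanism in D17384 may differ.

```python
import time


class HealthCache:
    """Remember hosts that recently failed and skip them for a while.

    Hypothetical sketch: a host marked down is skipped until ttl_seconds
    have passed, after which it becomes eligible for another attempt.
    """

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._down_until = {}  # host -> time at which it may be retried

    def mark_down(self, host):
        self._down_until[host] = self._clock() + self._ttl

    def mark_up(self, host):
        self._down_until.pop(host, None)

    def is_usable(self, host):
        return self._clock() >= self._down_until.get(host, float("-inf"))

    def pick_hosts(self, hosts):
        """Return the hosts worth connecting to, preserving order."""
        return [host for host in hosts if self.is_usable(host)]
```

The injectable `clock` argument exists only so the expiry behavior can be tested without sleeping.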

Thu, Mar 23

20after4 added a comment to D17384: Support multiple fulltext search clusters with 'cluster.search' config.

@epriestley sweet, I'll land this as soon as I see that you've merged to stable.

Thu, Mar 23, 9:25 PM · Wikimedia, Clusters, Elasticsearch
20after4 added a comment to T12438: Project tokenizer functions like "any()" and "not()" do not include descendants.

I can confirm that In Any: does not seem to include subprojects. I tried to make some sense of the way the project search functions work but it's pretty complicated.

Thu, Mar 23, 4:54 PM · Projects, Maniphest, Search, Feature Request
20after4 added a comment to D17384: Support multiple fulltext search clusters with 'cluster.search' config.

@epriestley: I think this is ready to land but I want to give you one more chance to change your mind.

Thu, Mar 23, 3:33 PM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.
  • Created diviner documentation: Cluster: Search
  • removed stray phlog
Thu, Mar 23, 3:29 PM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.
  • Fix searching relationships which I had inadvertently broken.
  • Better elasticsearch 2.x and 5.x support
  • more optimized query
Thu, Mar 23, 12:58 PM · Wikimedia, Clusters, Elasticsearch
20after4 added a comment to T12441: Implement "Phabricator Stories".

If there's any good content in this feature at all, why do I never see it reposted to Reddit or Facebook or Twitter? Are Reddit and Twitter just for old people now?

Thu, Mar 23, 3:33 AM
20after4 added a comment to T12003: Explain to users how fulltext queries are parsed and executed.

Elasticsearch has much better support for non-Latin language analysis. See https://www.elastic.co/guide/en/elasticsearch/guide/current/icu-tokenizer.html, which discusses their ability to properly tokenize Thai, Chinese and Japanese text.

Thu, Mar 23, 3:10 AM · Search
20after4 awarded rP9b92e56dfc03: Don't link "Dxxx" on Differential revision pages a Love token.
Thu, Mar 23, 2:57 AM

Wed, Mar 22

20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.

Fix method signature; un-final PhabricatorElasticFulltextStorageEngine

Wed, Mar 22, 11:59 PM · Wikimedia, Clusters, Elasticsearch
20after4 added inline comments to D17384: Support multiple fulltext search clusters with 'cluster.search' config.
Wed, Mar 22, 7:13 AM · Wikimedia, Clusters, Elasticsearch
20after4 abandoned D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.
Wed, Mar 22, 7:08 AM
20after4 added inline comments to D17384: Support multiple fulltext search clusters with 'cluster.search' config.
Wed, Mar 22, 7:07 AM · Wikimedia, Clusters, Elasticsearch
20after4 removed a dependency for D17384: Support multiple fulltext search clusters with 'cluster.search' config: D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.
Wed, Mar 22, 7:03 AM · Wikimedia, Clusters, Elasticsearch
20after4 removed a dependent revision for D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5: D17384: Support multiple fulltext search clusters with 'cluster.search' config.
Wed, Mar 22, 7:03 AM
20after4 added a comment to D17384: Support multiple fulltext search clusters with 'cluster.search' config.

Ok I think I've eliminated the problematic parts like indexing project slugs.

Wed, Mar 22, 7:03 AM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.

Get rid of static.

Wed, Mar 22, 7:00 AM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.

address review feedback that I hadn't gotten to yet.

Wed, Mar 22, 6:53 AM · Wikimedia, Clusters, Elasticsearch
20after4 added a comment to D17384: Support multiple fulltext search clusters with 'cluster.search' config.

Note: I'm not sure why harbormaster is failing?

Wed, Mar 22, 6:44 AM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.
  • Cleaned up the elastic query and added comments describing the purpose of the clauses
  • a couple of bugfixes found by further testing
Wed, Mar 22, 6:43 AM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.

Ok I've reworked this quite a bit and I may have messed up somewhere in the process.

Wed, Mar 22, 5:27 AM · Wikimedia, Clusters, Elasticsearch

Tue, Mar 21

20after4 added a comment to T12296: Improve Phacility repository import performance.

While the git ls-remote change isn't really motivated as a performance improvement, it does seem to have reduced CPU usage a measurable amount (deployed on the morning of 3/18), maybe 15%:

Tue, Mar 21, 2:19 PM · Diffusion, Ops, Phacility
20after4 added a comment to D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.

So I've done a bit more thinking about how to implement the changes to the engine class, especially with regards to any bits that are not wanted in the upstream but are desirable for wikimedia's implementation.

Tue, Mar 21, 2:11 PM
20after4 added a comment to D17384: Support multiple fulltext search clusters with 'cluster.search' config.

Just to make sure I haven't missed anything:

  • We currently write health checks but never read them, right? So there's no effect (other than the UI "Status" changing) when a service fails health checks? That seems fine for now, I just want to make sure I didn't miss a health check read somewhere.
Tue, Mar 21, 2:03 PM · Wikimedia, Clusters, Elasticsearch

Mon, Mar 20

20after4 added a comment to D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.

Does this require a full bin/search index for installs using Elastic? It looks like the index structure changes...

Mon, Mar 20, 5:10 PM
20after4 added inline comments to D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.
Mon, Mar 20, 4:29 PM
20after4 added a comment to T9530: Release Server / Workflow app / Future of Releeph.

@avivey: I'm somewhat interested in this. If you have any tips for getting it working locally, I would like to try it out and see if I can contribute anything towards a finished extension.

Mon, Mar 20, 12:32 PM · Restricted Project, Harbormaster, Releeph
20after4 updated the summary of D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.
Mon, Mar 20, 12:29 PM
20after4 added inline comments to D17384: Support multiple fulltext search clusters with 'cluster.search' config.
Mon, Mar 20, 12:26 PM · Wikimedia, Clusters, Elasticsearch
20after4 updated the diff for D17384: Support multiple fulltext search clusters with 'cluster.search' config.

rebased on top of D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5

Mon, Mar 20, 12:20 PM · Wikimedia, Clusters, Elasticsearch
20after4 added a dependency for D17384: Support multiple fulltext search clusters with 'cluster.search' config: D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.
Mon, Mar 20, 12:10 PM · Wikimedia, Clusters, Elasticsearch
20after4 added a dependent revision for D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5: D17384: Support multiple fulltext search clusters with 'cluster.search' config.
Mon, Mar 20, 12:10 PM
20after4 created D17509: Updated PhabricatorElasticFulltextStorageEngine for elasticsearch 5.
Mon, Mar 20, 12:09 PM
20after4 abandoned D16304: Fall back to parent tasks / subtasks when the task graph is big.
Mon, Mar 20, 12:05 PM