Page MenuHomePhabricator

Add an Emoji Typeahead

Authored by chad on Jan 24 2017, 5:50 PM.
Referenced Files
Unknown Object (File)
Sun, May 28, 5:30 PM
Unknown Object (File)
Sun, May 28, 6:10 AM
Unknown Object (File)
Sat, May 27, 5:54 PM
Unknown Object (File)
Sun, May 21, 9:21 AM
Unknown Object (File)
Sun, May 21, 12:06 AM
Unknown Object (File)
Sat, May 20, 9:44 PM
Unknown Object (File)
Fri, May 19, 2:16 AM
Unknown Object (File)
Thu, May 18, 5:40 PM
"100" token, awarded by avivey."Party Time" token, awarded by talshiri.



This adds a more complete emoji datasource, with a typeahead and autocomplete. It works by pulling in a raw datasource from EmojiOne (I chose Unicode 8, but they have a Unicode 9 datasource as well) and transforming it for speed/need. If we build more robustness or an actual picker into the Remarkup bar, having the additional keywords, etc, might be important. When Unicode 9 support is more prevalent, we should only need to update the single file.

Tossing up as a proof of concept on engineering direction. Also I can't quite get the autocomplete to complete.

Test Plan

Test UIExamples, Autocomplete, and TypeaheadSource

Diff Detail

rP Phabricator
Lint Not Applicable
Tests Not Applicable

Event Timeline


FWIW, this works fine since `NOTE: ' typically has a space after it.

Everything here works except Autocomplete pasting in the selection. I think there is more JS needed that I'm not quite able to figure out.


  • Are you planning to insert :boar: or πŸ—? Can you walk me through the decision in more detail?
  • You can make this work with setAutocomplete($whichever_one) on the PhabricatorTypeaheadResult. Either $emoji or $shortname seem to work for me locally.


  • How long does phutil_json_decode() on the manifest.json take?
  • Doesn't the typeahead need all the extra data, at least in a future diff?

Files on disk:

  • This adds two copies of the manifest, one in /emoji/ and one in /webroot/. I think we only need one?
  • If emoji_strategy.json is from EmojiOne, it should be in externals/emojione/ with a LICENSE / README (all third-party code/stuff should go in externals/).
    • The EmojiOne license requires attribution, which I think we should honor in spirit if we use data from the project: (that is, not just like a link in a README in externals/).
  • We probably don't need a new top-level /emoji/ directory. The generated manifest should go in resources/emoji/ or something, maybe?
  • Can we beautify the JSON in emoji_strategy.json, so future diffs are more readable?


  • Data seems OK for proof-of-concept but maybe we could find a better source?
  • The poop emoji lists "sol" as a keyword twice ("shit out of luck"?), as do "four leaf clover" and "x" (unclear why?).
  • No normal human-language names (like "Reversed Hand with Middle Finger Extended and Medium-Light Skin" instead of "reversed_hand_with_middle_finger_extended_tone2").
  • Aliases list doesn't seem great, e.g. "meow" and "kitten" hit nothing. These hit 🐱on iOS.
    • Not sure we want to ship "butt" for πŸ‘ or "penis" as a keyword for 🍌 and πŸ†, which this list contains.
    • "vagina" hits 5 emoji (tulip, taco, ...), while "pussy" doesn't hit 🐱.
    • I'm surprised their GitHub project doesn't have 30 issues about how critical and sexist and/or not-sexist this yet.
    • Generally, this list feels "okay" to me but not necessarily great.

Probably just get rid of this, or leave a more detailed note if you want?

Are you planning to insert πŸ— or πŸ—? Can you walk me through the decision in more detail?

I just want shortcuts for :bo to :boar, not insert direct emoji. I think that's what I'd expect as a user. Or do you have a preference?

I pulled the data from v2.2.0 of EmojiOne, I can't find another source of this data, but I can also just remove EmojiOne completely and use the generated manifest from it. The most up to date EmojiOne files are Unicode 9, and no browser I tested supported Unicode 9 yet, so those were all blank boxes. If we were using image support, then we'd be in the clear there.

Twitter is the other Open Source project for Emoji, but I can't find a shortcut list anywhere browsing their repository. I'm not sure, where did you get our initial list from? Ideally I just wanted a source, like font-awesome, I could pull and update twice a year with ease.

  • remove double emoji manifests
  • remove emoji_strategy

Yeah in giving a little thought, I don't think we'll need keywords or alias, so I removed all of that outright. Sample search from Slack:

pasted_file (368Γ—390 px, 24 KB)

I have a pretty strong preference for inserting the actual emoji, since I think inserting an English-language ASCII character codes creates a lot of complicated side effects:

  • Inserting the actual emoji is what I generally expect, or at least not a surprise -- that's what the emoji pickers in consumer software I'm familiar with (like iOS,, Facebook, and Twitter) do.
  • Inserting the actual emoji gives us a clean story on internationalization: we don't have to deal with :boar: vs :el-boaro: down the road. Only the typeahead picker needs to learn Spanish, which is easy since only one user is looking at it.
  • We aren't stuck with needing to keep :mountain_bicyclist_tone5: working forever. Although :cat: is sort of not awful, :mountain_bicyclist_tone5: feels super unwieldy to me, especially if you speak Chinese.
    • EmojiOne is interested in feedback on these ( which implies that they're reasonably likely to make shortname changes in the future.
    • Modifiers are problematic in particular since ZWJ sequences are quite complex (
      • I believe sequences like <WOMAN> <DARK SKIN TONE> <ZWJ> <HEART> <ZWJ> <KISS> <ZWJ> <WOMAN> <MEDIUM DARK SKIN TONE> are valid.
      • EmojiOne has kiss_mm and kiss_ww (but, oddly, no kiss_mw, or kiss_wm), and no kiss_woman_tone_5_woman_tone_4.
      • I don't think you can have "Light-Skinned Female Judge Kissing Dark-Skinned Hacker Cat" because the Unicode Consortium are right-wing sexist/racist fascists.
  • We don't have to worry about conflicts like :x:.:a:, etc. In particular, :a: is a valid component of an IPv6 IP address. I would guess that, in Phabricator, :a: is almost always some kind of technical text and almost never "Negative Squared Latin Capital Letter A / Blood Type A".
  • Inserting the actual emoji works with search indexing, commit messages, webhooks, API calls, etc., without building new types of remarkup rendering modes.

In contrast, inserting the actual emoji pretty much avoids all this stuff, I think, and all the downsides (like junk emoji on desktop Linux) are likely to get better in the future (with better emoji support from OSes/browsers/devices).

emoji pickers in consumer software

Some of these don't really insert actual real emoji, but they insert a picture of a boar in some form, not text that says "a picture of a boar".

Does dropping in real Emoji(tm) prevent us from transforming them via StickerPacks(c) in the future (going through Remarkup rules)?

I couldn't find json_decode on the xhprof profile, but the page seems quick?

pasted_file (2Γ—2 px, 418 KB)

Or I guess it's quickest if I just build it into a static PHP file?

Changed to insert emoji directly and it feels pretty nice. I might have to switch to Safari though to have better emoji

  • Switch to inserting Emoji over text

I feel I took the long path on a 25 line diff

I think the issue with sticker pack emoji (specifically, with using images instead of system emoji) is that you have three options for what you insert while the user is writing or editing text:

  • Insert a real emoji, but then it looks different from the emoji users will actually see when they send their message, which is probably confusing, especially if you're using image emoji over system emoji because many of your users don't have emoji support (it's 2014, or all your users use Gentoo, or whatever else).
  • Insert a picture, but you can't put a picture in a normal <textarea> -- the input needs to be a <div> with contentEditable (like Twitter and Facebook) which creates 95 million billion problems. So maybe you don't want to do this if you aren't Twitter or Facebook and/or otherwise don't have a Very Good Reason to switch to contentEditable.
  • Insert a text code that says "a picture of a cat", which is probably a little worse for users than a picture but not as confusing as two different pictures (especially if system emoji are garbage) and doesn't require all the JS magic to make contentEditable less buggy.

If we switch to image emoji later, we'll have this same problem, and might want to change the behavior to start inserting the text "a picture of a cat". However, inserting real emoji now doesn't prevent us from making this choice later, and I suspect we'd possibly want to store the cat emoji anyway even if we did switch, and just do a transform before giving it to the user for editing. Maybe.

Couple of minor technical things.


We have to keep these intact, as-is, because the two lists are different. For example, if we make this change (removing this map), :+1: will break in all existing text.

This map basically has to be here forever since we shipped it, unless we're OK breaking a bunch of stuff or want to try to do a migration to replace all the :+1: in stuff in the database with the real emoji.

But you can just safely get rid of this now since the remarkup rule and the emoji typeahead no longer interact at all.

1 β†—(On Diff #41474)

Let's put this in resources/emoji/ or something instead of webroot/? It doesn't need to be served to the browser.

This revision now requires changes to proceed.Jan 24 2017, 7:32 PM

(The existing :code: list isn't really "emoji support", it's "compatibility with Slack and GitHub markup for users familiar with those systems". The suggester is strictly better, but making the GitHub :code: emoji work is useful if you're copy/pasting from GitHub or Slack or whatever.)

I think just add that array to my generator and unique the output?

If we do, :mountain_bicyclist_tone5: (the literal text) will start working, and we'll be stuck with it forever (or have to break users when we change stuff). I think :mountain_bicyclist_tone5: should not work (since it does not work on GitHub or Slack, and is independently silly).

I also wouldn't necessarily mind breaking/removing PhabricatorEmojiRemarkupRule since I think working in a familiar way is more important than actually working exactly like GitHub/Slack (it's probably rare that anyone is copy/pasting text from them?), in favor of the autocomplete and saying "you can install it as an extension if you had a million :+1:s", but if we did want to do this we should give the suggester some time to stabilize first. Presumably once the suggester ships almost no one will be typing new :cat: codes anyway.

I was considering allowing you to extend :partyparrot: as an inline macro.

Yeah, if we actually want to do stickers -- as distinct from emoji -- I think our options for editing are just the other two things:

  • Insert some kind of text, like :sticker:.
  • Make everything contentEditable, although it still needs to boil down to :sticker: internally because of stuff like the API, plain text email, etc.

I don't think there's any technical reason not do to that, but maybe the best way forward would be:

  • Pick something other than :cat:, like, uh, &cat&? That's awful but I'm not sure if we can get away with (cat) without tons of annoyance. Maybe if it's just a remarkup button, not an autocomplete? Since I'm going to trigger an autocomplete on ( 50 million times every day when typing normal text. Or maybe [cat] or {cat} would be alright.
  • Then it doesn't conflict with emoji and you can autocomplete for it separately, maybe?
  • Seems kind of weird to me to have macros and stickers as two different things, since you could already do stickers by creating macros for them, except that they won't render inline. But inline macros/stickers probably look awful? Like, if you type "It's (partyparrot) time" that line is going to end up with a weird line-height and look awful, I think? Do other sticker systems let you put things inline?
  • If stickers can't be inline, we can just extend Macro to support stickers, using the opposite of the strategy for Tokens discussed in T11217 (tokens would be going from code-only to code + data, macros would be going from data-only to code + data).

And then we get into the whole "can you award emoji, macros, tokens, and stickers to a comment?" thing.

Maybe we do this:

  • Macro gets the reverse-tokens treatment and becomes code + data, so we can have data-defined macro packs, including sticker packs (but they could also include "best ship-it-quick gifs" or whatever -- they're just a collection of valid macros, not necessarily all small ones).
  • Macros and stickers share a namespace, but data-defined packs have a prefix. Like a fox sticker pack might define macros called fox/sleep, fox/happy, whatever. Normal macros can't, so they won't conflict.
  • You can autocomplete macros and stickers with $ (for "$ticker"?), but it just inserts "newline, sticker {{{ shipitquick }}}, newline" or something when you autocomplete it.
  • Some day there's a fancier mouse-driven tabbed selector dialog thing, too.

Then emoticons (like ":)") don't convert into stickers, but they don't convert in Messenger either, and would look bad inline (I think?) so this seems fine?

what does twitch do? Slack lets you intermix stickers / emoji

Well I guess Slack they are "custom emoji", that's all I was thinking of.

chad edited edge metadata.
chad marked an inline comment as done.
  • move file, add back old emoji

sorry I was shopping at the grocery store you know like for food

This revision is now accepted and ready to land.Jan 24 2017, 9:11 PM
This revision was automatically updated to reflect the committed changes.