I write the iOS client for Voxer. We are the top social networking/communication app for iOS and Android (well, these things fluctuate, but we’re always in the top 5). We help people communicate with friends, family, and associates, individually or in group chats. People like to communicate in lots of different ways, so we allow you to share using voice, pictures, or text. Our application is available in 50+ countries, so we try to do all of this in as internationally friendly a way as possible. For text, that means using Unicode as our encoding standard.
One of the great things about human communication is that we keep finding so many new ways to do it. One medium is emoji: little picture characters you can embed in your text. The pictures include everything from smiley faces to buildings to piles of poo, and they really do help liven up casual communication.
One of the great things about Unicode is that it has lots of room to grow, and the Unicode Consortium regularly adds new things to the standard. As part of Unicode 6.0 they added code points for emoji. This was great because previously there were multiple incompatible encodings for emoji characters, and now there was a single standard one.
One of the great things about Apple is that when they see a standard that makes sense, they adopt it early and decisively. When iOS 5 was released back in 2011, it included full support for Unicode 6.0 and started using the new standard emoji code points rather than the old, incompatible, non-standard ones.
It was at this point that the pile of poo hit the fan: emoji sent from iOS 5 would not show up in our app any more. While the rest of the technological world had gotten on the Unicode bandwagon (sometimes kicking and screaming), the JavaScript world had not. The JavaScript standard still mandated an internal representation of text that could not express all of Unicode. In fact, it could only encode a little more than 65,000 characters, the range known as the Basic Multilingual Plane (BMP). The technology on which our servers are based uses JavaScript and, as such, mangles any non-BMP characters because it has no way to represent them. Grrr.
[ Before you start mocking node.js or JavaScript for its lack of full Unicode support, please note that I really wanted to insert an emoji frowny face at this part of the narrative—but I can't. You see, MySQL, the DB which backs this blog, didn't become fully Unicode compliant until version 5.5 which came out in December of 2010. My provider is using a pre-5.5 version. ]
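To make the BMP limit concrete, here is a minimal sketch in Python (used only for illustration; it is not part of our stack, and the helper names are made up for this post). It assumes nothing beyond the standard library. It shows that the pile of poo, U+1F4A9, sits above the last BMP code point, U+FFFF, so it needs two 16-bit units (a surrogate pair) in UTF-16 and four bytes in UTF-8. Any system that budgets one 16-bit unit, or at most three UTF-8 bytes, per character has nowhere to put it.

```python
# The emoji is written as an escape so that no non-BMP character
# has to appear in the source text itself.
PILE_OF_POO = "\U0001F4A9"

BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def is_bmp(ch):
    """Return True if the single character ch fits in the BMP."""
    return ord(ch) <= BMP_MAX


def utf16_code_units(ch):
    """Return the 16-bit code units Python's UTF-16 encoder produces for ch."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def surrogate_pair(code_point):
    """Compute the UTF-16 surrogate pair for a non-BMP code point by hand."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP have a surrogate pair")
    offset = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


if __name__ == "__main__":
    print(is_bmp("A"))                                           # an ordinary BMP character
    print(is_bmp(PILE_OF_POO))                                   # outside the BMP
    print([hex(u) for u in utf16_code_units(PILE_OF_POO)])       # two code units
    print([hex(u) for u in surrogate_pair(ord(PILE_OF_POO))])    # same pair, computed by hand
    print(len(PILE_OF_POO.encode("utf-8")))                      # byte count in UTF-8
```

The hand-rolled `surrogate_pair` is there only to show the arithmetic; in real code you would let the platform's encoder do this.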
So anyway, earlier this year I wrote a really detailed email at work explaining the issue to all of our folks who, it turns out, are not big text geeks like I am. The email spelled out the history of Unicode, described how text encodings work in layman’s terms, and then gave details about the shortcomings of our server technology in this area. Well, by an amazing piece of luck, our CTO ran into Brendan Eich at a conference. Brendan is the guy who invented JavaScript. Lots of talking happened, emails were exchanged, tweets happened, danders were raised, and bugs were filed. It seems that the JavaScript community had been aware of this problem for quite a while, but had never really had much reason to care because, until recently, there were almost no characters outside of the BMP.
As of Unicode 6 there are a lot. There will be a lot more. Pretty much every new character created for Chinese, Korean, Japanese, and lots of other scripts, will go into the extra-BMP space. Plus, you know, you gotta be able to send smiley faces and piles of poo.
I’m happy to say that JavaScript is changing. The technical committee that manages the evolution of ECMAScript, the standard on which JavaScript is based, has said that full Unicode support will be part of ES 6. V8, the JavaScript engine on which node.js is built, has already been updated, and version 0.7 of node.js will include this shiny new V8. Soon all of our users will be able to send each other emoji birthday cakes through Voxer again.
If you ever think that you’re just a small voice in a big industry, or that big giant international technology consortia don’t care about the needs of users, well, you’re still mostly right. But sometimes, just sometimes, everything works out the way that it should, and smart people work really hard to fix big problems on short schedules, and it makes the world a better place for all of us.
Apathy gets you nowhere.