Shady Characters – The secret life of punctuation

Face with Tears of Joy, revealed!

Cover for "Face with Tears of Joy", showing the face with tears of joy emoji throughout emoji history — Cover of Face with Tears of Joy. (W. W. Norton, 2025.)

April 11, 2025

Please allow me to introduce the cover of my next book, Face with Tears of Joy: A Natural History of Emoji!

‘😂’ will be published in July this year, and you can pre-order the paperback in the USA from Norton, Amazon.com, Barnes & Noble, Books A Million, Bookshop.org, Hudson, Powell’s, or Walmart. In the rest of the world, order from Amazon.co.uk, Bookshop.org or Waterstones.

If e-books are more your style, you can pre-order for your Nook at Barnes & Noble; for your Kobo at Kobo.com; for your iPhone or iPad at Apple Books; or for your Kindle at Amazon (USA) or Amazon (UK).

I’ve said it before, but I really enojyed writing this one. It’s a sequel, almost, to Shady Characters, focusing as it does on an intriguing collection of symbols which dance around the written word without ever really becoming part of it. Emoji carry echoes of cuneiform, kanji, manga, and even the interrobang, and if any of those subjects pique your interest — or even just emoji themselves! — then please consider pre-ordering a copy.*

*

On a side note, I never used to understand why my agent, Laurie Abkemeier, always encouraged me to encourage you to pre-order my books. Isn’t a pre-order just the same as a order placed after publication? Well, it turns out not. Pre-orders make the publishing world go round: they inform print run sizes, they help bookshops make sure they have the right stock, and they can influence bestseller charts. (Okay, that last one is more a “me” thing than a “you” thing.) They can even make sure that you will actually be able to read the book, since pre-orders are fulfilled before normal orders.

All this is to say, if you have even a passing interest in emoji, please pre-order the book! It would mean a lot to me, personally, and my editor will thank you too. ↢

Miscellany № 108: emoji, for all the wrong reasons

April 4, 2025

Emoji have been in the news recently for a host of reasons, most of them bad — but all of them, I would submit, worthy of our attention.

First up is a Netflix series called Adolescence that has been garnering plaudits in the UK and elsewhere since it went on air last month. I won’t spoil the plot, but I note that Emojipedia has joined the clamour with a blog post titled “Netflix’s ‘Adolescence’, Emoji Codes & Emoji Repurposing”. In it, Keith Broni explores the programme’s use of an “emoji code” by so-called incels — a misogynistic online culture of men who blame women for their lack of sexual success — in which, for instance, ‘💯’ refers to the belief that 20% of men attract 80% of women. Adolescence moots other codes too, such as ‘🔵’ and ‘🔴’ to represent the blue and red pills made famous by The Matrix, the Wachowski sisters’ dystopian action film, and which in turn relate to being “blue pilled” or “red pilled” — either accepting of mainstream views or “seeing through” them to some alleged hidden truth.

Despite the mild moral panic that has accompanied this use of emoji, Broni makes the point (echoed in Face with Tears of Joy) that in-groups often create temporary codes, or vocabularies, which take everyday words or symbols and give them new meanings. Emoji happen to be particularly ripe for this sort of jargon because they can have multiple interpretations based on the object or action being depicted, the name of that object or action, or metaphorical interpretations of either one.

I haven’t watched Adolescence yet (and to be honest, it does not seem like it will be an easy watch), but when I do I’ll be interested to see how this creative yet dispiriting use of emoji is handled.

Emoji only work because of the efforts of the Unicode Consortium, the body which governs text on the web and beyond. One of Unicode’s responsibilities is to define the technical mechanisms by which computerised text is encoded — that is, turned into bits and bytes — for storage or transmission. This has inevitably given rise to many special rules that address one nuance or another of written language.

Emoji have a number of such nuances. One in particular is the use of what are called “variation selectors”, which are invisible characters that can be used to choose between monochrome and colour versions of a given emoji. (For example, ‘😄︎’ versus ‘😄’.) There are lots of variation selectors in Unicode, but most are unused and computer programs generally ignore them unless they can do something useful with them.

So far, so innocuous. However, a software engineer named Paul Butler has found that it is possible to “smuggle” hidden meanings in emoji using long strings of unused variation selectors. Such hidden messages are not necessarily bad for one’s computer (and as Paul discovers, you don’t even need emoji to create them), but his blog post on the subject makes fascinating reading for anyone even tangentially interested in how computers process text.

A week is a long time in politics, as they say, and this week has been especially long. If you can, though, cast your mind back to the scandal that erupted barely a fortnight ago in which US Secretary of Defense, Pete Hegseth, shared what appeared to be extremely sensitive military secrets in a group chat to which the editor of The Atlantic magazine had been unwittingly added.

If this wasn’t enough to be grappling with, it emerged that Hegseth’s correspondents cheered his news of attacks on Houthi insurgents with strings of emoji: Michael Waltz, an advisor to the Trump administration, posted “👊🇺🇸🔥”, while others expressed their gratitude with “prayer hands”, or ‘🙏’.

Honestly, I can’t even. I like emoji — I wrote a book about them! — yet even I can see that using them to celebrate a series of airstrikes is beyond the pale. It is emoji’s world now; we just live in it.

Miscellany № 107: old books (and a new Book)

Paperback cover of "The Book", showing the title on a cream band against a dark green background. — The Book paperback. (W. W. Norton, 2025.)

March 29, 2025

Some big news today: The Book: A Cover-to-Cover Exploration of the Most Powerful Object of our Time will be published soon in paperback!

The paperback edition is updated from the hardback, with all of the various errata fixed and a number of changes made to account for the book’s paperback rather than hardcover construction.* If you remember, the hardback edition had a number of references to its own physical form (“Tip the book towards you and look at the spine”, and so on) which have now been modified accordingly. Of course, the paperback edition provides a new and different set of reference points, so if some passages now ask you to find a hardback on your bookshelf, others can now tell you to look again at the paperback in your hands.

The paperback is currently available to pre-order from all the usual places.
Read more →

*: I may have spooked the printer just a little by hunting down their contact info and emailing to ask for details of the paperback’s construction. Their email to W. W. Norton with the requisite information had a slightly hunted air about it. Leave us alone, rogue author! ↢

Backing the backslash

March 14, 2025

I’m almost ashamed to say it, but I never really gave the backslash a second thought.

The backslash’s forward-leaning counterpart is everywhere, especially in computing. It lives in network and web addresses such as https://shadycharacters.co.uk; in file paths, such as /home/keith; and it introduces human-readable “comments” in any number of programming languages, often like /* this */ or // this.

But the backslash? It’s a rarer species. Windows users might recognise it from the command prompt, where it occurs in file paths such as C:\Users\Keith (although Windows does also recognise Unix-style forward slashes). The backslash is also sometimes used to “escape”, or neutralise, characters that would otherwise have some special meaning in a given context. For example, if an ampersand has some special meaning in a particular type of file, prefixing it with a backslash (\&) will often cause it to be treated as a normal ampersand instead. (Ironically, in writing that last sentence I had to escape the backslash character itself, \\, since it has a special meaning in the software that runs shadycharacters.co.uk.)

It was with interest, then, that I read a post on Mastodon* from a user named Modulux on the subject of the backslash:

Friend of mine was commenting that the origins of the backslash character are unclear. It was included in the teletype character set and IBM put it on ASCII for that reason but it is not known what it was for. Searching around I found some unreliable info that it was used in typography from the 16th century in order to represent line or paragraph breaks. Can anyone confirm or disconfirm, preferably with source?

To set one thing straight, the backslash wasn’t historically used for line or paragraph breaks. Instead, that was one of the functions of the forward slash, which, once upon a time, had an important role in punctuating texts: in the medieval and early modern periods, a forward slash could be used to indicate a pause of a lesser or greater length, anywhere between a comma and a paragraph depending on its user. It was often called a virgula or virgule, and it gave rise to the modern comma. It is covered along with many other such marks in the Shady Characters book.

Virgules in the prologue to William Caxton's edition of Virgil's Eneydos — Virgules in the prologue to William Caxton’s edition of Virgil’s Eneydos. (CC-BY 2.0 image courtesy of the University of Glasgow.)

Historical wobbles aside, on the question of where the backslash, or ‘\’, had come from, a lively discussion ensued. Stewart Russell responded:

According to Mackenzie (“Coded Character Sets, History and Development”) it seems to have appeared in the IBM Stretch design proposal, sometime in the mid 1950s (even though the first Stretch machine was delivered to Los Alamos in 1961) – see the chapter “Early Codes”, around page 80. The first name given to it was “Reverse Divide”.

It was also used in the text representation of ALGOL, with /\ for “∧” (AND) and \/ for “∨” (OR).

I had picked up a copy of Charles Mackenzie’s Coded-character Sets: History and Development during the writing of Shady Characters (never say writing isn’t a glamorous occupation), so I looked up Stewart’s reference. And indeed, the modern backslash does seem to have appeared first in IBM’s so-called Stretch character set,1 as used in an eponymous computer delivered to Los Alamos National Laboratory† in 1961.2

I talk more about character sets in my upcoming book, Face with Tears of Joy, but the long and short of it is that character sets are lexicons of individual characters known to one computer or another. To communicate successfully, any two computers must agree on which character set to use. Nowadays, that is a very simple negotiation, since effectively all computers use the all-encompassing Unicode character set. Back in the ’50s, however, things were different — electronic computers were still new, and many of them boasted their own specialised character sets. IBM’s Stretch computer was a case in point.

Why, though, had IBM chosen to incorporate a backslash in this new character set? Even before the Stretch computer had been delivered to Los Alamos, an influential IBMer named Bob Bemer3 had written to the Association of Computing Machinery, the industry body for America’s computer manufacturers and computer scientists, to make the case for developing a standard character set.4 His proposal kickstarted a drive towards just such a standard, and eventually resulted in the creation of ASCII — the American Standard Code for Information Interchange, a forerunner of Unicode, and whose name you may well have heard before.

Yet Bemer also had his day job at IBM, and one of his tasks there was to figure out which characters should be available on the as-yet unfinished Stretch computer.5 Enter the backslash — or rather, enter the “reverse divide” character, which is what Bemer had called it in his 1959 paper for the ACM.4

Before his death in 2004, Bemer wrote an short article entitled “How ASCII Got its Backslash” on, well, how ASCII got its backslash, and it does not have a lot to do with division, reverse or otherwise. Bemer recalls that he wanted to be able to write certain logical expressions in a programming language called Algol.6 At the time, Algol was still very new, though it had already been anointed by the ACM as its standard language for computer algorithms. Among other things, Algol allowed programmers to use binary logic, in which values could be combined and manipulated using operations called AND, OR and NOT. The first of those two operations were typically written using A- and V-shaped characters (‘∧’ and ‘∨’ respectively), and Algol adopted that same convention.7

The problem, as Bemer saw it, was that extant character sets had no way to render those crucial Boolean operations. Being all too aware of the limited memory available to early computers, Bemer realised that with the addition of a single new character — ‘\’ — he could write Algol’s AND and OR operators by combining it with the ‘/’ already present in many other character sets: /\, \/.6 Thus the backslash appeared, first for use by programs running on the Stretch computer and later in ASCII itself.

As to the name of the ‘\’; well, it’s all a little unsatisfying. In his earliest published writing on the subject, that 1959 article for the ACM, Bemer calls the backslash the “reverse divide”.4 His reasoning is obvious enough — ‘\’ is the mirror image of ‘/’, which is commonly used to indicate mathematical division — although he never spelled it out as such. “Reverse slant” is another early favourite, again for fairly obvious reasons, cropping up in a contemporary account of the IBM Stretch project8 and going on to eclipse “reverse divide” in popularity. The existence of two different names would suggest that ‘\’ had no commonly-held name at the time, and that a new one had had to be invented.

Yet neither name would last very long. Soon after Bemer minted the ‘\’, some anonymous programmer decided that “reverse slant” and “reverse divide” were too wordy, or too obscure, and came up with a pithier name for it. Born in the ’50s as the “reverse divide”, in the ’60s the ‘\’ was reinvented as the “backslash”‡ (who wasn’t reinventing themselves in that decade?), and the name has stuck ever since.

Except, except! That is not the whole story. The backslash, it turn out, is a few decades older than it seems — or rather, it has a doppelganger from an earlier technological era. There’s an intriguing story behind that one too, but it will have to wait for another time!

1.: MacKenzie, Charles E. “Early Codes : The Stretch Code”. In Coded Character Sets: History and Development, 67-75. Addison-Wesley Pub (Sd), 1980.

↢
2.: CHM Revolution. “IBM Stretches Its Capabilities”. Accessed March 13, 2025.

↢
3.: Hardison, Erica. “Robert W. Bemer”. IEEE Computer Society.

↢
4.: Bemer, R. W. “A Proposal for a Generalized Card Code for 256 Characters”. Commun. ACM 2, no. 9 (September 1, 1959): 19–23. https://doi.org/10.1145/368424.368435.

↢
5.: Bemer, Bob. A Story of ASCII.

↢
6.: Bemer, Bob. “How ASCII Got Its Backslash”.

↢
7.: Perlis, A. J., and K. Samelson. “Report on the Algorithmic Language ALGOL the ACM Committee on Programming Languages and the GAMM Committee on Programming”. Numerische Mathematik 1, no. 1 (December 1959): 41-60. https://doi.org/10.1007/BF01386372.

↢
8.: Ballance, Richard S., and Werner Buchholz. “Chapter 6: Character Set”. In Planning a computer system; Project Stretch. New York: McGraw-Hill, 1962.

↢

*: This is a convenient simplification: Modulux posted on a personal server compatible with Mastodon but not actually running Mastodon itself — so although I first read Modulux’s post on Mastodon, it did not originate there. This is the strength of the so-called fediverse, where anyone can run a social network and connect it, or “federate”, to other networks. It’s a bit like email — you can use one of the big providers such as Gmail or Apple iCloud, or you can run your own personal mail server. It is a brave new world, and a welcome one, too. Shady Characters also lives there, at mastodon.social/@shadychars. ↢
†: Nuclear energy once again! This is turning into a theme. ↢
‡: Google Ngrams shows 1967 as the inflection point. ↢

Miscellany № 106: pistols, punctuation and print

February 28, 2025

I came across a post last July on Emojipedia, in which Keith Broni* noted that Twitter, or X, had redesigned its PISTOL emoji. PISTOL had always been controversial: most online platforms started off with PISTOLs drawn as realistic firearms, but, over the course of the mid-2010s, most of them moved to toylike renderings of water pistols instead. Twitter had followed suit.

Last year, though, as Broni reported, Twitter’s water pistol was redrawn to show a realistic, if simplified, handgun. There doesn’t seem to have been a great deal of fanfare from Twitter itself, or any formal justification for why the change was made. Which is, perhaps, not a surprise, since Elon Musk fired Twitter’s PR team soon after he became CEO in 2022. There’s no-one left to announce anything.

Ordinarily, I would have dismissed this news as relatively small beer. Emoji are redesigned all the time, even if usually not in quite such a radical fashion. Moreover, Elon Musk has always been a contrarian, and his increasingly hard-right rhetoric† often seemed calculated more to enrage than to persuade — and was, as such, easy to ignore.

Now, though, as I witness the political turmoil gripping the USA, and I bear in mind that Elon Musk is one of its fiercest cheerleaders, I wonder if that small but significant change to a single emoji is more important than I once thought. When it’s published this July, Face With Tears of Joy will talk about how emoji can change appearances and meanings — and indeed, it takes the PISTOL as a case study — but Twitter’s mean-spirited update to ‘🔫’ is almost painfully relevant. The renewed presence of a realistic pistol emoji on an ailing but still relevant social network seems in hindsight to have been a harbinger of something much worse coming down the line.

In an attempt to lighten the mood, may I present a short extract from Yuyn Li’s short story, “Apostrophe’s Dream”, part of a collection called A Cage Went in Search of a Bird:

colon: Ah, yes, friends, let me remind you all: we have been discussing the pressing issue of relevance. We are, unlike letters and numbers, increasingly facing a fate of being misused, abused, and worse, rejected as being superfluous.

comma: Tell that to my Oxford cousin—he’s the most unflappable creature but that doesn’t help in his case. Half—no, more than half—of the world don’t even know of his existence these days.

Yes: this is indeed a conversation between punctuation marks, and I am not spoiling it too greatly if I tell you to expect the ellipsis, exclamation mark, period and other marks to weigh in too. Read the full story at The Dial!

Lastly, podcast fans should listen to this recent episode of Grammar Girl, entitled “Cancellation”. Mignon Fogarty, the host, explains why the American spellings of “cancellation” and “canceled” use a double and single l respectively, which is interesting enough in itself, but do hang around for the second part of the show, in which Mignon reads a short essay by Glenn Fleishman on the origins of the term “fine print”. It’s a fact-filled summary of more than four centuries of printing history.

Enjoy!

*: From one Keith to another, 🤜🤛. ↢
†: Rhetoric which has caused me and many others to shut down our Twitter accounts either temporarily or permanently. If you’d like to follow Shady Chracters on social media, please see the alternative links in the colophon. ↢