Ligatures in Programming Fonts: Hell No

Ligatures in programming fonts—a misguided trend I was hoping would collapse under its own illogic. But it persists. Let me save you some time—

Ligatures in programming fonts are a terrible idea.

And not because I’m a purist or a grump. (Some days, but not today.) Programming code has special semantic considerations. Ligatures in programming fonts are likely to either misrepresent the meaning of the code, or cause miscues among readers. So in the end, even if they’re cute, the risk of error isn’t worth it.

First, what are ligatures?

Ligatures are special characters in a font that combine two (or more) troublesome characters into one. For instance, in serifed text faces, the lowercase f often collides with the lowercase i and l. To fix this, the fi and fl are often combined into a single shape (what pros would call a glyph).

f i f j f l f f i g g g y (ok)
f i f j f l f f i g g g y (wrong)
fi fj fl ffi gg gy (right)

In this type designer’s opinion, a good ligature doesn’t draw attention to itself: it simply resolves whatever collision would’ve happened. Ideally, you don’t even notice it’s there. Conversely, this is why I loathe the Th ligature that is the default in many Adobe fonts: it resolves nothing, and always draws attention to itself.

Ligatures in programming fonts follow a similar idea. But instead of fixing the odd troublesome combination, well-intentioned amateur ligaturists are adding dozens of new & strange ligatures. For instance, these come from Fira Code, a heavily ligatured spinoff of the open-source Fira Mono.

[Image: sample of Fira Code ligatures]

So what’s the problem with programming ligatures?

They contradict Unicode. Unicode is a standardized system, used by all contemporary fonts, that identifies each character uniquely. This way, software programs don’t have to worry that things like the fi ligature might be stashed in some special place in the font. Instead, Unicode designates a unique name and number for each character, known as a code point. If you have an fi ligature in your font, you identify it with its designated Unicode code point, which is 0xFB01.
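
To make the code-point idea concrete, here is a minimal Python sketch using the standard `unicodedata` module. It looks up the number and name that Unicode assigns to the fi ligature. The helper name `describe` is my own, chosen only for illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


fi_ligature = "\ufb01"  # the single-character fi ligature, code point 0xFB01

print(describe(fi_ligature))  # U+FB01 LATIN SMALL LIGATURE FI

# The ligature is one character; the letters it stands for are two.
print(len(fi_ligature), len("fi"))  # 1 2
```

Note that the ligature character and the two-letter sequence are different data, even though a reader sees the same word.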

In addition to alphabetic characters, Unicode assigns code points to hundreds of symbols. Many of the programming ligatures shown above are visually similar to existing Unicode symbols. So in a source file that uses Unicode characters, how would you know if you’re looking at a => ligature that’s shaped like ⇒ vs. Unicode character 0x21D2, which also looks like ⇒? The ligature introduces an ambiguity that wasn’t there before.
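
Seen from the program's side, the two cases are not alike at all. In this small Python sketch, the strings hold different code points even though a ligature font may draw them identically. The variable names `arrow_typed` and `arrow_symbol` are illustrative only.

```python
import unicodedata

arrow_typed = "=>"       # two ASCII characters, as typed in most source code
arrow_symbol = "\u21d2"  # one character, Unicode code point 0x21D2

for label, text in (("typed", arrow_typed), ("symbol", arrow_symbol)):
    names = [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]
    print(label, len(text), names)

# Software compares code points, not shapes on screen.
print(arrow_typed == arrow_symbol)  # False
```

The reader sees one arrow either way; the file contains either two characters or one.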

They’re guaranteed to be wrong sometimes. There are a lot of ways for a given sequence of characters, like “=>”, to end up in a source file. Depending on context, it doesn’t always mean the same thing.

The problem is that ligature substitution is “dumb” in the sense that it only considers whether certain characters appear in a certain order. It’s not aware of the semantic context. Therefore, any global ligature substitution is guaranteed to be semantically wrong part of the time.
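
As a rough model of that behavior, consider this Python sketch. It is a plain string replacement, not how a font's substitution is actually implemented, and the function name `apply_ligature` and the sample line are made up for illustration.

```python
def apply_ligature(text: str) -> str:
    """Model a context-blind ligature: every '=>' is drawn as an arrow."""
    return text.replace("=>", "\u21d2")


# Hypothetical line of source: one '=>' is an operator, the other is
# literal text inside a string.
line = 'handler = x => log("use => for lambdas")'

print(apply_ligature(line))
# handler = x ⇒ log("use ⇒ for lambdas")
```

Both occurrences change shape, although only the first is an operator. The rule can't tell them apart because it sees only the character sequence.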

When we’re using a serifed text font in ordinary body text, we don’t have the same considerations. An fi ligature always means f followed by i. In that case, ligature substitution that ignores context doesn’t change the meaning.

Still, some typographic transformations in body text can be semantically wrong. For instance, foot and inch marks are often typed with the same characters as quotation marks. (See straight and curly quotes.) But whereas quotation marks want to be curly, foot and inch marks want to be straight (or slanted slightly to the upper right). So if we apply automatic smart (aka curly) quotes, we have to be careful not to capture foot and inch marks in the transformation.
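
Here is a deliberately naive Python sketch of that failure. It is not any real word processor's algorithm: the function `naive_smart_quotes` is hypothetical, and it decides how to curl each quote by looking only at the character before it.

```python
import re


def naive_smart_quotes(text: str) -> str:
    """Curl every straight quote based only on the preceding character."""
    # A quote at the start or after whitespace opens; any other quote closes.
    text = re.sub(r'(^|\s)"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    text = re.sub(r"(^|\s)'", "\\1\u2018", text)
    return text.replace("'", "\u2019")


print(naive_smart_quotes('He said "hi" and stood 6\' 2" tall.'))
# He said “hi” and stood 6’ 2” tall.
```

The quotation marks come out right, but the foot and inch marks get curled too, which is wrong.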

Does that mean programmers can never have nice things?

It’s totally fine to redesign individual characters to distinguish them from others. For instance, in Triplicate, I include a special “Code” variant that includes redesigned versions of certain characters that are easily confused.

`$te_fl{1234*567~890} (Regular)
`$te_fl{1234*567~890} (Code)

But in this case, the point is disambiguation: we don’t want the lowercase l to look like the digit 1, nor the zero to look like a cap O. Whereas ligatures are going the opposite direction: making distinct characters appear to be others.

Bottom line: this isn’t a matter of taste. In programming code, every character in the file has a special semantic role to play. Therefore, any kind of “prettifying” that makes one character look like another, ligatures included, leads to a swamp of despair. If you don’t believe me, try it for 10 or 15 years.

—Matthew Butterick
29 March 2019
