Ligatures in Programming Fonts: Hell No

source link: https://www.tuicool.com/articles/hit/Z7vqIbi

Ligatures in programming fonts: a misguided trend I was hoping would collapse under its own illogic. But it persists. Let me save you some time:

Ligatures in programming fonts are a terrible idea.

And not because I’m a purist or a grump. (Some days, but not today.) Programming code has special semantic considerations. Ligatures in programming fonts are likely to either misrepresent the meaning of the code, or cause miscues among readers. So in the end, even if they’re cute, the risk of error isn’t worth it.

First, what are ligatures? Ligatures are special characters in a font that combine two (or more) troublesome characters into one. For instance, in serifed text faces, the lowercase f often collides with the lowercase i and l. To fix this, the fi and fl are often combined into a single shape (what pros would call a glyph).

[Figure: the combinations fi, fj, fl, ffi, gg, and gy set three ways, labeled “ok”, “wrong”, and “right”; the “right” sample uses ligatures.]

In this type designer’s opinion, a good ligature doesn’t draw attention to itself: it simply resolves whatever collision would’ve happened. Ideally, you don’t even notice it’s there. Conversely, this is why I loathe the Th ligature that is the default in many Adobe fonts: it resolves nothing, and always draws attention to itself.

Ligatures in programming fonts follow a similar idea. But instead of fixing the odd troublesome combination, well-intentioned amateur ligaturists are adding dozens of new & strange ligatures. For instance, these come from Fira Code, a heavily ligatured spinoff of the open-source Fira Mono.

[Image: sample programming ligatures from Fira Code]

So what’s the problem with programming ligatures?

  1. They contradict Unicode. Unicode is a standardized system, used by all contemporary fonts, that identifies each character uniquely. This way, software programs don’t have to worry that things like the fi ligature might be stashed in some special place in the font. Instead, Unicode designates a unique name and number for each character, known as a code point. If you have an fi ligature in your font, you identify it with its designated Unicode code point, which is 0xFB01.

    In addition to alphabetic characters, Unicode assigns code points to hundreds of symbols. Many of the programming ligatures shown above are visually similar to existing Unicode symbols. So in a source file that uses Unicode characters, how would you know if you’re looking at a => ligature that’s shaped like ⇒ vs. Unicode character 0x21D2, which also looks like ⇒? The ligature introduces an ambiguity that wasn’t there before.

  2. They’re guaranteed to be wrong sometimes. There are a lot of ways for a given sequence of characters, like “=>”, to end up in a source file. Depending on context, it doesn’t always mean the same thing.

    The problem is that ligature substitution is “dumb” in the sense that it only considers whether certain characters appear in a certain order. It’s not aware of the semantic context. Therefore, any global ligature substitution is guaranteed to be semantically wrong part of the time.
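The “dumb” substitution described in point 2 can be sketched in a few lines of Python. This is an illustration under assumptions, not how a font or text-shaping engine is actually implemented: the `LIGATURES` table and the `render` function are hypothetical names, and Unicode symbols stand in for the ligature glyphs a font would draw. A real ligature changes only what is displayed, never the characters stored in the file; the sketch models the display.

```python
# Hypothetical ligature table: character sequence -> stand-in glyph.
LIGATURES = {
    "=>": "⇒",
    "!=": "≠",
    "<=": "≤",
}


def render(source: str) -> str:
    """Return what a reader would see: every listed sequence is replaced,
    wherever it occurs, with no knowledge of the surrounding syntax."""
    # Try longer sequences first so a longer match is never split.
    for seq in sorted(LIGATURES, key=len, reverse=True):
        source = source.replace(seq, LIGATURES[seq])
    return source


# An operator, a string literal, and a comment all contain a listed sequence.
line = 'if a <= b: print("use => for lambdas")  # a<=b'
print(render(line))

# A typed "=>" and a literal U+21D2 character become indistinguishable on screen.
print(render("x => y") == render("x ⇒ y"))
```

The comparison operator, the `=>` inside the string literal, and the `<=` in the comment all get the same treatment, because the rule sees character order and nothing else. The last line shows the ambiguity from point 1: two different files render identically.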

When we’re using a serifed text font in ordinary body text, we don’t have the same considerations. An fi ligature always means f followed by i. In that case, ligature substitution that ignores context doesn’t change the meaning.
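The difference between the two cases is visible in the code points themselves. The following is a small sketch using Python’s standard `unicodedata` module; the `describe` helper and the constant names are made up for this example. It lists the code points behind the fi ligature and the double arrow, and checks how Unicode’s compatibility normalization (NFKC) treats each one.

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # the fi ligature code point mentioned above
ARROW = "\u21d2"        # the double-arrow code point mentioned above


def describe(text: str) -> list[str]:
    """List each character's code point and official Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe(FI_LIGATURE))  # one code point
print(describe("fi"))         # two code points
print(describe(ARROW))        # one code point
print(describe("=>"))         # two code points

# NFKC normalization maps the fi ligature back to plain f + i ...
print(unicodedata.normalize("NFKC", FI_LIGATURE) == "fi")
# ... but it leaves the arrow alone: Unicode does not equate it with "=>".
print(unicodedata.normalize("NFKC", ARROW) == "=>")
```

In other words, Unicode itself records that the fi ligature is just another way of writing f followed by i, while the arrow and the two-character sequence `=>` are unrelated as far as the standard is concerned.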

Still, some typographic transformations in body text can be semantically wrong. For instance, foot and inch marks are often typed with the same characters as quotation marks. (See straight and curly quotes.) But whereas quotation marks want to be curly, foot and inch marks want to be straight (or slanted slightly to the upper right). So if we apply automatic smart (aka curly) quotes, we have to be careful not to capture foot and inch marks in the transformation.
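One way to picture the care this takes is a short Python sketch. The `smarten` function and its rule (a straight mark directly after a digit is treated as a foot or inch mark) are assumptions made for illustration, not a description of any particular smart-quote implementation.

```python
import re

# Matches a straight double or single quote.
QUOTE = re.compile("[\"']")


def smarten(text: str) -> str:
    """Curl quotation marks, but leave a mark that directly follows a digit
    straight, on the assumption that it is a foot or inch mark."""

    def curl(match: re.Match) -> str:
        mark = match.group()
        # Character just before the mark; treat start of text like a space.
        before = text[match.start() - 1] if match.start() > 0 else " "
        if before.isdigit():
            return mark  # foot or inch mark: keep it straight
        opening = before.isspace() or before in "([{"
        if mark == '"':
            return "“" if opening else "”"
        return "‘" if opening else "’"

    return QUOTE.sub(curl, text)


print(smarten('She said "the board is 6\' 2" long."'))
```

Even this rule is only a guess based on neighboring characters. A quotation that ends in a digit would keep a straight closing mark, so the transformation is still wrong some of the time unless something that understands the text makes the call.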

Does that mean programmers can never have nice things? It’s totally fine to redesign individual characters to distinguish them from others. For instance, in Triplicate, I include a special “Code” variant that includes redesigned versions of certain characters that are easily confused.

`$te_fl{1234*567~890} Regular
`$te_fl{1234*567~890} Code

But in this case, the point is disambiguation: we don’t want the lowercase l to look like the digit 1, nor the zero to look like a cap O. Whereas ligatures are going the opposite direction: making distinct characters appear to be others.

Bottom line: this isn’t a matter of taste. In programming code, every character in the file has a special semantic role to play. Therefore, any kind of “prettifying” that makes one character look like another (including ligatures) leads to a swamp of despair. If you don’t believe me, try it for 10 or 15 years.

Matthew Butterick
29 March 2019

