Ligatures in Programming Fonts: Hell No
source link: https://www.tuicool.com/articles/hit/Z7vqIbi
Ligatures in programming fonts—a misguided trend I was hoping would collapse under its own illogic. But it persists. Let me save you some time—
Ligatures in programming fonts are a terrible idea.
And not because I’m a purist or a grump. (Some days, but not today.) Programming code has special semantic considerations. Ligatures in programming fonts are likely to either misrepresent the meaning of the code, or cause miscues among readers. So in the end, even if they’re cute, the risk of error isn’t worth it.
First, what are ligatures? Ligatures are special characters in a font that combine two (or more) troublesome characters into one. For instance, in serifed text faces, the lowercase f often collides with the lowercase i and l. To fix this, the fi and fl are often combined into a single shape (what pros would call a glyph).
[Figure: the sample “fi fj fl ffi gg gy” set three ways, labeled ok, wrong, and right.]
In this type designer’s opinion, a good ligature doesn’t draw attention to itself: it simply resolves whatever collision would’ve happened. Ideally, you don’t even notice it’s there. Conversely, this is why I loathe the Th ligature that is the default in many Adobe fonts: it resolves nothing, and always draws attention to itself.
Ligatures in programming fonts follow a similar idea. But instead of fixing the odd troublesome combination, well-intentioned amateur ligaturists are adding dozens of new & strange ligatures. For instance, these come from Fira Code, a heavily ligatured spinoff of the open-source Fira Mono.
So what’s the problem with programming ligatures?
- They contradict Unicode. Unicode is a standardized system, used by all contemporary fonts, that identifies each character uniquely. This way, software programs don’t have to worry that things like the fi ligature might be stashed in some special place in the font. Instead, Unicode designates a unique name and number for each character, known as a code point. If you have an fi ligature in your font, you identify it with its designated Unicode code point, which is 0xFB01.
In addition to alphabetic characters, Unicode assigns code points to hundreds of symbols. Many of the programming ligatures shown above are visually similar to existing Unicode symbols. So in a source file that uses Unicode characters, how would you know if you’re looking at a => ligature that’s shaped like ⇒ vs. Unicode character 0x21D2, which also looks like ⇒? The ligature introduces an ambiguity that wasn’t there before.
- They’re guaranteed to be wrong sometimes. There are a lot of ways for a given sequence of characters, like “=>”, to end up in a source file. Depending on context, it doesn’t always mean the same thing.
The problem is that ligature substitution is “dumb” in the sense that it only considers whether certain characters appear in a certain order. It’s not aware of the semantic context. Therefore, any global ligature substitution is guaranteed to be semantically wrong part of the time.
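Both problems can be imitated in a few lines of Python. This is only a rough sketch: a real font substitutes glyphs at display time and never touches the file, so the `LIGATURES` table and the `render` and `code_points` helpers below are hypothetical stand-ins, not how Fira Code or any editor actually works.

```python
import unicodedata

# Hypothetical stand-in for a font's ligature table: the typed sequence
# and the single shape a ligature font might draw in its place.
LIGATURES = {"=>": "\u21d2"}  # U+21D2 RIGHTWARDS DOUBLE ARROW


def render(text):
    """Mimic context-blind ligature substitution: what the reader sees."""
    for sequence, glyph in LIGATURES.items():
        text = text.replace(sequence, glyph)
    return text


def code_points(text):
    """List each character's code point, i.e. what the file really contains."""
    return [f"0x{ord(ch):04X}" for ch in text]


typed = "x => y"        # two characters: = followed by >
symbol = "x \u21d2 y"   # one character: the Unicode arrow itself

# The two files store different code points...
print(code_points("=>"), code_points("\u21d2"))
print(unicodedata.name("\u21d2"), "/", unicodedata.name("\ufb01"))

# ...but once "rendered", the two lines can no longer be told apart.
print(render(typed) == render(symbol))

# The substitution also fires where "=>" is not an arrow at all,
# for example inside a string literal.
print(render('print("a=>b")'))
```

The substitution only ever asks whether `=` is followed by `>`. It has no way to ask what the characters mean at that spot in the program.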
When we’re using a serifed text font in ordinary body text, we don’t have the same considerations. An fi ligature always means f followed by i. In that case, ligature substitution that ignores context doesn’t change the meaning.
Still, some typographic transformations in body text can be semantically wrong. For instance, foot and inch marks are often typed with the same characters as quotation marks. (See straight and curly quotes.) But whereas quotation marks want to be curly, foot and inch marks want to be straight (or slanted slightly to the upper right). So if we apply automatic smart (aka curly) quotes, we have to be careful not to capture foot and inch marks in the transformation.
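The foot-and-inch problem is easy to reproduce. The sketch below is an illustration under simple assumptions, not a production smart-quote routine: both function names are hypothetical, and the digit rule in the second one is just one possible heuristic.

```python
import re


def naive_smart_quotes(text):
    """Curl every straight double quote: opening at the start of the text
    or after whitespace or an opening bracket, closing everywhere else."""
    text = re.sub(r'(^|[\s(\[])"', "\\1\u201c", text)
    return text.replace('"', "\u201d")


def careful_smart_quotes(text):
    """Same transformation, but leave a mark that directly follows a digit,
    on the assumption that it is an inch mark."""
    protected = re.sub(r'(?<=\d)"', "\x00", text)  # hide inch marks
    return naive_smart_quotes(protected).replace("\x00", '"')


sample = 'a 6" ruler and a "quoted" word'
print(naive_smart_quotes(sample))    # the inch mark gets curled too
print(careful_smart_quotes(sample))  # the inch mark stays straight
```

Even the careful version is guessing: it would leave the closing quote straight in `"Chapter 6"`. A pattern match on neighboring characters can reduce the errors, but it can’t know what the author meant.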
Does that mean programmers can never have nice things? It’s totally fine to redesign individual characters to distinguish them from others. For instance, in Triplicate, I include a special “Code” variant that includes redesigned versions of certain characters that are easily confused.
`$te_fl{1234*567~890}
Regular
`$te_fl{1234*567~890}
Code
But in this case, the point is disambiguation: we don’t want the lowercase l to look like the digit 1, nor the zero to look like a cap O. Whereas ligatures are going the opposite direction: making distinct characters appear to be others.
Bottom line: this isn’t a matter of taste. In programming code, every character in the file has a special semantic role to play. Therefore, any kind of “prettifying” that makes one character look like another, including ligatures, leads to a swamp of despair. If you don’t believe me, try it for 10 or 15 years.
—Matthew Butterick
29 March 2019