Syntax highlighting is a waste of an information channel

No, not a waste in general. Syntax highlighting is quite useful. I’m saying it’s a waste of an information channel. Here’s a quick demonstration of what I mean. Here are 399 squares and one circle. Where’s the circle?

[Image: 399 squares and one circle]

Round two. Where’s the circle?

[Image: round two of the squares and the circle]

Color carries a huge amount of information. Color draws our attention. Color distinguishes things. And we just use it to distinguish syntax.

Nothing wrong with distinguishing syntax. It’s the “just” that bothers me. Highlighting syntax is not always the most important thing to us. The information we want from code depends on what we’re trying to do. I’m interested in different things if I’m writing greenfield code vs optimizing code vs debugging code vs doing a code review. I should be able to swap different highlighting rules in and out depending on what I need. I should be able to combine different rules into task-level overlays that I can toggle on and off.

I’ve listed some examples of what we could do with this. If this is something that already exists I included a link. Otherwise I included a mockup. Some of the examples have implementation issues beyond what I discussed; they’re just demonstrations of what highlighting could be. All examples are Pythonish unless otherwise noted.

Some Use Cases

Rainbow parentheses

This is a pretty common one. We can use different colors to mark how deeply nested a set of parentheses is. From here.

[Image: rainbow parentheses example]
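Here’s a minimal sketch of the depth-coloring idea, assuming a made-up four-color palette. The name `rainbow_parens` and the palette are mine, not from any editor plugin, and the sketch ignores brackets inside strings and comments, which a real implementation would have to skip.

```python
# Hypothetical palette; a real editor would pull these from its theme.
PALETTE = ["red", "orange", "green", "blue"]


def rainbow_parens(source, palette=PALETTE):
    """Return (index, char, color) for each bracket, colored by nesting depth."""
    openers, closers = "([{", ")]}"
    depth = 0
    marks = []
    for i, ch in enumerate(source):
        if ch in openers:
            marks.append((i, ch, palette[depth % len(palette)]))
            depth += 1
        elif ch in closers:
            # Step back out first so a closer gets its opener's color.
            depth = max(depth - 1, 0)  # tolerate unbalanced input
            marks.append((i, ch, palette[depth % len(palette)]))
    return marks


for mark in rainbow_parens("f(a, g(b[0]))"):
    print(mark)
```

The colors cycle once the nesting goes deeper than the palette, which is the usual compromise for this kind of rule.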

Context Highlighting

Highlight different levels of nesting. From here.

[Image: context highlighting example]

Import highlighting

Highlight identifiers imported from a different file.

[Image: import highlighting mockup]

Variations:

Highlight imported functions and classes differently
Highlight qualified imports
Highlight imports from particular trees
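For Python specifically, the basic version of this rule can be sketched with the standard `ast` module. This is an illustration under assumptions, not how any editor does it: `imported_name_uses` and `SAMPLE` are names I made up, and the sketch ignores shadowing (a local variable that reuses an imported name would be highlighted too).

```python
import ast


def imported_name_uses(source):
    """Return sorted (line, col, name) for each use of an imported name."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    uses = [
        (node.lineno, node.col_offset, node.id)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id in imported
    ]
    return sorted(uses)


SAMPLE = """\
import os
from math import sqrt

def norm(x, y):
    return sqrt(x * x + y * y)

print(os.getcwd(), norm(3, 4))
"""

print(imported_name_uses(SAMPLE))
# [(5, 11, 'sqrt'), (7, 6, 'os')]
```

The variations mostly change how `imported` is built: keep only `ast.ImportFrom` entries, or filter on the module path, for example.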

Argument Highlighting

Arguments passed into the function are highlighted differently from local variables or global identifiers.

[Image: argument highlighting mockup]

Variations:

Carry it through to aliases (if we assign the argument to another value, highlight that too)
Highlight local variables only
Highlight values that will be assigned to something
Highlight variables used in loops
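The base rule (arguments vs everything else) is also within reach of Python’s `ast` module. A rough sketch, with `argument_uses` and `SAMPLE` as hypothetical names. It deliberately does not follow aliases, and it doesn’t handle nested functions that shadow a parameter.

```python
import ast


def argument_uses(source):
    """Map each function name to the (line, col, name) uses of its parameters."""
    result = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        a = func.args
        params = {p.arg for p in a.posonlyargs + a.args + a.kwonlyargs}
        if a.vararg:
            params.add(a.vararg.arg)
        if a.kwarg:
            params.add(a.kwarg.arg)
        uses = []
        # Walk only the body, so the parameter list itself is not reported.
        for stmt in func.body:
            for node in ast.walk(stmt):
                if isinstance(node, ast.Name) and node.id in params:
                    uses.append((node.lineno, node.col_offset, node.id))
        result[func.name] = sorted(uses)
    return result


SAMPLE = """\
def scale(values, factor):
    total = 0
    for v in values:
        total += v * factor
    return total
"""

print(argument_uses(SAMPLE))
# {'scale': [(3, 13, 'values'), (4, 21, 'factor')]}
```

Here `total` and `v` are left alone because they are locals, which is exactly the distinction the rule is after.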

Type Highlighting

Highlight all list variables and integer variables with different colors.

[Image: type highlighting mockup]

Variations:

Highlight all iterables
Highlight all functions returning option types
Highlight all variables that could be one of two types
Highlight all polymorphic types parameterized to integers
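Without real type inference, the cheapest approximation in Python is to trust annotations. The sketch below does only that: it colors names annotated as `list` or `int`, using a made-up `TYPE_COLORS` table and a made-up function name. Unannotated variables are invisible to it, and it tracks names globally rather than per scope.

```python
import ast

# Hypothetical color table for the two types in the example.
TYPE_COLORS = {"list": "blue", "int": "green"}


def annotated_type_colors(source):
    """Map annotated variable and parameter names to a color for their type."""
    colors = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            name, annotation = node.target.id, node.annotation
        elif isinstance(node, ast.arg) and node.annotation is not None:
            name, annotation = node.arg, node.annotation
        else:
            continue
        # `list[float]` is a Subscript whose value is the Name `list`.
        base = annotation.value if isinstance(annotation, ast.Subscript) else annotation
        if isinstance(base, ast.Name) and base.id in TYPE_COLORS:
            colors[name] = TYPE_COLORS[base.id]
    return colors


SAMPLE = """\
def mean(xs: list[float], n: int) -> float:
    total: float = sum(xs)
    count: int = n
    return total / count
"""

print(annotated_type_colors(SAMPLE))
```

The variations above (iterables, option types, unions) would need an actual type checker behind the highlighter rather than a lookup on annotation names.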

Exception Highlighting

Highlight functions that raise errors not caught in their body.

[Image: exception highlighting mockup]

Variations:

Highlight all functions with try blocks
Highlight functions that raise user-defined exceptions
Highlight functions that raise a specific exception
Highlight functions that catch a specific exception
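A crude version of the base rule can be sketched with `ast` too. The assumptions here are heavy: only explicit `raise` statements count (not exceptions raised by callees), and any `try` with at least one `except` clause is treated as catching everything raised in its body. Both function names are hypothetical.

```python
import ast

_NESTED_SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda, ast.ClassDef)


def _has_uncaught_raise(node, guarded=False):
    """True if `node` contains a `raise` outside any `try` body with handlers."""
    if isinstance(node, ast.Raise):
        return not guarded
    if isinstance(node, _NESTED_SCOPES):
        return False  # nested definitions are judged on their own
    if isinstance(node, ast.Try):
        body_guarded = guarded or bool(node.handlers)
        if any(_has_uncaught_raise(s, body_guarded) for s in node.body):
            return True
        # Handlers, else, and finally are not protected by this try.
        rest = node.handlers + node.orelse + node.finalbody
        return any(_has_uncaught_raise(n, guarded) for n in rest)
    return any(_has_uncaught_raise(c, guarded) for c in ast.iter_child_nodes(node))


def raising_functions(source):
    """Return sorted names of functions with a `raise` they don't catch."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(_has_uncaught_raise(stmt) for stmt in node.body):
                names.append(node.name)
    return sorted(names)


SAMPLE = """\
def parse(text):
    if not text:
        raise ValueError("empty")
    return int(text)

def safe_parse(text):
    try:
        if not text:
            raise ValueError("empty")
        return int(text)
    except ValueError:
        return None

def reraise(text):
    try:
        return int(text)
    except ValueError:
        raise
"""

print(raising_functions(SAMPLE))
# ['parse', 'reraise']
```

Matching `except` clauses against the raised exception’s class hierarchy is where this stops being a sketch, since that needs knowledge of the whole project.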

Metadata Highlighting

Highlight functions that were directly called in the bodies of tests that failed in the last test run.

[Image: metadata highlighting mockup]

Variations:

Highlight functions without precondition decorators
Highlight functions that are part of a certain stacktrace
Highlight functions which are defined in our branch but not the master branch

Random other ideas I didn’t mock up

# TODO

Issues

Why aren’t things this way? There are both essential and coincidental challenges that make fully leveraging color a lot harder than just having syntax highlighting.

First is actually implementing the rules. Some of these require access to the code’s AST, some require broader knowledge of the project, some require runtime information. Some of the ideas are even infeasible; accurately tracking aliasing is an open problem for most languages. Syntax highlighting, by contrast, is usually a matter of regexes and hierarchical state machines. That’s how Pygments does it. Semantic highlighting would have to be made from scratch for each language.
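To make the contrast concrete, here’s a toy regex tokenizer in the spirit of that approach. It is not Pygments’ actual implementation, just an ordered list of patterns where the first alternative that matches wins; `TOKEN_RE` and `tokenize` are names I made up.

```python
import re

# Ordered matchers: earlier alternatives take precedence over later ones.
TOKEN_RE = re.compile(
    r"""
    (?P<comment>\#.*)
  | (?P<string>"[^"\n]*"|'[^'\n]*')
  | (?P<keyword>\b(?:def|return|if|else|for|in|import)\b)
  | (?P<number>\b\d+\b)
  | (?P<name>[A-Za-z_]\w*)
    """,
    re.VERBOSE,
)


def tokenize(line):
    """Return (token_type, text) pairs; unmatched characters are skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(line)]


print(tokenize("return sqrt(x) + 1  # norm"))
# [('keyword', 'return'), ('name', 'sqrt'), ('name', 'x'),
#  ('number', '1'), ('comment', '# norm')]
```

Note what the tokenizer can’t say: `sqrt` and `x` are both just `name`. Whether one is imported and the other is an argument is exactly the information that needs the AST or more.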

Second is highlighting conflicts. What if something needs to be colored two things for two different reasons? In syntax highlighting this is less of a problem because you have an ordered list of matchers. But with semantic highlighting we might have dynamic priorities, where rule A is more important to us now while rule B is more important to us later. Things get even more complicated if we have multiple distinct overlays, which themselves can have priority conflicts. Semantic highlighting would need a much more complex design and implementation than simple syntax highlighting does.
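One possible shape for conflict resolution, purely as a sketch: give every rule a base priority and an overlay, drop rules whose overlay is toggled off, and let the current task add a boost to the rules it cares about. Everything here (`Rule`, `resolve`, the overlay names, the numbers) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    overlay: str   # the task-level overlay this rule belongs to
    color: str
    priority: int  # higher wins


def resolve(matches, rules, active_overlays, boosts=None):
    """Pick one color per token.

    matches: token -> names of the rules that matched it
    boosts:  rule name -> extra priority for the current task
    """
    boosts = boosts or {}
    by_name = {rule.name: rule for rule in rules}
    colors = {}
    for token, names in matches.items():
        candidates = [
            by_name[n] for n in names if by_name[n].overlay in active_overlays
        ]
        if candidates:
            best = max(candidates, key=lambda r: r.priority + boosts.get(r.name, 0))
            colors[token] = best.color
    return colors


RULES = [
    Rule("imported", "navigation", "purple", 1),
    Rule("argument", "dataflow", "orange", 2),
    Rule("raises", "debugging", "red", 3),
]
MATCHES = {"sqrt": ["imported"], "text": ["argument"], "parse": ["imported", "raises"]}

# Debugging: "raises" outranks "imported", and the dataflow overlay is off.
print(resolve(MATCHES, RULES, {"navigation", "debugging"}))
# {'sqrt': 'purple', 'parse': 'red'}

# Same overlays, but the current task boosts "imported" above "raises".
print(resolve(MATCHES, RULES, {"navigation", "debugging"}, boosts={"imported": 5}))
# {'sqrt': 'purple', 'parse': 'purple'}
```

Even this toy has an unanswered question: what happens on a priority tie. Here the first matching rule wins, which is an arbitrary choice.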

Finally, existing editors just aren’t well set up to handle this. Vim’s syntax highlighting is a mess of regular expressions and special cases. VSCode and (I believe) Atom use TextMate grammars, which assume a single canonical tokenization per file. VSCode recently added semantic highlighting, but it seems more oriented toward augmenting the existing syntax highlighting than radically rethinking it. I have no idea what Emacs does.

So I think this is something we’ll eventually have, because the potential advantages are too great to ignore forever. But it will take us a long time to get there. Maybe we’ll see it first with toy languages where the AST is simple enough and the expressiveness is low enough to make semantic highlighting easy.

Update for the influx of new readers

This was a newsletter post; you can subscribe here.

