Syntax highlighting is a waste of an information channel

 3 years ago
source link: https://buttondown.email/hillelwayne/archive/syntax-highlighting-is-a-waste-of-an-information/

No, not a waste in general. Syntax highlighting is quite useful. I’m saying it’s a waste of an information channel. Here’s a quick demonstration of what I mean. Here’s 399 squares and one circle. Where’s the circle?

jy67VfM.png!web

Round two. Where’s the circle?

UBBraaU.png!web

Color carries a huge amount of information. Color draws our attention. Color distinguishes things. And we just use it to distinguish syntax.

Nothing wrong with distinguishing syntax. It’s the “just” that bothers me. Highlighting syntax is not always the most important thing to us. The information we want from code depends on what we’re trying to do. I’m interested in different things if I’m writing greenfield code vs optimizing code vs debugging code vs doing a code review. I should be able to swap different highlighting rules in and out depending on what I need. I should be able to combine different rules into task-level overlays that I can toggle on and off.

I’ve listed some examples of what we could do with this. If this is something that already exists I included a link. Otherwise I included a mockup. Some of the examples have implementation issues beyond what I discussed; they’re just demonstrations of what highlighting could be. All examples are Pythonish unless otherwise noted.

Some Use Cases

Rainbow parentheses

This is a pretty common one. We can use different colors to mark how deeply nested a set of parentheses is. From here.

UfIn2uz.png!web
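
As a rough illustration of the rule above, here is a minimal sketch that assigns each bracket a color by nesting depth. The palette and the function name `rainbow_spans` are made up for this example, and the sketch naively scans characters, so brackets inside strings or comments are colored too.

```python
# Hypothetical palette; colors cycle when nesting goes deeper than its length.
PALETTE = ["red", "orange", "green", "blue"]
OPENERS = "([{"
CLOSERS = ")]}"


def rainbow_spans(source):
    """Return (index, char, color) for every bracket, colored by nesting depth."""
    spans = []
    depth = 0
    for i, ch in enumerate(source):
        if ch in OPENERS:
            spans.append((i, ch, PALETTE[depth % len(PALETTE)]))
            depth += 1
        elif ch in CLOSERS:
            # Clamp at zero so unbalanced input does not produce negative depth.
            depth = max(depth - 1, 0)
            spans.append((i, ch, PALETTE[depth % len(PALETTE)]))
    return spans
```

For `f(a[0])` the outer parentheses both come back as `red` and the inner square brackets as `orange`.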

Context Highlighting

Highlight different levels of nesting. From here.

JnqUva2.png!web

Import highlighting

Highlight identifiers imported from a different file.

2a2IVfI.png!web

Variations:

  • Highlight imported functions and classes differently
  • Highlight qualified imports
  • Highlight imports from particular trees
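
One way this could work for Python is with the standard `ast` module: collect the names bound by import statements, then report every place one of them is used. This is only a sketch under simplifying assumptions. The function name `imported_name_uses` is invented here, and it ignores shadowing (a local variable that reuses an imported name) and `from x import *`.

```python
import ast


def imported_name_uses(source):
    """Return sorted (line, col, name) for each use of an imported name."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a as x` binds `x`.
                imported.add(alias.asname or alias.name.split(".")[0])
    uses = [
        (node.lineno, node.col_offset, node.id)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id in imported
    ]
    return sorted(uses)
```

An editor would then paint the returned positions. The variations above would mostly be extra filters on which import statements feed the `imported` set.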

Argument Highlighting

Arguments passed into the function are highlighted differently from local variables or global identifiers.

NFJJ7jA.png!web

Variations:

  • Carry it through to aliases (if we assign the argument to another value, highlight that too)
  • Highlight local variables only
  • Highlight values that will be assigned to something
  • Highlight variables used in loops
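
The basic version of argument highlighting can be sketched with Python's `ast` module: gather each function's parameter names, then find every `Name` node in the body that matches one. The helper name `argument_uses` is hypothetical. The sketch does not follow aliases, and it does not notice when a nested function or a reassignment shadows a parameter.

```python
import ast


def argument_uses(source):
    """Map each function name to sorted (line, col, name) uses of its parameters."""
    result = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        spec = func.args
        params = {p.arg for p in spec.posonlyargs + spec.args + spec.kwonlyargs}
        if spec.vararg:
            params.add(spec.vararg.arg)  # *args
        if spec.kwarg:
            params.add(spec.kwarg.arg)  # **kwargs
        uses = []
        for stmt in func.body:
            for node in ast.walk(stmt):
                if isinstance(node, ast.Name) and node.id in params:
                    uses.append((node.lineno, node.col_offset, node.id))
        result[func.name] = sorted(uses)
    return result
```

In `def area(w, h): ...` with a local `scale`, only the uses of `w` and `h` are reported; `scale` is left alone.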

Type Highlighting

Highlight all list variables and integer variables with different colors.

bIfuqmr.png!web

Variations:

  • Highlight all iterables
  • Highlight all functions returning option types
  • Highlight all variables that could be one of two types
  • Highlight all polymorphic types parameterized to integers
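
Real type highlighting would need a type checker. As a deliberately crude stand-in, the sketch below only classifies variables that are assigned directly from a list or integer literal. The function name `literal_kinds` is an assumption of this example, and anything whose type has to be inferred (function results, attributes, arithmetic) is simply skipped.

```python
import ast


def literal_kinds(source):
    """Map variable names to 'list' or 'int' when assigned from a literal."""
    kinds = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if isinstance(value, (ast.List, ast.ListComp)):
            kind = "list"
        elif isinstance(value, ast.Constant) and type(value.value) is int:
            # `type(...) is int` excludes booleans, which subclass int.
            kind = "int"
        else:
            continue
        for target in node.targets:
            if isinstance(target, ast.Name):
                kinds[target.id] = kind
    return kinds
```

A highlighter could then pick one color per kind and apply it to every occurrence of the classified names.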

Exception Highlighting

Highlight functions that raise errors not caught in their body.

ueeyUfQ.png!web

Variations:

  • Highlight all functions with try blocks
  • Highlight functions that raise user-defined exceptions
  • Highlight functions that raise a specific exception
  • Highlight functions that catch a specific exception

Metadata Highlighting

Highlight functions that were directly called in the bodies of tests that failed in the last test run.

JnIjIvf.png!web

Variations:

  • Highlight functions without precondition decorators
  • Highlight functions that are part of a certain stacktrace
  • Highlight functions which are defined in our branch but not the master branch

Random other ideas I didn’t mock up

# TODO

Issues

Why aren’t things this way? There are both essential and coincidental challenges that make fully leveraging color a lot harder than just having syntax highlighting.

First is actually implementing rules. Some of these require access to the code’s AST, some require broader knowledge of the project, some require runtime information. Some of the ideas are even infeasible; accurately tracking aliasing is an open problem for most languages. Syntax highlighting, by contrast, is usually a matter of regexes and hierarchical state machines. That’s how pygments does it. Semantic highlighting would have to be made from scratch for each language.

Second is highlighting conflicts. What if something needs to be colored two things for two different reasons? In syntax highlighting this is less of a problem because you have an ordered list of matchers. But with semantic highlighting we might have dynamic priorities, where rule A is more important to us now while rule B is more important to us later. Things get even more complicated if we have multiple distinct overlays, which themselves can have priority conflicts. Semantic highlighting would need a much more complex design and implementation than simple syntax highlighting does, and adding overlays makes it even more complicated.
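
To make the conflict problem concrete, here is a minimal sketch of one possible resolution scheme: every rule emits colored spans, and a priority list that can be reordered at any time decides which rule wins where spans overlap. The `Highlight` class, the `resolve` function and the rule names are all invented for this example and do not describe any existing editor's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    color: str
    rule: str   # name of the rule that produced this highlight


def resolve(highlights, priority):
    """Pick one color per character; rules earlier in `priority` win.

    Rules missing from `priority` are treated as toggled off.
    """
    rank = {rule: i for i, rule in enumerate(priority)}
    winner = {}
    for h in highlights:
        if h.rule not in rank:
            continue
        for pos in range(h.start, h.end):
            current = winner.get(pos)
            if current is None or rank[h.rule] < rank[current.rule]:
                winner[pos] = h
    return {pos: h.color for pos, h in sorted(winner.items())}
```

Swapping the order of `priority` changes the result without touching the rules themselves, which is the "dynamic priorities" case. Overlays that carry their own priority lists would need another layer on top of this.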

Finally, existing editors just aren’t well set up to handle this. Vim’s syntax highlighting is a mess of regular expressions and special cases. VSCode and (I believe) Atom use TextMate grammars, which assume a single canonical tokenization per file. VSCode recently added semantic highlighting but it seems more oriented to augment the existing syntax highlighting, not radically rethink it. I have no idea what Emacs does.

So I think this is something we’ll eventually have, because the potential advantages are too great to ignore forever. But it will take us a long time to get there. Maybe we’ll see it first with toy languages where the AST is simple enough and the expressiveness is low enough to make semantic highlighting easy.

Update for the influx of new readers

This was a newsletter post; you can subscribe here.

