Pretty-print syntax trees with this one simple trick
source link: http://www.haskellforall.com/2020/11/pretty-print-syntax-trees-with-this-one.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Pretty-print syntax trees with this one simple trick
prettyprint
I want to share a simple trick for pretty-printing syntax trees with the correct precedence that I’ve been using in my own interpreter projects. I believe this trick has been shared before, but I don’t know what the name of it is, so I wasn’t able to easily search for prior art. If somebody knows where this idea originated from then I can update this post to credit the original.
To illustrate the trick, I’d like to begin from the following Haskell type for a lambda calculus expression:
data Expression
= Lambda Text Expression
| Application Expression Expression
| Variable Text
… and a matching grammar for parsing such an expression (using the same notation that the happy
package uses):
Expression
: '\\' label '->' Expression { Lambda $2 $4 }
| ApplicationExpression { $1 }
ApplicationExpression
: ApplicationExpression VariableExpression { Application $1 $2 }
| VariableExpression { $1 }
VariableExpression
: label { Variable $1 }
| '(' Expression ')' { $2 }
The trick
We can pretty-print that Expression
type with the correct precedence without having to keep track of any precedence level. Instead, all we have to do is to write the pretty-printer to match the shape of the grammar, like this:
prettyExpression :: Expression -> Text
prettyExpression (Lambda x e) =
"\\" <> x <> " -> " <> prettyExpression e
prettyExpression other =
prettyApplicationExpression other
prettyApplicationExpression :: Expression -> Text
prettyApplicationExpression (Application f x) =
prettyApplicationExpression f <> " " <> prettyVariableExpression x
prettyApplicationExpression other =
prettyVariableExpression other
prettyVariableExpression :: Expression -> Text
prettyVariableExpression (Variable x) =
x
prettyVariableExpression other =
"(" <> prettyExpression other <> ")"
The pretty-printing logic closely follows the grammar
Create one
pretty…
function for each nonterminal symbol in the grammarFor example, since we have a nonterminal symbol named
ApplicationExpression
, we create a matching pretty-printing function namedprettyApplicationExpression
.Each
pretty…
function matches one pattern per alternative in the grammarIn other words, if the production rule for
ApplicationExpression
has two alternatives, then theprettyApplicationExpression
matches two patterns corresponding to each of the two alternatives, respectively.Pretty-print non-terminal symbols using the matching pretty-printing function
For example, we pretty-print a function’s argument using
prettyVariableExpression
since we used theVariableExpression
nonterminal symbol to parse that argument.Pretty-print terminal symbols in the obvious way
… with any necessary whitespace
That’s the entire trick! If you follow those simple rules then the prettyprinter will automatically respect precedence, inserting parentheses in the right places and eliding parentheses when they are not necessary.
Conclusion
There is one major downside to this trick: if you add a new constructor to your syntax tree and you forget to update the pretty-printer then your pretty-printer will infinitely loop. This is pretty annoying as you might imagine.
The main upside to this trick is that pretty-printer logic is very simple to write, so mechanical that you could probably automate it (although I’m not sure if somebody has done so, yet).
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK