PHP Parser Written in Go

 5 years ago
source link: https://www.tuicool.com/articles/hit/yYF3U3u

This project uses the goyacc and golex libraries to parse PHP sources into an AST. It can be used to write static analysis, refactoring, metrics, and code style formatting tools.

Try it online: demo

Features:

  • Full support for PHP 5 and PHP 7 syntax
  • Abstract syntax tree (AST) representation
  • AST traversal
  • Namespace resolver
  • Parsing of syntactically invalid PHP files

Roadmap

  • Pretty printer
  • Control Flow Graph (CFG)
  • PhpDocComment parser
  • Stabilize API

Install

go get github.com/z7zmey/php-parser

CLI

php-parser [-php5 -noDump] <path> ...

Dump AST to stdout.

Example

package main

import (
	"bytes"
	"fmt"
	"os"

	"github.com/z7zmey/php-parser/php7"
	"github.com/z7zmey/php-parser/visitor"
)

func main() {
	src := bytes.NewBufferString(`<? echo "Hello world";`)

	// Parse the source with the PHP 7 grammar.
	parser := php7.NewParser(src, "example.php")
	parser.Parse()

	// Print any errors collected during parsing.
	for _, e := range parser.GetErrors() {
		fmt.Println(e)
	}

	// The Dumper visitor writes the AST to stdout.
	// It is named dumper so it does not shadow the visitor package.
	dumper := visitor.Dumper{
		Writer:    os.Stdout,
		Indent:    "",
		Comments:  parser.GetComments(),
		Positions: parser.GetPositions(),
	}

	rootNode := parser.GetRootNode()
	rootNode.Walk(dumper)
}

Namespace resolver

The namespace resolver is a visitor that resolves each node's fully qualified name and saves it into a map[node.Node]string structure.

  • For Class, Interface, Trait, Function, and Constant nodes, it saves the name prefixed with the current namespace.
  • For Name, Relative, and FullyQualified nodes, it resolves use aliases and saves the fully qualified name.
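
The two rules above can be illustrated with a small, self-contained sketch. It does not use the library's API: the scope type, its fields, and the declare and resolve methods are hypothetical names introduced only for this illustration, and the sketch assumes fully qualified names are stored without a leading backslash. It also ignores PHP's fallback to the global namespace for unqualified function and constant names.

```go
package main

import (
	"fmt"
	"strings"
)

// nameKind mirrors the three kinds of name reference listed above.
type nameKind int

const (
	kindName           nameKind = iota // e.g. Baz\Qux
	kindRelative                       // e.g. namespace\Sub\Thing
	kindFullyQualified                 // e.g. \Ext\Thing
)

// scope is a hypothetical stand-in for the state a resolver tracks
// while walking a file: the current namespace and the use aliases.
type scope struct {
	namespace string            // "" means the global namespace
	aliases   map[string]string // alias -> fully qualified name
}

// declare returns the name of a declaration (class, interface, trait,
// function, constant) prefixed with the current namespace.
func (s scope) declare(name string) string {
	if s.namespace == "" {
		return name
	}
	return s.namespace + `\` + name
}

// resolve returns the fully qualified name for a name reference.
// name is passed without any leading "\" or "namespace\" prefix.
func (s scope) resolve(kind nameKind, name string) string {
	switch kind {
	case kindFullyQualified:
		return name
	case kindRelative:
		return s.declare(name)
	}
	// Plain name: the first segment may be a use alias.
	parts := strings.SplitN(name, `\`, 2)
	if target, ok := s.aliases[parts[0]]; ok {
		if len(parts) == 2 {
			return target + `\` + parts[1]
		}
		return target
	}
	return s.declare(name)
}

func main() {
	s := scope{
		namespace: "App",
		aliases:   map[string]string{"Baz": `Lib\Baz`},
	}
	fmt.Println(s.declare("Bar"))                           // App\Bar
	fmt.Println(s.resolve(kindName, `Baz\Qux`))             // Lib\Baz\Qux
	fmt.Println(s.resolve(kindName, "Other"))               // App\Other
	fmt.Println(s.resolve(kindRelative, `Sub\Thing`))       // App\Sub\Thing
	fmt.Println(s.resolve(kindFullyQualified, `Ext\Thing`)) // Ext\Thing
}
```

In the actual resolver the results are keyed by AST node rather than returned as strings, as described above.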

Parsing syntax-invalid PHP files

If we try to parse $a$b; the parser reports the error 'syntax error: unexpected T_VARIABLE'. The token $b is unexpected, but the parser recovers and adds the statement $b; to the AST, because that statement is syntactically correct.
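
The recovery behaviour can be mimicked with a toy hand-written parser. This is only a sketch of the general idea (report the unexpected token, drop the broken fragment, resume parsing at that token). It is not how this project's goyacc-generated parser is implemented, and the token type and the lex and parse functions are hypothetical names. The toy grammar accepts only statements of the form "variable followed by a semicolon".

```go
package main

import "fmt"

// token is a lexical unit of the toy grammar.
type token struct {
	kind  string // "T_VARIABLE" or ";"
	value string
}

// isIdent reports whether c may appear in a variable name.
func isIdent(c byte) bool {
	return c == '_' || (c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// lex splits input such as `$a$b;` into variable and semicolon tokens.
func lex(src string) []token {
	var toks []token
	for i := 0; i < len(src); {
		switch c := src[i]; {
		case c == ';':
			toks = append(toks, token{";", ";"})
			i++
		case c == '$':
			j := i + 1
			for j < len(src) && isIdent(src[j]) {
				j++
			}
			toks = append(toks, token{"T_VARIABLE", src[i:j]})
			i = j
		default:
			i++ // skip whitespace and anything the toy grammar ignores
		}
	}
	return toks
}

// parse accepts statements of the form `T_VARIABLE ;`. On an unexpected
// token it records an error, drops the incomplete statement, and resumes
// parsing at the offending token.
func parse(toks []token) (stmts, errs []string) {
	for i := 0; i < len(toks); {
		if toks[i].kind != "T_VARIABLE" {
			errs = append(errs, "syntax error: unexpected "+toks[i].kind)
			i++
			continue
		}
		if i+1 < len(toks) && toks[i+1].kind == ";" {
			stmts = append(stmts, toks[i].value+";")
			i += 2
			continue
		}
		if i+1 < len(toks) {
			errs = append(errs, "syntax error: unexpected "+toks[i+1].kind)
		} else {
			errs = append(errs, "syntax error: unexpected end of input")
		}
		i++ // discard the broken fragment and retry from the next token
	}
	return stmts, errs
}

func main() {
	stmts, errs := parse(lex("$a$b;"))
	fmt.Println(errs)  // [syntax error: unexpected T_VARIABLE]
	fmt.Println(stmts) // [$b;]
}
```

As in the example above, the fragment $a is discarded, one error is reported, and the well-formed statement $b; still reaches the result.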

Pretty printer [work in progress]

nodes := &stmt.StmtList{
	Stmts: []node.Node{
		&stmt.Namespace{
			NamespaceName: &name.Name{
				Parts: []node.Node{
					&name.NamePart{Value: "Foo"},
				},
			},
		},
		&stmt.Class{
			Modifiers: []node.Node{
				&node.Identifier{Value: "abstract"},
			},
			ClassName: &name.Name{
				Parts: []node.Node{
					&name.NamePart{Value: "Bar"},
				},
			},
			Extends: &stmt.ClassExtends{
				ClassName: &name.Name{
					Parts: []node.Node{
						&name.NamePart{
							Value: "Baz",
						},
					},
				},
			},
			Stmts: []node.Node{
				&stmt.ClassMethod{
					Modifiers: []node.Node{
						&node.Identifier{Value: "public"},
					},
					MethodName: &node.Identifier{Value: "greet"},
					Stmt: &stmt.StmtList{
						Stmts: []node.Node{
							&stmt.Echo{
								Exprs: []node.Node{
									&scalar.String{Value: "'Hello world'"},
								},
							},
						},
					},
				},
			},
		},
	},
}

file := os.Stdout
p := printer.NewPrinter(file, "    ")
p.PrintFile(nodes)

It prints to stdout:

<?php
namespace Foo;
abstract class Bar extends Baz
{
	public function greet()
	{
		echo 'Hello world';
	}
}
