

GitHub - skx/go.vm: A simple virtual machine - compiler & interpreter - writ...
source link: https://github.com/skx/go.vm

go.vm
This project is a Go-based compiler and interpreter for a simple virtual machine. It is a port of the existing simple.vm project.
(The original project has a Perl-based compiler/decompiler and an interpreter written in C.)
You can get a feel for what it looks like by referring to either the parent project or the examples contained in this repository.
This particular virtual machine is intentionally simple, but despite that it is hopefully implemented in a readable fashion. ("Simplicity" here means that we support only a small number of instructions, and that the 16 registers the virtual CPU possesses can store strings and integers, but not floating-point values.)
Installation
Install the code via:
$ go get -u github.com/skx/go.vm
$ go install github.com/skx/go.vm
Once installed there are three sub-commands of interest:
go.vm compile $file.in
- Compiles the given program into bytecode.
go.vm execute $file.raw
- Interprets the bytecode stored in the given file.
go.vm run $file.in
- Compiles the specified program, then directly executes it.
So to compile the input-file examples/hello.in into bytecode:
$ go.vm compile examples/hello.in
Then to execute the resulting bytecode:
$ go.vm execute examples/hello.raw
Or you can handle both steps at once:
$ go.vm run examples/hello.in
Opcodes
The virtual machine has 16 registers, each of which can store an integer or a string. For example, to set the first two registers you might write:
store #0, "This is a string"
store #1, 0xFFFF
In addition to this there are several mathematical operations which have the general form:
$operation $result, $src1, $src2
For example to add the contents of register #1 and register #2, storing the result in register #0 you would write:
add #0, #1, #2
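The three-operand form can be pictured with a small Go sketch. This is an illustration only, not the code in this repository: the `Register` type, the `CPU` struct and the `Add` method are hypothetical names, and rejecting string operands is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Register is a hypothetical register that holds either an integer or a string.
type Register struct {
	IsString bool
	Int      int
	Str      string
}

// CPU is a hypothetical machine with the 16 registers described above.
type CPU struct {
	Regs [16]Register
}

// Add stores Regs[src1] + Regs[src2] in Regs[dst], mirroring
// "add #dst, #src1, #src2". Rejecting strings is an assumption.
func (c *CPU) Add(dst, src1, src2 int) error {
	a, b := c.Regs[src1], c.Regs[src2]
	if a.IsString || b.IsString {
		return errors.New("add: both source registers must hold integers")
	}
	c.Regs[dst] = Register{Int: a.Int + b.Int}
	return nil
}

func main() {
	cpu := &CPU{}
	cpu.Regs[1] = Register{Int: 0x10}
	cpu.Regs[2] = Register{Int: 0x20}
	if err := cpu.Add(0, 1, 2); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cpu.Regs[0].Int) // prints 48
}
```

The destination comes first, matching the `$operation $result, $src1, $src2` layout.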
Strings and integers may be displayed to STDOUT via:
print_str #1
print_int #3
Control-flow is supported via call, ret (for subroutines) and jmp for absolute jumps. You can also use the Z-flag, which is set by comparisons and the inc/dec instructions, to make conditional jumps:
store #1, 0x42
cmp #1, 0x42
jmpz ok
store #1, "Something weird happened!\n"
print_str #1
exit
:ok
store #1, "Comparing register #01 to 0x42 succeeded!\n"
print_str #1
exit
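To illustrate how a Z-flag can drive a conditional jump, here is a minimal Go sketch of an interpreter loop. It is not the repository's implementation: `Op`, `Instr` and `run` are hypothetical, instructions are shown already decoded rather than as bytecode, and the sketch assumes that a comparison sets Z when the two values are equal.

```go
package main

import "fmt"

// Op names a hypothetical, already-decoded instruction kind.
type Op int

const (
	OpStore Op = iota // reg = value
	OpCmp             // Z = (reg == value)
	OpJmpZ            // jump to Target if Z is set
	OpExit            // stop
)

// Instr is a hypothetical decoded instruction; real bytecode is more compact.
type Instr struct {
	Op     Op
	Reg    int
	Value  int
	Target int // index into the program, used by OpJmpZ
}

// run executes prog and returns the registers and the index of the exit taken.
func run(prog []Instr) (regs [16]int, exitAt int) {
	z := false // the Z flag
	for ip := 0; ip < len(prog); {
		in := prog[ip]
		switch in.Op {
		case OpStore:
			regs[in.Reg] = in.Value
		case OpCmp:
			z = regs[in.Reg] == in.Value
		case OpJmpZ:
			if z {
				ip = in.Target
				continue
			}
		case OpExit:
			return regs, ip
		}
		ip++
	}
	return regs, len(prog)
}

func main() {
	prog := []Instr{
		{Op: OpStore, Reg: 1, Value: 0x42},
		{Op: OpCmp, Reg: 1, Value: 0x42},
		{Op: OpJmpZ, Target: 4},
		{Op: OpExit}, // the "something weird" path
		{Op: OpExit}, // the ":ok" path
	}
	_, exitAt := run(prog)
	fmt.Println("exited at instruction", exitAt) // prints "exited at instruction 4"
}
```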
Further instructions are available and can be viewed beneath examples/. The instruction set is limited: for example, there is no instruction for reading from STDIN. However, this is supported via traps, as documented below.
Notes
Some brief notes on parts of the code / operation:
The compiler
The compiler is built in a traditional fashion:
- Input is split into tokens via lexer.go
  - This uses token.go for the definition of constants.
- The stream of tokens is iterated over by compiler.go
  - This uses the constants in opcode.go for the bytecode generation.
The approach to labels is the same as in the inspiring project: every time we come across a reference to a label we output a pair of placeholder bytes in our bytecode. Later, once we've read the whole program and can assume we've found all existing labels, we go back and fix up the generated addresses.
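The placeholder-then-fix-up idea can be sketched in a few lines of Go. This is not the code in compiler.go: the opcode values, the `assemble` function and the low-byte-first address order are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Opcode values here are made up for the sketch.
const (
	opExit byte = 0x00
	opJmp  byte = 0x10
	opNop  byte = 0x50
)

// assemble translates a tiny three-instruction language into bytecode.
// Labels are defined as ":name" and used as "jmp name".
func assemble(lines []string) ([]byte, error) {
	var code []byte
	labels := map[string]int{} // label name -> address of the next instruction
	fixups := map[int]string{} // offset of placeholder bytes -> label name

	for _, line := range lines {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		switch {
		case strings.HasPrefix(fields[0], ":"):
			labels[fields[0][1:]] = len(code)
		case fields[0] == "jmp" && len(fields) == 2:
			code = append(code, opJmp)
			fixups[len(code)] = fields[1]
			code = append(code, 0x00, 0x00) // placeholder address
		case fields[0] == "nop":
			code = append(code, opNop)
		case fields[0] == "exit":
			code = append(code, opExit)
		default:
			return nil, fmt.Errorf("unknown instruction %q", line)
		}
	}

	// Every label is now known, so patch the placeholders.
	for offset, name := range fixups {
		addr, ok := labels[name]
		if !ok {
			return nil, fmt.Errorf("undefined label %q", name)
		}
		code[offset] = byte(addr & 0xff)          // low byte (order is an assumption)
		code[offset+1] = byte((addr >> 8) & 0xff) // high byte
	}
	return code, nil
}

func main() {
	code, err := assemble([]string{"jmp end", "nop", ":end", "exit"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", code) // prints "10 04 00 50 00"
}
```

Recording the offset of each placeholder lets forward references such as `jmp end` be resolved without a second pass over the source.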
You can use the dump command to see the structure the lexer generates:
$ go.vm dump ./examples/hello.in
{STORE store}
{IDENT #1}
{, ,}
{STRING Hello, World!
}
{PRINT_STR print_str}
{IDENT #1}
{EXIT exit}
The interpreter
The core of the interpreter is located in the file cpu.go and is as simple and naive as you would expect. There are some supporting files in the same directory:
- register.go
  - The implementation of the register-related functions.
- stack.go
  - The implementation of the stack.
- traps.go
  - The implementation of the traps, to be described below.
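As a rough idea of what a stack helper for such a machine could look like, here is a minimal Go sketch. The `Stack` type and its methods are hypothetical and not taken from stack.go, and using the stack for call/ret return addresses is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Stack is a hypothetical LIFO of integers, e.g. return addresses.
type Stack struct {
	entries []int
}

// Push adds a value to the top of the stack.
func (s *Stack) Push(v int) { s.entries = append(s.entries, v) }

// Pop removes and returns the top value, or an error if the stack is empty.
func (s *Stack) Pop() (int, error) {
	if len(s.entries) == 0 {
		return 0, errors.New("pop from an empty stack")
	}
	last := len(s.entries) - 1
	v := s.entries[last]
	s.entries = s.entries[:last]
	return v, nil
}

// Empty reports whether the stack holds no values.
func (s *Stack) Empty() bool { return len(s.entries) == 0 }

func main() {
	var s Stack
	s.Push(0x0010)       // "call": remember where to return to
	addr, err := s.Pop() // "ret": resume at the saved address
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("0x%04x\n", addr) // prints "0x0010"
}
```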
Changes
Compared to the original project there are two main changes:
- The DB/DATA operation allows storing string data directly in the generated bytecode.
- There is a notion of traps.
  - Rather than defining opcodes for complex tasks it is now possible to call back into the CPU-emulator to do work.
DB/DATA Changes
For example, in the simple.vm project this is possible:
DB 0x01, 0x02,
But this is not:
DB "This is a string, with terminator to follow"
DB 0x00
go.vm supports this, and it is demonstrated in examples/peek-strlen.in.
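One way to picture string support in DB/DATA is a helper that expands each operand into raw bytes. The Go sketch below is an assumption-laden illustration, not the compiler's actual code: `encodeDB` is a hypothetical name, and it assumes operands arrive already split, with strings still wrapped in double quotes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeDB is a hypothetical helper: it turns the operands of a DB/DATA
// line into raw bytes. Quoted operands are emitted byte by byte; other
// operands are parsed as single-byte integers such as 0x00.
func encodeDB(operands []string) ([]byte, error) {
	var out []byte
	for _, op := range operands {
		if strings.HasPrefix(op, `"`) {
			s, err := strconv.Unquote(op)
			if err != nil {
				return nil, fmt.Errorf("bad string %s: %w", op, err)
			}
			out = append(out, s...)
			continue
		}
		n, err := strconv.ParseUint(op, 0, 8) // base 0 accepts the 0x prefix
		if err != nil {
			return nil, fmt.Errorf("bad byte %q: %w", op, err)
		}
		out = append(out, byte(n))
	}
	return out, nil
}

func main() {
	data, err := encodeDB([]string{`"Hi"`, "0x00"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", data) // prints "48 69 00"
}
```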
Traps
The instruction int can be used to call back to the emulator to do some work on behalf of a program. Only two traps are defined right now:
- int 0x00
  - Set register #0 to the length of the string in register #0.
- int 0x01
  - Set register #0 to a string entered by the user.
  - See examples/trap.stdin.in.
Adding your own trap-functions should be as simple as editing cpu/traps.go.
Steve