GitHub - wader/jqjq: jq implementation of jq
source link: https://github.com/wader/jqjq
jq implementation of jq
Warning: this project is mostly for learning, experimenting and fun.
Why? It started when I was researching how to write decoders directly in jq for fq, which ended up involving some syntax tree rewriting and walking, and then it grew from there.
But it's also a great way to promote and show that jq is a very expressive, capable and nice language! :)
Use via jqjq wrapper
$ ./jqjq -n 'def f: 1,8; [f,f] | map(.+105) | implode'
"jqjq"
$ ./jqjq '.+. | map(.+105) | implode' <<< '[1,8]'
"jqjq"
# jqjq using jqjq to run above example
# eval concatenation of jqjq.jq as a string and example
$ ./jqjq "eval($(jq -Rs . jqjq.jq)+.)" <<< '"eval(\"def f: 1,8; [f,f] | map(.+105) | implode\")"'
"jqjq"
$ ./jqjq --repl
> 1,2,3 | .*2
2
4
6
> "jqjq" | explode | map(.-32) | implode
"JQJQ"
> "jqjq" | [eval("explode[] | .-32")] | implode
"JQJQ"
> ^D
# 01mf02 adaptation of itchyny's bf.jq running fib.bf
$ ./jqjq -n "\"$(cat fib.bf)\" | $(cat bf.jq)"
"1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233"
$ ./jqjq -h
jqjq - jq implementation of jq
Usage: jqjq [OPTIONS] [--] [EXPR]
  --jq PATH        jq implementation to run with
  --lex            Lex EXPR
  --no-builtins    No builtins
  --null-input,-n  Null input
  --parse          Lex and parse EXPR
  --repl           REPL
  --run-tests      Run jq tests from stdin
Use with jq
$ jq -n -L . 'include "jqjq"; eval("def f: 1,8; [f,f] | map(.+105) | implode")'
"jqjq"
$ jq -L . 'include "jqjq"; eval("(.+.) | map(.+105) | implode")' <<< '[1,8]'
"jqjq"
Run tests
make test
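The usage examples earlier all print "jqjq" because 106 and 113 are the codepoints of j and q. As a minimal sketch, the same pipeline can be traced one stage at a time with plain jq. This assumes a jq binary on PATH; the step1, step2 and step3 variable names are only for illustration and are not part of jqjq.

```shell
# Trace the example expression stage by stage using plain jq (not jqjq).

# f emits 1 and then 8, so [f,f] collects both outputs twice.
step1=$(jq -nc 'def f: 1,8; [f,f]')

# Adding 105 to each element gives the codepoints of "j" (106) and "q" (113).
step2=$(jq -nc 'def f: 1,8; [f,f] | map(.+105)')

# implode turns the array of codepoints back into a string.
step3=$(jq -nc 'def f: 1,8; [f,f] | map(.+105) | implode')

printf '%s\n' "$step1" "$step2" "$step3"
# prints:
# [1,8,1,8]
# [106,113,106,113]
# "jqjq"
```

Passing the same expression strings to ./jqjq -n instead of jq -nc is how the wrapper examples above run them.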
Progress
- Scalar literals: 123, .123, 1.23, 1.23e2, 1.23e+2, "abc", true, false, null
  - Unicode codepoint escape: "\ud83d\ude03"
  - Control code and quote escape: "\"\n\r\t\f\b\\\/"
  - Unicode codepoint escape
- Object literal: {key: "value"}
  - {key}
  - {"key"}
  - {$key}
  - {(...): ...}
  - Multiple key/value outputs: {("a","b"): (1,2), c: 2}
  - {"\()"}
  - Multi value queries: {key: 1 | .}
- Array literal, collect: [1,2,3]
- Comma operator: 1, 2
- Pipe operator: 1 | 2
- Arithmetic operators: +, -, *, /, %
- Unary operators: +123, -1
- Comparison operators: ==, !=, <, <=, >, >=
- Binding: 123 as $a | ...
  - Binding per output: (1,2,3) as $a | ...
  - Destructuring binding: {a: [123]} as {a: [$v]}
- Identity: .
- Index: .key[123]."key"[f]
  - Simple index: .a, .["a"]
  - ."key"
  - Multi index: .a.b
  - Optional index: .a?
  - Iterate index: .a[]
- Iterate: .[]
- Try iterate: .[]?
- Array slicing: .[start:stop], .[:stop], .[start:]
  - Slice using object: .[{start: 123, stop: 123}]
  - Slice and path tracking: path(.[1:2]) -> [{"start":1,"end":2}]
- and, or operators
- not operator
- Conditional: if f then 2 else 3 end
  - Optional else: if f then 2 end
  - Else if clauses: if f then 2 elif f then 3 end
  - Multiple condition outputs: if true,false then "a" else "b" end
- Reduce output: reduce f as $a (init; update)
- Foreach output, update state and output extracted value: foreach f as $a (init; update; extract)
  - Optional extract
- Assignment: f = v
- Update assignment: f |= v, f +=
- Arithmetic update assignment: +=, -=, *=, /=, %=
- eval($expr)
- input, inputs
- Builtins / standard library
  - del(f)
  - add
  - all, all(cond), all(gen; cond)
  - any, any(cond), any(gen; cond)
  - debug (passthrough)
  - delpaths($paths) (passthrough)
  - empty (passthrough)
  - endswith($s)
  - error($v) (passthrough)
  - error (passthrough)
  - explode (passthrough)
  - first(f)
  - first
  - flatten, flatten($depth)
  - from_entries
  - fromjson (passthrough)
  - getpath(path) (passthrough)
  - group, group_by(f)
  - has($key) (passthrough)
  - implode (passthrough)
  - isempty
  - join($s)
  - last(f)
  - last
  - length (passthrough)
  - limit($n; f)
  - map(f)
  - max, max_by(f)
  - min, min_by(f)
  - nth($n; f); nth($n)
  - range($to), range($from; $to), range($from; $to; $by)
  - recurse, recurse(f)
  - repeat
  - reverse
  - scalars
  - select(f)
  - setpath (passthrough)
  - sort, sort_by(f)
  - startswith($s)
  - to_entries
  - tojson (passthrough)
  - tonumber (passthrough)
  - tostring (passthrough)
  - match($regex; $flags) (passthrough)
  - match($val)
  - gsub($regex; f) (passthrough)
  - gsub($regex; f; $flags)
  - transpose
  - type (passthrough)
  - unique, unique_by(f)
  - until(cond; next)
  - while(cond; update)
  - with_entries
  - Math functions, sin/0, ... atan/2, ...
  - More...
- Function declaration: def f: .
  - Lambda argument: def f(lambda): lambda
  - Closure function: (def f: 123; f) | .
  - Local function: def f: def _f: 123; _f; f
  - Binding arguments: def f($binding): $binding
  - Recursion: def f: f;
- Output paths for f for input: path(f)
- Catch error: try f, try f catch .
- Empty shorthand catch: f?
- Recurse input: ..
- Alternative operator: //
- Alternative destructuring operator: ?//
- $ENV
- String interpolation: "\(f)"
- Format string: @format "string"
- Break out: label $out | break $out
- Include: include "f", import "f"
- Run jqjq with jqjq
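Several items in the progress list concern how generators (expressions with multiple outputs) interact with other constructs. The sketch below runs a few of the listed expressions with plain jq to show the reference behaviour. It assumes a jq binary on PATH, and the variable names are only illustrative. Whether ./jqjq -n gives the same output for each expression depends on the feature being implemented, which is not verified here.

```shell
# Binding per output: the body after "as $a |" runs once per bound value.
binding=$(jq -nc '[(1,2,3) as $a | $a * 2]')

# Destructuring binding: $v is bound to the first element of .a.
destructured=$(jq -nc '{a: [123]} as {a: [$v]} | $v')

# Multiple condition outputs: the branch is chosen once per condition output.
conditional=$(jq -nc '[if true,false then "a" else "b" end]')

# Multiple key/value outputs: 2 keys x 2 values give 4 objects.
# Only the count is shown, since the output order is not asserted here.
object_count=$(jq -nc '[{("a","b"): (1,2), c: 2}] | length')

# Slice and path tracking: the path of a slice is an object with
# "start" and "end" keys; extract both as a pair.
slice_bounds=$(jq -nc '[1,2,3] | path(.[1:2]) | .[0] | [.start, .end]')

printf '%s\n' "$binding" "$destructured" "$conditional" "$object_count" "$slice_bounds"
# prints:
# [2,4,6]
# 123
# ["a","b"]
# 4
# [1,2]
```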
jq's test suite
$ ./jqjq --run-tests < ../jq/tests/jq.test | grep passed
245 of 362 tests passed
Note that expected test values are based on stedolan's jq. If you run with a different jq implementation like gojq some tests might fail because of different error messages, support for arbitrary precision integers etc.
Design problems, issues and unknowns
- Better parser errors.
- The "environment" pass around is not very efficient and it also makes supporting recursion a bit awkward (the called function is injected in the env at call time).
- The "," operator in jq (and gojq) is left associative, but with the way jqjq parses, the correct parse tree is created when it is right associative. Don't know why.
- Suffix with multiple [] outputs values in wrong order.
- Non-associative operators like == should fail, ex: 1 == 2 == 3.
- Objects are parsed differently compared to gojq. gojq has a list of pipe queries, jqjq will only have one that might be a pipe op.
- Less "passthrough" piggyback on jq features:
  - reduce/foreach via recursive function? similar to if or {}-literal?
  - try/catch via some backtrack return value? change [path, value] to include an error somehow?
- How to support label/break?
- How to support delpaths (used by del etc). Have to keep paths same while deleting a group of paths? use sentinel value? work with paths instead?
- Rewrite AST before eval, currently if and some others do rewrite (optional parts etc) while evaluating.
- Rethink invalid path handling, currently [null] is used as sentinel value.
- {a:123} | .a |= empty should remove key.
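Two of the points above describe behaviour that can be checked against the reference jq. The sketch below is only an illustration of what jq itself does, assuming a jq binary on PATH; the chained_eq variable name is hypothetical and nothing here runs jqjq.

```shell
# Comparison operators are non-associative in jq's grammar, so chaining
# them is rejected at compile time and jq exits with a non-zero status.
if jq -n '1 == 2 == 3' >/dev/null 2>&1; then
  chained_eq="accepted"
else
  chained_eq="rejected"
fi
echo "1 == 2 == 3 is $chained_eq"

# Update-assignment with an empty right-hand side. Recent jq releases are
# expected to delete the path and print an object without the "a" key;
# older releases may print nothing, so no output is asserted here.
jq -nc '{a:123} | .a |= empty'
```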
Useful references
Tools and tricks
- Show jq byte code: jq -n --debug-dump-disasm '...'
- Show jq byte code run trace: jq -n --debug-trace=all '...'
- Pretty print debug messages: jq -n '{a: "hello"} | debug' 2> >(jq -R 'gsub("\u001b\\[.*?m";"") | fromjson' >&2)
- Run gojq in debug mode: GOJQ_DEBUG=1 go run -tags gojq_debug cmd/gojq/main.go -n '...'
- gojq parse tree for string: fq -n '".a.b" | _query_fromstring'
- jq expression string for gojq parse tree: fq -n '{...} | _query_tostring'
For a convenient jq development experience:
Thanks to
- stedolan for jq and for getting me interested in generator/backtracking based languages.
- pkoppstein for writing about jq and PEG parsing.
- itchyny for jqjq fixes and gojq, from which I learned a lot and which is also where most of jqjq's AST design comes from. Sharing AST design made it easier to compare parser output (ex via fq's _query_fromstring). gojq also fixes some confusing jq bugs and has better error messages, which saves a lot of time.
- Michael Färber @01mf02 for jaq, which is also where I learned about precedence climbing.
License
Copyright (c) 2022 Mattias Wadman
jqjq is distributed under the terms of the MIT License.
See the LICENSE file for license details.