A Faster Alternative to Jq

347 points - today at 7:12 AM

Comments

regus today at 1:43 PM
Jq's syntax is so arcane I can never remember it and always need to look up how to get a value from simple JSON.
1a527dd5 today at 9:20 AM
I appreciate performance as much as the next person, but I see this endless battle to measure things in ns/us/ms as performative.

Sure there are 0.000001% edge cases where that MIGHT be the next big bottleneck.

I see the same thing repeated in various front end tooling too. They all claim to be _much_ faster than their counterpart.

9 times out of 10, whatever tooling you are using now will be perfectly fine. For example, I use grep a lot in an ad hoc manner; on really large files I switch to rg. But that is only in a handful of cases.

Kovah today at 7:54 AM
I wonder so often about many new CLI tools whose primary selling point is their speed over other tools. Yet I personally have not encountered any case where a tool like jq feels incredibly slow, and I would feel the urge to find something else. What do people do all day that existing tools are no longer enough? Or is it that kind of "my new terminal opens 107ms faster now, and I don't notice it, but I simply feel better because I know"?
allknowingfrog today at 7:49 PM
I deal with a fair amount of newline-delimited JSON in my day job, where each line in the file is a complete JSON object. I've seen this referred to as "jsonl", and it's not entirely uncommon for logs and other kinds of time-series data dumps. Do any of the popular JSON CLI tools work with this format? I didn't see any mention of it here.
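
For anyone unfamiliar with the format: each line is parsed on its own, so a reader never needs the whole file in memory. A minimal Python sketch, assuming a hypothetical helper `iter_jsonl` and made-up log records (this says nothing about how any of the tools discussed here handle it):

```python
import io
import json

def iter_jsonl(stream):
    """Yield one parsed value per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report which line was malformed instead of failing silently.
            raise ValueError(f"line {line_number}: {exc}") from exc

# Made-up sample log; in practice pass an open file or sys.stdin.
sample = io.StringIO(
    '{"ts": 1, "level": "info", "msg": "start"}\n'
    '{"ts": 2, "level": "error", "msg": "boom"}\n'
)
errors = [rec["msg"] for rec in iter_jsonl(sample) if rec["level"] == "error"]
print(errors)
```

Because the generator yields records one at a time, filtering a multi-GiB log this way keeps memory use proportional to the longest line, not the file.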
hackrmn today at 8:51 AM
Having used `jq` and `yq` (which followed from the former, in spirit), I have never had to complain about the performance of the _latter_, which is an order of magnitude (or several) _slower_ than the former. So if there's something faster than `jq`, it's laudable that the author of the faster tool accomplished such a goal, but in the broader context I'd say the performance benefit would be required by a niche slice of the userbase. People who analyse JSON-formatted logs, perhaps? Then again, newline-delimited JSON reigns supreme in that particular kind of scenario, making the point of a faster `jq` moot again.

However, as someone who always loved faster software and being an optimisation nerd, hats off!

Jenk today at 12:31 PM
I switched to Jaq[0] a while back for the 'correctness' sake rather than performance. But Jaq also claims to be more performant than jq.

[0]: https://github.com/01mf02/jaq

Bigpet today at 7:42 AM
When initially opening the page it had broken colors in light mode. For anyone else encountering it: switch to dark mode and then back to light mode to fix it.
ifh-hn today at 8:24 AM
I learned a number of data processing CLI tools: jq, mlr, htmlq, xsv, and yq, to name a few. Not to the level of completing Advent of Code or anything, but good enough for my day-to-day usage. It was never-ending, with the number of formats I needed to extract data from and the different syntaxes. All that changed when I found nushell, though; it's replaced all of these tools for me. One syntax for everything, a breath of fresh air!
jiehong today at 9:18 AM
First of all, congratulations! Nice tool!

Second, some comments on the presentation: the horizontal violin graphs are nice, but all tools have the same colours, and so it's just hard to even spot where jsongrep is. I'd recommend grouping by tool and colour coding it. Besides, jq itself isn't in the graphs at all (but the title of the post made me think it would be!).

Last, xLarge is a 190MiB file. I was surprised by that. It seems too small for xLarge. I check 400MiB JSON documents daily, and sometimes GiB ones.

throwawaypath today at 2:23 PM
After reading the title, I was worried that this wasn't written in Rust!
maxloh today at 8:33 AM
From their README [0]:

> Jq is a powerful tool, but its imperative filter syntax can be verbose for common path-matching tasks. jsongrep is declarative: you describe the shape of the paths you want, and the engine finds them.

IMO, this isn't a common use case. The comparison here is essentially like Java vs Python. Jq is perfectly fine for quick peeking. If you actually need better performance, there are always faster ways to parse JSON than using a CLI.

[0]: https://github.com/micahkepe/jsongrep

vindin today at 1:15 PM
The data viz of the benchmarks is really rough. I think you’d get a lot of leverage out of rebuilding it and using colors and/or shapes to extract additional dimensions. Nobody wants to scan through raw file paths as labels to try and figure out what the hell the results are
onedognight today at 10:03 AM
Having the equivalent jq expression in these examples might help to compare expressiveness, and it might help me see if jq could “just” use a DFA when a (sub)query admits one. grep, ripgrep, etc change algorithms based on the query and that makes the speed improvements automatic.
Asmod4n today at 11:19 AM
You could just take simdjson, use its ondemand api and then navigate it with .at_path(_with_wildcard) (https://github.com/simdjson/simdjson/blob/master/doc/basics....)

The whole tool would be like a few dozen lines of c++ and most likely be faster than this.

ontouchstart today at 12:37 PM
Everything that can be written in JavaScript will be written in JavaScript.

Everything that can be rewritten in Rust will be rewritten in Rust.

hilti today at 3:21 PM
I'm glad you adjusted the CSS while I was typing my comment. I needed to switch to dark mode to be able to read highlighted words.

Nice write up. I will try out your tool.

1vuio0pswjnm7 today at 7:02 PM
One problem I have not seen addressed by jq or alternatives, perhaps this one addresses it, is "JSON-like" data. That is, JSON that is not contained in a JSON file

For example, web pages sometimes contain inline "JSON". But as this is not a proper JSON file, jq-style utilities cannot process it

The solution I have used for years is a simple utility written in C using flex^1 (a "filter") that reformats "JSON" on stdin, regardless of whether the input is a proper JSON file or not, into stdout that is line-delimited, human-readable and therefore easy to process with common UNIX utilities

The size of the JSON input does not affect the filter's memory usage. Generally, a large JSON file is processed at the same speed with the same resource usage as a small one
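
A rough Python sketch of the same idea, for illustration only: it is not the flex filter described above, and unlike that filter it reads the whole input into memory. The helper names `extract_json_values` and `flatten`, the sample page, and the `path = value` output format are all assumptions. It scans arbitrary text for embedded JSON objects or arrays and prints one line per scalar:

```python
import json

def extract_json_values(text):
    """Yield every JSON object or array found embedded in arbitrary text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # Next position where an object or array could start.
        starts = [i for i in (text.find("{", pos), text.find("[", pos)) if i != -1]
        if not starts:
            return
        start = min(starts)
        try:
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not JSON here (e.g. CSS braces); keep scanning
            continue
        yield value
        pos = end

def flatten(value, path=""):
    """Yield one 'path = scalar' line per leaf, easy to grep or cut."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield f"{path} = {json.dumps(value)}"

# Made-up page with inline JSON and some bracketed text that is not JSON.
html = ('<script>window.__DATA__ = {"user": {"name": "Ada"}, '
        '"tags": ["a", "b"]};</script><p>[not json]</p>')
lines = [line for value in extract_json_values(html) for line in flatten(value)]
print("\n".join(lines))
```

The output is line-oriented, so it can be piped into grep, cut, or awk like any other text.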

The author here has provided musl static-pie binaries instead of glibc. HN commenters seeking to discredit musl often claim glibc is faster

Personally I choose musl for control not speed

1. jq also uses flex

Voranto today at 12:25 PM
Quick question: Isn't the NFA-to-DFA construction an O(2^n) algorithm? If a JSON file has a couple hundred values, its equivalent NFA will have a similar number of states, and the DFA will have 2^100 states, so I must be missing something.
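
For reference, the 2^n figure is the worst case of subset construction, where each DFA state is a set of NFA states; only subsets actually reachable from the start state get built. A minimal Python sketch with a hypothetical three-state NFA over {a, b} that accepts strings ending in "ab" (an illustration of the textbook algorithm, not jsongrep's implementation):

```python
from collections import deque

def nfa_to_dfa(start, accepting, transitions, alphabet):
    """Subset construction for an NFA without epsilon moves.

    transitions maps (state, symbol) -> set of next NFA states.
    Each DFA state is a frozenset of NFA states.
    """
    dfa_start = frozenset([start])
    dfa_transitions = {}
    dfa_accepting = set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for symbol in alphabet:
            target = frozenset(
                nxt
                for state in current
                for nxt in transitions.get((state, symbol), ())
            )
            if not target:
                continue  # no move on this symbol: implicit dead state
            dfa_transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_start, dfa_accepting, dfa_transitions, seen

# Hypothetical NFA: state 0 loops on everything, guesses the final "ab".
nfa_transitions = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa_start, dfa_accepting, dfa_transitions, dfa_states = nfa_to_dfa(
    start=0, accepting={2}, transitions=nfa_transitions, alphabet="ab"
)
print(len(dfa_states), "reachable DFA states out of", 2 ** 3, "possible subsets")
```

Here only 3 of the 8 possible subsets are reachable; the exponential blow-up needs NFAs built specifically to trigger it.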
enricozb today at 10:18 AM
I am excited for some alternative syntax to jq's. I haven't given much thought to how I'd write a new JSON query syntax if I were writing things from scratch, but I personally never found the jq syntax intuitive. Perhaps I haven't given it enough effort to learn properly.
mlmonkey today at 3:42 PM
Minor suggestion: often I just want to extract one field, whose name I know exactly. I see that `jg` has an option `-F` like this:

$ cat sample.json | jg -F name

I would humbly suggest that a better syntax would be:

$ cat sample.json | jg .name

for a leaf node named "name"; or

$ cat sample.json | jg -F .name.

for any node named "name".
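
The difference between the two forms, sketched in Python. The function name `find_named`, the `leaves_only` flag, and the sample document are hypothetical, "leaf" is read here as a scalar value, and none of this reflects how `jg` works internally:

```python
def find_named(node, key, leaves_only=False):
    """Yield every value stored under `key`, at any depth.

    With leaves_only=True, skip matches whose value is an object or array.
    """
    if isinstance(node, dict):
        for name, value in node.items():
            is_leaf = not isinstance(value, (dict, list))
            if name == key and (is_leaf or not leaves_only):
                yield value
            # Keep descending: nested objects may contain the key again.
            yield from find_named(value, key, leaves_only)
    elif isinstance(node, list):
        for item in node:
            yield from find_named(item, key, leaves_only)

# Made-up document where "name" appears both as a leaf and as an object.
doc = {
    "name": "root",
    "child": {"name": {"first": "Ada", "name": "inner"}},
    "items": [{"name": "x"}],
}
leaves = list(find_named(doc, "name", leaves_only=True))
all_nodes = list(find_named(doc, "name"))
print(leaves)
print(all_nodes)
```

The leaf-only query returns three strings, while the any-node query additionally returns the nested object stored under `child.name`.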

sirfz today at 12:11 PM
Nowadays I'd just use clickhouse-local / chdb / duckdb to query json files (and pretty much any standard format files)
jrhey today at 7:07 PM
Since when was jq considered slow?
tehnub today at 2:47 PM
I've been using jj, which apparently is also faster than jq https://github.com/tidwall/jj
bouk today at 9:56 AM
I highly recommend anyone to look at jq's VM implementation some time, it's kind of mind-blowing how it works under the hood: https://github.com/jqlang/jq/blob/master/src/execute.c

It does some kind of stack forking which is what allows its funky syntax

steelbrain today at 8:32 AM
Surprised to see that there are no official binaries for arm64 darwin, meaning macOS users will have to run it through the Rosetta 2 translation layer.
luc4 today at 11:31 AM
Since the query compilation needs exponential time, I wonder how large the queries can be before jsongrep becomes slower than all the other tools. In that regard, I think the library could benefit from some functionality for query compilation at compile-time.
arjie today at 12:54 PM
Thank you. Very cool. Going to try embedding this into my JSON viewer. One thing I’ve struggled with is that live querying in the UI is constrained by performance.
wolfi1 today at 9:57 AM
forgive me my rant, but when I see "just install it with cargo" I immediately lose interest. How many GB do I have to install just to test a little tool? sorry, not gonna do that
keysersoze33 today at 7:57 AM
I was a bit skeptical at first, but after reading more into jsongrep, it's actually very good. Only did a very quick test just now, and after stumbling over slightly different syntax to jq, am actually quite impressed. Give it a try
rswail today at 12:00 PM
Just about to read, but I had to change to dark mode to be able to see the examples, which are bold white on a white background.
stuaxo today at 10:15 AM
Nice.

Some bits of the site are hard to read: in "takes a query and a JSON input", "query" is in white and the background of the site is very light, which makes it hard to read.

furryrain today at 8:37 AM
If it's easier to use than jq, they should sell the tool on that.
skywhopper today at 1:31 PM
If the author cares, I can’t read everything on this page. The command snippets have a “BASH” pill in the top left that covers up the command I’m supposed to run. And then there are, I guess topic headings or something that are white-on-white text, so honestly I don’t know what they say or what they are.
coldtea today at 9:20 AM
Speed is good! Not a big fan of the syntax though.
alexellisuk today at 1:01 PM
Quick comment for the author.

Just added this new tool to arkade, along with the existing jq/yq.

No Arm64 for Darwin.. seriously? (Only x86_64 darwin.. it's a "choice")

No Arm64 for Linux?

For Rust tools it's trivial to add these. Do you think you can do that for the next release?

https://github.com/micahkepe/jsongrep/releases/tag/v0.7.0

PUSH_AX today at 10:30 AM
Is Jq slow?
peterohler today at 12:17 PM
Another alternative is oj, https://github.com/ohler55/ojg. I don't know how the performance compares to jq or any others but it does use JSONPath as the query language. It has a few other options for making nicely formatted JSON and colorizing JSON.
quotemstr today at 8:47 AM
Reminder you can also get DuckDB to slurp the JSON natively and give you a much more expressive query model than anything jq-like.
silverwind today at 10:17 AM
Effort would be better invested in making `jq` itself faster.
marxisttemp today at 10:38 AM
Many Useless Uses of cat in this documentation. You never need to do `cat file | foo`, you can just do `<file foo`. cat is for concatenating inputs, you never need it for a single input.
adastra22 today at 9:03 AM
The fastest alternative to jq is to not use JSON.