JSON is everywhere these days, and perhaps like me you find yourself writing shell scripts that need to pull a value out of a JSON file.
jq is a command-line utility that makes this straightforward. It's a popular tool that's supported on pretty much every Unix platform and comes prepackaged on some OSs (not on OS X unfortunately, but installing it is trivial with Homebrew: brew install jq).
Of course there are other ways to pull data out of JSON in shell: you could use sed or awk, but most of these solutions tend to be a little on the ugly side because those tools are not tailored for JSON.
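To make the contrast concrete, here is a minimal sketch that pulls the same field out of a small JSON document with sed and with jq. The document and the regular expression are made up for illustration; the point is only that a text pattern has no notion of JSON escaping, while jq parses the document.

```shell
# Made-up sample document; the title contains escaped quotes.
json='{"title": "Hello, \"world\"", "by": "someone"}'

# sed: match the raw text with a regular expression (illustrative pattern).
with_sed=$(printf '%s\n' "$json" | sed -E 's/.*"title": *"([^"]*)".*/\1/')

# jq: parse the JSON and read the property.
with_jq=$(printf '%s\n' "$json" | jq -r '.title')

echo "sed: $with_sed"
echo "jq:  $with_jq"
```

With this input the sed pattern stops at the first escaped quote and prints Hello, \ while jq prints Hello, "world".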
To give you a taste of jq, let's write a shell script that grabs the top 10 Hacker News stories and displays the title, author and URL.
TOP_10_STORIES=$(curl -s https://hacker-news.firebaseio.com/v0/topstories.json | jq -r '.[0:10] | .[]')
for i in $TOP_10_STORIES
do
  STORY=$(curl -s https://hacker-news.firebaseio.com/v0/item/$i.json)
  echo "Title: $(echo $STORY | jq -r .title)"
  echo "User: $(echo $STORY | jq -r .by)"
  echo "User: $(echo $STORY | jq -r .url)"
  echo "------------"
done
That's it! Pretty short, huh? OK, let's break this down.
We start by curling the Hacker News topstories endpoint, which returns a JSON array of story ids that looks something like this:
[
  19801708,
  19805306,
  19806188,
  19810163,
  19803783,
  19802914
  // ...
]
First we hand this blob of JSON off to jq by piping (|) it to our jq command, which looks like this: jq -r '.[0:10] | .[]'. The -r flag tells jq to print strings raw, without the surrounding quotation marks. .[0:10] is a range filter that slices the array, giving us the first 10 elements. The result looks like this:
[19801708,19805306,19806188,19810163,19803783,19802914,19810901,19804922,19785500,19802137]
The problem is that jq does not strip out the square brackets, so if we iterate over this output we will end up looping over the brackets as well. To "unwrap" the values we must pipe the output of the range filter to another filter, .[], which emits each element of the array as a separate value.
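The difference between the sliced array and the unwrapped values is easy to check offline. The sketch below uses a short made-up array in place of the real topstories response, and adds the -c flag (compact output) so that the sliced array prints on a single line.

```shell
# Made-up ids standing in for the topstories response.
ids='[101,102,103,104,105]'

# Slice only: the result is still a JSON array, brackets included.
sliced=$(printf '%s' "$ids" | jq -c '.[0:3]')

# Slice, then unwrap: one bare value per line, ready for a shell loop.
unwrapped=$(printf '%s' "$ids" | jq -r '.[0:3] | .[]')

echo "$sliced"
echo "$unwrapped"
```

The first echo prints [101,102,103]; the second prints 101, 102 and 103 on separate lines.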
We can then safely iterate over each id:
TOP_10_STORIES=$(curl -s https://hacker-news.firebaseio.com/v0/topstories.json | jq -r '.[0:10] | .[]')
for i in $TOP_10_STORIES
do
  # do something with each $i
done
For each id we make another curl request to retrieve the details for the item. The JSON payload returned to us looks like this:
{
  "by": "dhouston",
  "descendants": 71,
  "id": 8863,
  "kids": [
    8952,
    9224,
    8917
    // ..
  ],
  "score": 111,
  "time": 1175714200,
  "title": "My YC app: Dropbox - Throw away your USB drive",
  "type": "story",
  "url": "http://www.getdropbox.com/u/2/screencast.html"
}
All that's left now is to print out the properties we're interested in. Extracting a property with jq is super simple: jq -r .myProperty. So that we don't have to make the same request for each property, we assign the curl output to a variable:
STORY=$(curl -s https://hacker-news.firebaseio.com/v0/item/$i.json)
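To try the property lookup without hitting the network, the sketch below assigns a trimmed copy of the sample payload to STORY by hand and reads .title with and without -r. It passes the variable through printf with quotes, which is an assumption of this sketch rather than what the script does; quoting keeps the shell from splitting or glob-expanding the JSON text.

```shell
# Trimmed copy of the sample item, assigned by hand instead of fetched with curl.
STORY='{"by": "dhouston", "id": 8863, "title": "My YC app: Dropbox - Throw away your USB drive"}'

# Without -r the string is printed as JSON, quotes included.
quoted=$(printf '%s' "$STORY" | jq '.title')

# With -r the string is printed raw.
raw=$(printf '%s' "$STORY" | jq -r '.title')

echo "$quoted"
echo "$raw"
```

The first echo prints the title wrapped in double quotes, the second prints it bare.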
Next we create our little property getter expression, echo $STORY | jq -r .title, and interpolate it into our prettified string, echoing it out for the user and repeating for each property:
echo "Title: $(echo $STORY | jq -r .title)"
echo "User: $(echo $STORY | jq -r .by)"
echo "User: $(echo $STORY | jq -r .url)"
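As an alternative sketch, not what the script above does, jq's string interpolation (\(...) inside a string) can build all three lines in a single jq call, and the // operator can supply a fallback when a field is missing or null. The format_story function name, the URL label and the n/a placeholder are illustrative choices; the sample item reuses values from the payload shown earlier.

```shell
# Hypothetical helper: reads one item as JSON on stdin and prints three labelled lines.
format_story() {
  jq -r '"Title: \(.title)\nUser: \(.by)\nURL: \(.url // "n/a")"'
}

# Sample item assigned by hand so the example runs offline.
STORY='{"by": "dhouston", "title": "My YC app: Dropbox - Throw away your USB drive", "url": "http://www.getdropbox.com/u/2/screencast.html"}'

summary=$(printf '%s' "$STORY" | format_story)
echo "$summary"
```

This trades three jq invocations per story for one. If an item has no url field, the last line reads URL: n/a instead of URL: null.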
The final output should look something like this:
Title: Twisted graphene has become the big thing in physics
User: furcyd
User: https://www.quantamagazine.org/how-twisted-graphene-became-the-big-thing-in-physics-20190430/
------------
Title: Huge study finds drugs stop HIV transmission
User: ahakki
User: https://www.theguardian.com/society/2019/may/02/end-to-aids-in-sight-as-huge-study-finds-drugs-stop-hiv-transmission
------------
Title: New physics needed to probe the origins of life
User: headalgorithm
User: https://www.nature.com/articles/d41586-019-01318-z
------------
Title: Verizon reportedly seeking to sell Tumblr
User: doppp
User: https://techcrunch.com/2019/05/02/verizon-reportedly-seeking-to-sell-tumblr/