jq and SPARQL to CSV

A growing number of museums are creating SPARQL endpoints to release their data, and it’s a great way for researchers to build custom datasets for reuse in their own work.

Most of these services will only return results in XML or JSON formats, when all you were really looking for was a CSV table!
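
If you have not yet downloaded the JSON itself, many endpoints will hand it to you via content negotiation. A quick sketch with curl (the endpoint URL and query here are placeholders for your own):

curl -H "Accept: application/sparql-results+json" \
  --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 100" \
  https://example.org/sparql > sparql.json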

jq to the rescue. This great command-line utility[1] for filtering and rewriting JSON files can also be put to work converting the JSON results from a SPARQL endpoint into a CSV that you can load into RAW, plot.ly, R, or whatever your data exploration tool of choice may be. Just run it like so:

jq -r '.head.vars as $fields | ($fields | @csv), (.results.bindings[] | [.[$fields[]].value] | @csv)' sparql.json > sparql.csv

jq will first write the vars array from the head of the response to the first line of the CSV, creating a table header. Next, it reads through each result in .results.bindings and adds a row to the CSV, leaving a blank for any variable that is unbound in a given result: indexing a missing binding returns null, which @csv renders as an empty field.
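
To make that concrete, here is a minimal, made-up response with two variables, followed by the CSV that the command above produces from it. Note how the second result, which has no binding for the date variable, yields a trailing empty field:

{
  "head": { "vars": ["artist", "date"] },
  "results": {
    "bindings": [
      {
        "artist": { "type": "literal", "value": "Rembrandt van Rijn" },
        "date": { "type": "literal", "value": "1642" }
      },
      {
        "artist": { "type": "literal", "value": "Johannes Vermeer" }
      }
    ]
  }
}

becomes:

"artist","date"
"Rembrandt van Rijn","1642"
"Johannes Vermeer",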

Warning: jq must read the entire JSON file into memory, so be careful if you are trying to process a multi-gigabyte file!
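
One workaround is to page through the results on the server rather than downloading them in one request. A rough sketch using SPARQL's LIMIT and OFFSET (the endpoint and query are again placeholders, and for stable paging you will generally also want an ORDER BY), converting each page with the jq command above:

for offset in 0 10000 20000; do
  curl -H "Accept: application/sparql-results+json" \
    --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } ORDER BY ?s LIMIT 10000 OFFSET ${offset}" \
    https://example.org/sparql > "page_${offset}.json"
done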

[1] The Programming Historian has a great tutorial on the command line.





