Images & Data from the Rijksmuseum via BitTorrent
(update 2015-10-20: I’ve reworked the file structure of this dump to reduce the torrent file size to 1.5 MB)
(update 2015-10-21: The Michelle Smith Collaboratory is now seeding a full copy of this dataset. The more people who download this and continue to seed it, the easier it will be for everyone to get a copy!)
Museum APIs are all the rage. They are wonderful for building web or mobile apps that present small selections of images on demand.
But they really suck at delivering bulk data. (Yeah, I’ve been a grump about this before.)
I’ve wasted days of my life cranking out scripts to do this for my dissertation research, and I’d rather it not all go to waste. So, I’ve assembled the JSON object data as well as all available web images as a torrent you can download here.
- The collections data for 515,802 objects is in one JSON file (1.7 GB uncompressed; I recommend jq for trawling it).
- The 218,442 images total about 164 GB, and average around 2,500 pixels on the longest side.
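If you'd rather stay in Python than reach for jq, a 1.7 GB file is too big to comfortably hand to json.load in one go. Here is a minimal sketch of reading it one object at a time with only the standard library. It rests on an assumption worth checking against the actual file: that the top level is a single JSON array whose elements are objects. The function name iter_json_array and the chunk_size parameter are made up for this illustration, and the tiny in-memory sample stands in for the real dump.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_array(stream: TextIO, chunk_size: int = 1 << 16) -> Iterator[dict]:
    """Yield the objects of a top-level JSON array one at a time.

    ASSUMPTION: the stream holds one JSON array of objects, e.g. [{...}, {...}].
    The separator handling is lenient and does not validate comma placement.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    started = False
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break  # nothing left to look at, read more input
            if not started:
                if buffer[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buffer = buffer[1:]
                started = True
            elif buffer[0] == ",":
                buffer = buffer[1:]  # skip the separator between elements
            elif buffer[0] == "]":
                return  # end of the array
            elif buffer[0] != "{":
                raise ValueError("expected a JSON object inside the array")
            else:
                try:
                    obj, end = decoder.raw_decode(buffer)
                except json.JSONDecodeError:
                    break  # object is probably incomplete, read more input
                yield obj
                buffer = buffer[end:]
        if not chunk:
            # read() returned nothing, so the file ended without a closing "]"
            raise ValueError("input ended before the array was closed")


# Tiny in-memory stand-in for the real file, read in deliberately small chunks.
sample = io.StringIO('[{"id": 1}, {"id": 2}, {"id": 3}]')
print(sum(1 for _ in iter_json_array(sample, chunk_size=8)))  # prints 3
```

For the real dump you would pass an opened file instead of the StringIO, something like open(path, encoding="utf-8") with whatever path the JSON file has on your machine. Memory use then stays at roughly one chunk plus one object, not the whole 1.7 GB.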
This dataset was developed using the Rijksmuseum’s API, with images from the Rijksmuseum Collection as downloaded in October 2015, so be aware that it won’t reflect later changes.
As for licensing, you’re in the clear:
All data and all images made available through the API are either in the public domain or are subject to a CC0 license. The data and images are royalty-free and may be copied, distributed, modified and used without the permission of the Rijksmuseum.