How Did They Make That? - Printmaking Networks
Inspired by Miriam Posner’s “How did they make that?” series!
The following are links to various software, services, and resources that I used during my dissertation research.
Data sources
- The Rijksmuseum: JSON-based API (Example output)
- The British Museum: Linked Open Data, accessible as bulk download as well as a SPARQL endpoint.
- Printed books (I know, old school!)
  - De Vries, Jan. European Urbanization: 1500-1800. Cambridge: Harvard University Press, 1984.
  - van der Waals, Jan. Prenten in de gouden eeuw: van kunst tot kastpapier. Rotterdam: Museum Boijmans Van Beuningen, 2006.
Data formats
- CSV
- JSON
- N-triples (Linked Open Data)
Command-line tools
- curl: download JSON from Rijksmuseum API
- parallel: run lots of curl calls at once, to download from the Rijksmuseum more efficiently
- jq: Parse JSON into CSV files
- fuseki: Graph database to store a local version of the British Museum LOD
- rsync: move data and scripts on and off of Digital Ocean servers
- pandoc: Turn text written in Markdown into PDF
Software
- RStudio: an integrated development environment for R
- Tabula: extracts tabular data from PDFs that have a text layer (scanned PDFs need to be OCRed first)
- Adobe Acrobat: OCRing PDFs (though this can also be done with open-source Tesseract)
- briss: Brilliant free little tool for cropping scanned PDFs — way more intuitive than Acrobat’s cropping tools.
- Excel: Yes, I use it. Easiest option to hand enter a table with just a few dozen rows
- Zotero: Not technically used for data analysis, but this is my go-to citation manager. I use it in combination with Better BibTeX for formatting all my citations via Markdown/LaTeX.
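To make the "parse JSON into CSV" step above concrete, here is a minimal sketch of the kind of flattening I ask jq to do. It is written in plain Python only so that it is self-contained; it is not the jq filter I actually used, and the record structure and field names (`records`, `id`, `title`, `makers`) are invented for illustration rather than taken from the Rijksmuseum API.

```python
import csv
import io
import json

# Hypothetical API response. The field names are invented for illustration
# and are NOT the Rijksmuseum's actual schema.
SAMPLE_JSON = """
{"records": [
  {"id": "obj-1", "title": "Landscape",
   "makers": [{"name": "Maker A"}, {"name": "Maker B"}]},
  {"id": "obj-2", "title": "Portrait",
   "makers": [{"name": "Maker C"}]}
]}
"""


def records_to_rows(payload):
    """Yield one flat row per (object, maker) pair from the nested JSON."""
    for record in payload["records"]:
        for maker in record.get("makers", []):
            yield {
                "id": record["id"],
                "title": record["title"],
                "maker": maker["name"],
            }


def to_csv(rows, fieldnames):
    """Serialize an iterable of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records_to_rows(json.loads(SAMPLE_JSON)), ["id", "title", "maker"])
print(csv_text)
```

The main design decision is what counts as a row. Here a nested list of makers is unrolled so that each object-maker pair gets its own line, which is usually the shape you want before building a network from the table.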
Languages
- SPARQL: A query language for Linked Open Data. Similar to SQL… but different.
- jq: Not really a language, but you need to learn how to tell jq to turn JSON into the type of table you want
- LaTeX: A rich language for formatting long documents. It is a beast, but still easier than using Word when you have hundreds of pages with sections, citations, images, and a persnickety style guide to follow.
- Markdown: Also not really a language, but an easy-to-use text markup system for writing documents.
Last, and most important:
- R: An open-source language designed for working with tabular data and statistical calculations. Vanilla R can be kind of weird, but the following packages make it shine:
R Packages
Example code from my dissertation.
- readr: reads in massive CSV/TSV files very quickly, and with the correct variable types (e.g. character, numeric, boolean)
- dplyr: filter, group, aggregate, join, and run operations on tabular data with easy-to-use syntax and impressive speed. Without exaggeration, this may be the most important extension ever written for R.
- tidyr: Transform between wide and narrow data tables (don’t worry, it’s a thing that starts to make sense once you begin to work with tabular data a lot)
- lubridate: seamlessly parses date strings written in many different formats
- stringr: string manipulation functions, including regular expression matching.
- igraph: Network analysis package (also available in python and C). This package constructs graphs from edge lists and offers a wide range of functions for measurement, simulation, and plotting.
- doParallel: Helps set up parallel R sessions so you can run multiple jobs at the same time, and collect all their results in one place.
- clipr: A little utility package I wrote for quickly sending R results to my clipboard for pasting elsewhere, such as Palladio.
- ggplot2: creates beautiful 2D plots for both screen and page.
- animation: Makes animated GIFs from ggplot2 plots.
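Since the wide-versus-narrow distinction mentioned under tidyr is easier to see than to describe, here is a small illustration of the reshaping itself. It is a plain Python sketch, not tidyr's API, and the table (cities, years, counts) is made-up toy data.

```python
# Hypothetical wide table: one row per city, one column per year.
# The values are toy numbers, not real data.
wide = [
    {"city": "City A", "1600": 10, "1650": 12},
    {"city": "City B", "1600": 20, "1650": 18},
]


def wide_to_narrow(rows, id_column, key_name, value_name):
    """Turn every non-id column into its own (id, key, value) row."""
    narrow = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on each output row instead
            narrow.append(
                {id_column: row[id_column], key_name: column, value_name: value}
            )
    return narrow


narrow = wide_to_narrow(wide, "city", "year", "count")
for row in narrow:
    print(row)
```

The wide form has two rows and three columns; the narrow form has four rows and three columns (city, year, count). The narrow form is generally the one that grouping, filtering, and plotting tools expect, because each row holds exactly one observation.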
Services
- Digital Ocean: cloud hosting service for quickly spinning up a lot of processors to run R jobs in parallel for relatively low $$$. This was the only software I actually had to directly pay for.
Books
- Hanneman, Robert A., and Mark Riddle. Introduction to Social Network Methods. Riverside: University of California, Riverside, 2005. http://faculty.ucr.edu/~hanneman/nettext/.
- Prell, Christina. Social Network Analysis: History, Theory and Methodology. Los Angeles: Sage, 2011.
- Arnold, Taylor, and Lauren Tilton. Humanities Data in R: Exploring Networks, Geospatial Data, Images, and Text. Cham: Springer, 2015.