Before the class

  1. Please sign in to the Discord server for the course (invitations are in your email).
  2. Please sign up for a free account on Miro, the virtual whiteboard system that we will be using extensively during the course. (Invite links to our Miro space are in your email.)
  3. Please follow the instructions to install OpenRefine on your own computer.
  4. There is a relatively short reading for each day of the course. As with all classes, it’ll be most helpful for you to do the reading before our discussion!

Technology

You’ll need to bring your own computer. Windows, Mac, and Linux are all fine.

I’ve tried hard to make sure that this class uses as little tech as possible. Almost all of our work will happen in apps that run in your internet browser; only OpenRefine needs to be installed beforehand.

However, Palladio and OpenRefine are not designed to work on a tablet. I strongly recommend that you use a laptop computer to get the most out of those exercises. You may also find them, and the Miro whiteboard app, much easier to use with a computer mouse (but this isn’t required).

What else do you want to talk about?

I’ve left flex time on Thursday to talk through issues or questions that aren’t covered in the other modules. At any time Monday through Wednesday, use the Open Questions Board to drop notes about concepts or questions you’d like to learn more about. Based on community voting, we’ll then hold some informal discussions or tutorials on Thursday.

Software Suggestions

Miro board to collect software suggestions for data tidying, management, and visualization

Location

All Tidy Data classes take place in Williams 616.

UPenn campus mini map

Monday, June 13

Reading

Rawson, Katie, and Trevor Muñoz. “Against Cleaning.” In Debates in the Digital Humanities 2019, edited by Matthew K. Gold and Lauren F. Klein, 279–92. University of Minnesota Press, 2019. DOI: 10.5749/j.ctvg251hk.26.

Schedule

Time Activity
9:00am-10:00am Dream Lab Welcome
10:00am-10:45am Tidy Data Orientation & Introductory Exercise
10:45am-11:00am Break
11:00am-12:00pm Data types (text, numbers, dates, categories, missing & uncertain info)
12:00pm-1:30pm Lunch
1:30pm-3:00pm Palladio
3:00pm-3:30pm Break
3:30pm-4:30pm Rawson & Muñoz 2019 discussion
5:15pm-6:45pm Keynote: “Digital Humanities, 1887/2087” by Whitney Trettien (Van Pelt Library, 6th Floor, Class of 1978 Pavilion)

Tuesday, June 14

Reading

Merry, Mark. Designing Databases for Historical Research. London: Institute of Historical Research, University of London, 2015. http://port.sas.ac.uk/mod/book/view.php?id=75. (Sections C, D & E)

Schedule

Time Activity
9:30am-10:30am Relational Data Design Intro and discussing Merry 2015
10:30am-10:45am Break
10:45am-12:00pm Data design exercise
12:00pm-1:30pm Lunch
1:30pm-3:00pm Intro to SQL (selecting; ordering; filtering; counting)
3:00pm-3:30pm Break
3:30pm-5:00pm More SQL (creating schemas; joins)

Wednesday, June 15

Reading

Posner, Miriam. “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” Miriam Posner’s Blog (blog), July 27, 2015. https://miriamposner.com/blog/whats-next-the-radical-unrealized-potential-of-digital-humanities/.

Schedule

Time Activity
9:30am-10:30am Posner 2015 discussion
10:30am-10:45am Break
10:45am-12:00pm Identifying data messes
12:00pm-1:30pm Lunch
1:30pm-3:00pm OpenRefine
3:00pm-3:30pm Break
3:30pm-5:00pm OpenRefine continued

Thursday, June 16

Reading

Langmead, Alison, and David Newbury. “Pointers and Proxies: Thoughts on the Computational Modeling of the Phenomenal World.” In The Routledge Companion to Digital Humanities and Art History, edited by Kathryn Brown, 358–73. Routledge Art History and Visual Studies Companions. New York: Routledge, 2020. DOI: 10.4324/9780429505188-31. (PDF)

Schedule

Time Activity
9:30am-10:15am Langmead & Newbury 2020 discussion
10:15am-10:30am Break
10:30am-12:00pm Data documentation & preservation
12:00pm-1:30pm Lunch
1:30pm-3:00pm Open session
3:15pm-5:00pm Dream Lab Wrap-up Session