Computer-Aided Metadata for Photo Archives Initiative
I haven’t gotten to think as much about visual culture in my current job as I did when I was at the Getty Research Institute. That’s one of the many reasons I was delighted to team up with our archivists Julia Corrin and Emily Davis (along with Scott Weingart, as always) to experiment this summer with computational interventions in managing the sprawling CMU General Photograph Collection.
“CAMPI”, or “Computer-Aided Metadata for Photo Archives Initiative” was a three-month long prototype sprint. Rather than build out a public production service or create a complete dataset - tasks too ambitious for our short timescale - we wanted to test out several workflows and report out those findings to the broader community. This report was our product, then. Aimed at informed archivists and GLAM technologists, we tried to describe our particular issues, our implementation-agnostic approach to solving them, and then our views on future needs and priorities for any organization that wants to try and implement similar kinds of workflows. Quoting from our executive summary:
- Supporting a fully-featured Digital Asset Management System (DAMS) that connects collection- level metadata to individual photographs and allows easy browsing of existing images and metadata is a crucial prerequisite to any computer-vision-related project in this domain.
- While generic automated image description performs poorly in the context of historical photo archives, combining generic visual search with existing collections metadata can greatly speed the item-level description work carried out by archivists and metadata experts.
- There is a field-wide need for specialized computer vision training sets based on historical, non- born-digital photograph collections. Interfaces such as this prototype can play an essential role in developing those human-tagged datasets that will power advances in both machine learning research as well as discovery and access in libraries, archives, and museums.
- User interface design for computer-vision-assisted metadata generation systems is just as important, if not more so, than creating ever-more advanced machine learning algorithms.
As with almost all our projects, I’ve published our implementation code for you to look at. As we make clear in the whitepaper and in our code documentation, this is not yet a codebase that can or should be freely re-used in other contexts. This three-month sprint was not enough to work out what the most useful abstractions and APIs would be that would generalize across different combinations of archival management systems, digital asset management systems, and controlled vocabularies. My code really just works for this batch of files as of now.
However, I think the overall architecture can be reused by others who may have different implementations, but the same basic functional systems and needs. I intentionally included an appendix to our report that walks through a high-level system architecture - a roadmap for future, more generalizable implementations. Thanks to helpful conversations on Twitter and with the Library of Congress, I already have quite a few different ideas for implementing visual search that would modify the overall architecture. But I think it remains a useful contribution for thinking about what computer vision capabilities as longer-term infrastructure, rather than localized projects, could look like for digitized cultural heritage collections.