Abstract - Background on the Russian invasion of Ukraine, as it pertains to libraries
Russia began its invasion of Ukraine on February 24, 2022. The invasion threatens more than people and their sovereignty: Ukrainian culture itself is under attack, and Russia's bombing targets specifically include Ukrainian libraries and other cultural heritage institutions.
Ukrainian advocacy groups have tried to encourage collecting at US libraries but have met resistance, and progress has been slow. Many US libraries don't know where to order Ukrainian books, which titles to choose, or how to catalog them, especially if they don't have Ukrainian speakers on staff.
This project aims to shed light on the current landscape of Ukrainian-language collecting. It is our first step toward identifying and overcoming barriers to building Ukrainian collections in US public libraries.
Bibliographic Data At Scale
How do we figure out where the books are? At first, we had the idea to go to WorldCat to check holdings, but the web interface didn't give us enough options for filtering and downloading book data.
So we went to the back end and built our data set with the same tools that catalogers use to find and download MARC records. Although research like this isn't common, we found examples of researchers using this approach to investigate the prevalence of publications related to Scotland, New Zealand, and Imperial China.
Data Workflow
Here's how our process worked:
- First, we searched the OCLC WorldShare Collection Manager to identify books in Ukrainian, which we then downloaded as MARC records.
- We then used MarcEdit to convert the records into Excel format so we could read the OCLC numbers we had collected.
- Third, we used the OCLC numbers to determine which libraries hold each book, using the WorldCat Search API. We were rate-limited at this step to 50,000 queries per day, which slowed progress: it took a few weeks to get a complete dataset.
- Using a script, we counted how many of these books each library holds to determine the size of its Ukrainian book collection.
- We then got library-type information and geographic coordinates using the WorldCat Registry API.
- After we combined all that information into a central dataset, we plotted it in Tableau to create an interactive map of libraries' Ukrainian book collections by size and type.
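The counting and combining steps above can be sketched in a few lines of Python. This is a minimal illustration, not our actual script: it assumes the holdings lookups have already been saved as a mapping from OCLC number to library symbols, and that registry details sit in a second mapping keyed by symbol. The function names, field names, and sample symbols are all hypothetical.

```python
from collections import Counter


def count_holdings(holdings):
    """Count how many distinct records each library holds.

    holdings: dict mapping an OCLC number to a list of library symbols
    (assumed shape; the real API responses would need parsing first).
    """
    counts = Counter()
    for symbols in holdings.values():
        # A set ensures a library listed twice for one record counts once.
        counts.update(set(symbols))
    return counts


def build_map_rows(counts, registry):
    """Join collection sizes with library type and coordinates.

    registry: dict mapping a library symbol to a dict with the
    hypothetical keys "type", "lat", and "lon".
    Returns (rows, missing), where missing lists symbols that have no
    registry entry and would need to be filled in by hand.
    """
    rows, missing = [], []
    for symbol, n_titles in counts.most_common():
        info = registry.get(symbol)
        if info is None:
            missing.append(symbol)
            continue
        rows.append({"symbol": symbol, "ukrainian_titles": n_titles, **info})
    return rows, missing


# Made-up sample data, for illustration only.
sample_holdings = {
    "1001": ["AAA", "BBB"],
    "1002": ["BBB", "BBB"],
    "1003": ["BBB", "C$1"],
}
sample_registry = {
    "AAA": {"type": "Public", "lat": 47.0, "lon": -122.0},
    "BBB": {"type": "Academic", "lat": 46.7, "lon": -117.2},
}
sample_counts = count_holdings(sample_holdings)
sample_rows, sample_missing = build_map_rows(sample_counts, sample_registry)
```

The resulting rows could be written to a CSV file and loaded into a mapping tool; the `missing` list corresponds to libraries whose details have to be looked up manually.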
Challenges Encountered
We came across a few issues during the project:
- We had trouble looking up some libraries because their OCLC identifier codes were too short or contained special characters. However, we were able to find some Python functions that could help parse them.
- We had to fill in location data for about 8 libraries by hand using Google Maps, and we also manually removed some libraries that were false-positive hits.
- Between the API query limits and the number of books we had to look up, getting the holdings data took almost two weeks. We had to re-work one API lookup script to run in batches on the Washington State University supercomputer so that we wouldn't lose everything to random restarts to install 'critical' updates.
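Two of these challenges lend themselves to a short sketch: keeping special characters in library identifiers intact when they go into a request URL, and running lookups in resumable batches under a daily quota. The code below is an illustration under stated assumptions, not the script we ran. The `fetch_holdings` callable stands in for whatever function performs the actual API request, and the checkpoint file format, function names, and `save_every` parameter are hypothetical.

```python
import json
from pathlib import Path
from urllib.parse import quote

DAILY_LIMIT = 50_000  # the per-day query limit described above


def encode_symbol(symbol):
    """Percent-encode a library symbol so characters such as '$' or '#'
    are not misread when the symbol is placed in a URL."""
    return quote(symbol, safe="")


def run_batch(oclc_numbers, fetch_holdings, checkpoint_path,
              limit=DAILY_LIMIT, save_every=100):
    """Look up holdings for OCLC numbers, resuming from a checkpoint file.

    oclc_numbers: iterable of OCLC numbers as strings (JSON keys are strings).
    fetch_holdings: hypothetical callable taking one OCLC number and
        returning a list of library symbols.
    Stops after `limit` new queries so one run stays within the daily quota.
    """
    path = Path(checkpoint_path)
    # Reload earlier results so an interrupted run does not start over.
    done = json.loads(path.read_text()) if path.exists() else {}
    used = 0
    for number in oclc_numbers:
        if number in done:
            continue  # already fetched in a previous run
        if used >= limit:
            break  # quota for this run is spent; resume later
        done[number] = fetch_holdings(number)
        used += 1
        if used % save_every == 0:
            path.write_text(json.dumps(done))  # periodic checkpoint
    path.write_text(json.dumps(done))  # final save for this run
    return done
```

Calling `run_batch` once per day with the same checkpoint path would pick up where the previous run stopped, so a restart costs at most the queries made since the last save.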
Next Steps
This map is a starting point for a few things. We want to do a deeper study of how libraries are overcoming the challenges of collecting Ukrainian-language materials, so we'll use the map to help us target a survey at libraries that are collecting well and at those that might need some help.
We hope this map can help Ukrainian advocacy groups in the US. It shows them where more books are needed and may help them direct local Ukrainian speakers to reading material.
We also hope this map can be a tool for community engagement. We'd love it if you would try it, let us know what you think, and give us suggestions.
Advocacy Resources
Acknowledgements
- WSU Center for Institutional Research Computing
- Sergey Lapin, Vice Chancellor for Research, WSU Everett
- OCLC
- Ukrainian Association of Washington State
Slidedeck
This is an edited version of the "Exploring Ukrainian Language Collections in US Libraries" presentation at the Association for Computers and the Humanities 2024 conference.