by Adam T. Beauchamp (Humanities Librarian, Florida State University Libraries)
Librarians in the modern American university rarely have time to curate their monograph collections by selecting books title-by-title. Busy engaging with our research and learning communities through outreach, instruction, and other activities, we rely on systems and workflows to organize and automate much of the acquisitions process. For decades that system has been the approval plan. Mirela Roncevic (2017) compares approval plans to the Sorting Hat in J. K. Rowling’s Harry Potter series. Just as the Sorting Hat channels students into the magical houses that best match — and can make use of — their personal qualities, so too does the approval plan classify and sort books to fit the similarly described needs of participating libraries.
Thousands of libraries, including Florida State University Libraries, use the approval plan maintained by GOBI Library Solutions. GOBI “profilers” apply detailed metadata to each book they receive and make it possible for libraries to describe their acquisitions needs in fine detail. However, this complex and powerful algorithm should not be taken for granted. As with any profiling system, the approval plan’s data infrastructure helps librarians build collections while also shaping what collections are possible. Thus, it requires regular assessment to ensure that it continues to meet our needs. At FSU Libraries, we use analytics to discover patterns in the use of our collections, make predictions about our future needs, and transform the way we build our monograph collection.
The Case of FSU Libraries
The 2018-2019 academic year was a time of transition for collection development at FSU Libraries. It had been three years since the last thorough review of our approval plan with GOBI Library Solutions. Since that review, monograph acquisitions had become both more complex and less transparent. FSU Libraries had implemented demand-driven acquisitions (DDA) and evidence-based acquisitions (EBA) programs and invested in two eBook subscription packages. However, our faculty had begun to notice gaps in coverage in our local holdings, putting increased pressure on subject librarians to spend time filling in those gaps with title-by-title selections. This situation, combined with personnel changes in key roles for collection development, meant it was time for a fresh look at the library’s acquisitions strategies.
Our review process, dubbed the Monograph Ordering Review Project (MORP), included three major steps. First, following many examples from the literature, we analyzed circulation, interlibrary loan, and total holdings data to better understand past monograph use and to predict our future needs (Ochola 2002; Mortimore 2006; Knievel, Wicht, and Connaway 2006; Bronicki et al. 2015). Second, we applied that understanding to redesigning our profile in the GOBI approval plan. Our goal was to improve subject coverage in key areas while minimizing the time librarians spent selecting books title-by-title. Finally, based on test data provided by GOBI, we analyzed the books that would have matched the revised profile had it been in effect the previous fiscal year to assess our predictions and make further refinements.
Our first major challenge in this process concerned the structure of our data. Our usage statistics needed to share at least one common data point with the GOBI approval plan. While GOBI’s rich proprietary metadata is invaluable in the profiling process, the circulation data available to us was limited to a subset of MARC record and patron data fields. And since not all books at FSU Libraries were purchased through the GOBI system, the only data point common to both our circulation data and the GOBI profile was the Library of Congress Classification (LCC) number.
We also found it necessary to limit our analysis to print book circulations. While the COUNTER standards that regulate eBook statistics normalize measures of access, the descriptive metadata attached to each eBook vary considerably across platforms and over time. Importantly, most of our eBook data did not include call numbers. We were also uncomfortable mixing eBook access clicks with print circulations, since these represent qualitatively different patron interactions with the materials. Although the library is currently entangled in multiple eBook acquisition plans, print books were still an important format during the years under analysis, especially for our monograph “power users” in the arts and humanities. We plan to conduct a separate assessment of the eBook environment at a later date.
Needing to craft a more precise approval profile, we analyzed four years of usage data not just at the level of LCC classes and subclasses, but at the level of the detailed call number ranges that also structure the GOBI approval profile. We applied this method to three types of usage: local print circulation, InterLibrary Loan (ILL) requests, and requests through UBorrow, a consortial borrowing program among the State of Florida’s public universities and colleges. Since each of these means of access requires a different amount of effort and know-how on the part of the patron, we considered them distinct measures and did not try to combine them into a single statistic.
For each type of use, we produced three distinct measures. First, we calculated the average number of annual circulations, interlibrary loans, and UBorrow requests for the four years under review. Using averages rather than raw numbers takes into account the effect unique course offerings may have on the use of particular subject areas in any given semester. Second, each of these counts was transformed into a percentage of total use of that type. For example, we had an average circulation of 671 books from the World War II section of the D call numbers, which amounts to 0.51% of total annual circulations averaged over four years. While this percentage is interesting, it is not sufficient for comparison across LC call number ranges, since these ranges vary widely in the number of books on the shelf and in the new publications available for purchase. To account for this, we divided the percent of each use type by the percent of total items on the shelves for that call number range, thereby controlling for collection size. Returning to our example of books about World War II, FSU Libraries had 7,887 total books in that call number range, which represents 0.68% of our total holdings. So we divided the percent of circulations, 0.51, by the percent of total holdings, 0.68, to arrive at a ratio of 0.75, or 3:4, meaning this section takes up a slightly larger share of shelf space than it does of total circulations, but on the whole seems well used relative to its size.
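The arithmetic above can be sketched as a small function. This is an illustration, not our actual workflow (which lived in Excel); the collection-wide totals below are back-calculated from the reported percentages and so are approximate.

```python
def range_measures(avg_uses, total_avg_uses, range_items, total_items):
    """For one call number range, return its percent of total use,
    its percent of total holdings, and the use-to-holdings ratio."""
    pct_use = 100 * avg_uses / total_avg_uses
    pct_holdings = 100 * range_items / total_items
    return pct_use, pct_holdings, pct_use / pct_holdings

# The World War II (D call numbers) example from the text; totals are
# back-calculated from the reported 0.51% and 0.68%, so approximate.
pct_use, pct_holdings, ratio = range_measures(
    avg_uses=671, total_avg_uses=131_569,
    range_items=7_887, total_items=1_159_853)
# ratio comes out to roughly 0.75, i.e. about 3:4
```

A ratio below 1 means the range occupies a larger share of the shelves than of circulation; a ratio above 1 means the reverse.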
We did not expect each call number range to achieve a balanced ratio of 1:1 uses to holdings, but rather used the ratios to compare across ranges and identify areas that seemed to be in particularly high or low demand relative to their size. On the spreadsheet we shared with all subject librarians, we color coded the ratios using percentiles to help draw the eye to areas in need of review (Figure 1). The same calculations were done for ILL and UBorrow requests, though in these cases a high ratio of use to collection size suggested inadequate local holdings, and the color coding was reversed to show these higher ratios in red.
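The percentile-based flagging could be sketched as follows. This is a hypothetical reconstruction: the range labels and ratio values (other than the World War II example) are invented, and the quartile cutoffs stand in for whatever percentile breaks the actual spreadsheet used.

```python
import statistics

def flag_ranges(ratios):
    """Flag each call number range by where its use-to-holdings ratio
    falls among all ranges: bottom quartile, top quartile, or in between."""
    q1, _, q3 = statistics.quantiles(ratios.values(), n=4)
    return {rng: ("review: low demand" if r <= q1
                  else "review: high demand" if r >= q3
                  else "typical")
            for rng, r in ratios.items()}

# Hypothetical ratios for a handful of ranges (D-WWII matches the text's example):
ratios = {"D-WWII": 0.75, "PR": 1.40, "QA": 2.10, "VA": 0.20, "ND": 0.95}
flags = flag_ranges(ratios)
```

For ILL and UBorrow the interpretation flips, as noted above: a high ratio signals demand the local collection is not meeting, so those flags would be reversed.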
Figure 1: Snippet of the Excel spreadsheet showing the four-year average of total items, three use types, and their respective percentages and ratios.
For step two of the review project, the MORP team provided training to subject librarians on how to read and interpret the admittedly complex dataset of percentages and ratios. Subject librarians also compared this evidence of monographic demand to the curricula and research strengths of their respective liaison areas. Then, in summer of 2019, Katy Ginanni, Collection Development Manager at GOBI, visited campus to help us make significant adjustments to our approval profile.
With the approval profile updated, however, our analytical approach to book acquisitions was not yet complete. Because our usage statistics relied on Library of Congress Classifications, the resulting picture of our needs was somewhat one-dimensional. The LCC system can only describe each book in terms of one subject discipline and topic. More importantly, the system itself is built around Eurocentric biases and idiosyncratic ways of organizing knowledge, which have the potential to isolate and exclude important works (Olson 2001; Drabinski 2019). For example, historian Jana Lipman’s 2008 book, Guantánamo: A Working-class History Between Empire and Revolution, would be of potential interest to a wide range of scholars and students, but the Library of Congress classifies it with Naval Science (call number VA 68). Libraries not generally interested in naval science topics would miss this book if they relied solely on call number ranges to make purchasing decisions.
The corrective to this limited perspective came in the final step of the project when we received test data from GOBI in fall 2019. The GOBI team ran our updated approval profile against their database of publications for the previous fiscal year. The resulting list of books included thousands of titles that we had not seen under our original profile. More importantly, this dataset also included more descriptive data than just the LCC call number ranges, allowing us to evaluate the results of our profile changes in great detail. As Jon Elwell, Director of Content Strategies at EBSCO, explains, “profilers” at GOBI with subject expertise add robust “metadata enhancements” to thousands of books every year, adding descriptions of quality, audience indicators, interdisciplinary categories, and more that go beyond Library of Congress Classification and subject headings (2019). With the clever use of pivot tables, subject librarians were able to analyze these data in creative ways, making further refinements to the approval profile while also making predictions about future spending and budget needs.
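The kind of pivot-table analysis described above can be sketched in code. Everything here is hypothetical: the column names ("lcc_class", "interdisc_category", "list_price") and the sample rows are invented stand-ins for the actual fields in GOBI's export, which we cannot reproduce.

```python
import pandas as pd

# Invented sample of a GOBI-style test dataset: each book carries an LCC
# class plus richer descriptors that cut across classification.
books = pd.DataFrame({
    "lcc_class": ["D", "D", "VA", "PR", "PR"],
    "interdisc_category": ["History", "Latin American Studies",
                           "Latin American Studies", "Literature", "History"],
    "list_price": [95.0, 110.0, 45.0, 80.0, 60.0],
})

# Title counts and projected spend by interdisciplinary category,
# regardless of where LCC shelves the books:
summary = pd.pivot_table(books, index="interdisc_category",
                         values="list_price", aggfunc=["count", "sum"])
```

A table like this is how a book such as Lipman's Guantánamo, shelved under Naval Science, still surfaces for a Latin American Studies selector: the pivot groups on the enhanced category rather than the call number.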
In the end, the Monograph Ordering Review Project (MORP) at FSU Libraries was a successful application of analytics to the assessment and reform of our approval plan. Thinking through our datasets in terms that applied to both our past usage statistics and future ordering helped us manage the scale of the work and apply consistent terms and measures throughout. We identified parts of the local collection in need of more development while also finding opportunities to deploy non-print monograph solutions, like publisher-sponsored EBA programs for large, popular imprints. The detailed analysis of print circulations may also help us plan for areas of increased growth while making strategic decisions about weeding and offsite storage solutions.
Analytics: A Closer Look
At FSU and at libraries of all sizes across the country, the GOBI approval plan drives the monograph identification and selection process that builds our collections. For many, the approval plan is essential for managing the more than 65,000 new English-language titles published every year (Elwell 2019, 58). But given the major role that the data structures and decision trees of this system play in shaping monograph acquisitions, librarians should not take the approval plan for granted. GOBI’s staff determines the universe of books brought into the system, decides which traits become metadata, and defines each book in those terms. Participating libraries must in turn describe their needs according to the categories and preset values of this system. It would be hard to build library collections without this infrastructure, but we must recognize how these structures determine what collections are possible.
In this context, it is worth revisiting Roncevic’s comparison of approval plans to the Sorting Hat from J. K. Rowling’s Harry Potter series. When placed on the head of each new student at Hogwarts School of Witchcraft and Wizardry, the Sorting Hat makes a judgment in order to assign students to one of four houses. The criteria and the process are magically opaque, but the Hat identifies each student’s qualities and potential and then matches those individual traits to the distinctive character — or profile — of each house. Readers of the series know that this supernatural algorithm has been working for centuries; however, the first time we see it in action, Harry Potter resists it. The Hat wants to place him in Slytherin House, but Harry rejects the Hat’s initial reading and asks to be placed in Gryffindor House instead. What the Hat “knows” about Harry makes him a good candidate for Slytherin, but based on Harry’s experiences and what he knows about himself, Harry makes a different choice.
At first glance, the stakes seem much higher for Harry Potter than for the books that flow through GOBI’s less magical sorting process. Sorting humans and thus limiting their future choices is serious business — a fact I encourage my colleagues pursuing learning analytics projects to remember — but the classification systems we use to sort and build library collections are not without consequences. Library of Congress Classification may be a particularly flawed example, but no single facet can capture objectively the content and potential uses of a book. No matter which categories we deploy, the metadata will encode what Bowker and Star refer to as “moral and aesthetic choices” (2000, 4). In the case of collection development, these choices have the effect of deciding which books are relevant and appropriate for our communities, and which are not. Simply increasing the number and types of categories we use to inform our acquisitions, as in the “enhanced metadata” provided by GOBI profilers, cannot neutralize these moral acts of classification. Multiplying the classifications increases the complexity of the profiling process and may make the consequences more difficult to trace. However, so long as we recognize the potential impact this network of categories and classifications has on our collection development, it also gives us greater flexibility in our decision making. With regular monitoring and strategic, critical interventions by subject librarians, a sophisticated approval plan remains an important solution to managing the complexities of monograph acquisitions.
Bowker, Geoffrey C., and Susan Leigh Star. 2000. Sorting Things Out: Classification and Its Consequences. Inside Technology. Cambridge, Mass.: MIT Press.
Bronicki, Jackie, Irene Ke, Cherie Turner, and Shawn Vaillancourt. 2015. “Gap Analysis by Subject Area of the University of Houston Main Campus Library Collection.” The Serials Librarian 68 (1-4): 230-42. https://doi.org/10.1080/0361526X.2015.1017717.
Drabinski, Emily. 2019. “What Is Critical about Critical Librarianship?” Art Libraries Journal 44 (2): 49-57. https://doi.org/10.1017/alj.2019.3.
Elwell, Jon T. 2019. “Library Analytics: Shaping the Future — Call It What You Want: When Developing Your Book Collection ‘Your Outcomes Are Only as Good as The Data You Feed It.’” Against the Grain 31 (2): 58-59.
Knievel, Jennifer E., Heather Wicht, and Lynn Silipigni Connaway. 2006. “Use of Circulation Statistics and Interlibrary Loan Data in Collection Management.” College & Research Libraries 67 (1): 35-49.
Mortimore, Jeffrey M. 2006. “Access-Informed Collection Development and the Academic Library.” Collection Management 30 (3): 21-37. https://doi.org/10.1300/J105v30n03_03.
Ochola, John. 2002. “Use of Circulation Statistics and Interlibrary Loan Data in Collection Management.” Collection Management 27 (1): 1-13. https://doi.org/10.1300/J105v27n01_01.
Olson, Hope A. 2001. “The Power to Name: Representation in Library Catalogs.” Signs 26 (3): 639-68.
Roncevic, Mirela. 2017. “The Approval Plan: A Sorting Hat That Discovers the Right Books for the Right Libraries.” No Shelf Required (blog). April 12, 2017. https://www.noshelfrequired.com/the-approval-plan-a-sorting-hat-that-discovers-the-right-books-for-the-right-libraries/.