Fiesole Collection Development Retreat, “Information Discovery: Examining Enabling Technologies and New Tools,” a Pre-Conference Presentation, April 12, 2012, Fiesole, Italy
by Mary M. Somerville, MLS, MA, PhD (University Librarian and Professor, University of Colorado Denver, and Director, Auraria Library, Denver, CO, USA)
Discoverability enables scholars to locate the content needed to advance their research and other creative activities. Improved discovery experiences require heightened collaboration among (1) scholarly publishers and their published authors; (2) search engine developers, database providers, abstracting and indexing services, and academic publishers; (3) electronic resource management and integrated library system vendors; and (4) librarians who advance institutional discoverability. To further exploratory cross-sector conversations, SAGE commissioned a White Paper, “Improving the Discoverability of Scholarly Content in the Twenty-First Century: Collaboration Opportunities for Librarians, Publishers, and Vendors,” released in January 2012. The research report presents highlights of interviews conducted from July to October 2011 with fourteen value chain experts. The White Paper also summarizes results of peer-reviewed publications and proprietary research studies to further characterize the currently fragmented discovery environment. In conclusion, the authors propose cross-sector conversations among publishers, vendors, and librarians to further visibility and, ultimately, usage of the scholarly corpus on the open Web and within library services.
In May 2011, SAGE commissioned a four-month research study which culminated in a White Paper released in January 2012 at the American Library Association Midwinter Meeting. The study was intended by SAGE to benefit “the community” of publishers, vendors, and libraries. Research project outcomes included: 1) discussion of best practices emerging in discovery and access of content in libraries; 2) identification of problems that publishers, librarians, and vendors need to resolve; 3) suggestions for some real solutions that can be implemented by librarians and publishers; and 4) further observations for improving discoverability and visibility of scholarly content in the 21st century.
In addition to a review of published literature and commissioned studies, co-authors interviewed fourteen library, publisher, and vendor industry experts. This value chain ‘convenience sample’ was generated through a concluding interview question: Who else should we talk to, and what else should we read? So the list of authoritative interviewees and information sources “snowballed” organically.
As reported in the SAGE White Paper acknowledgements, experts contributed insights from 1) scholarly publishers and their published authors and journal editors; 2) search engine developers, journal database aggregators, and abstracting and indexing (A&I) services; 3) electronic resource management (ERM) and integrated library system (ILS) vendors; and 4) academic librarians and library consortium leaders who advance institutional and multi-institutional discoverability.
During the course of the study, “discoverability” was defined as “scholars’ capacity to locate relevant content in the scholarly corpus as needed to advance their research and other creative activity.”1 Therefore, structured interview questions explored how publishers, libraries, and vendors could collaboratively advance improved discovery of the peer-reviewed/quality-vetted content that academic publishers produce, libraries invest in, and scholars require at appropriate points in their research workflow.
Analysis of detailed interview notes revealed that, in experts’ opinions, improved discoverability depends on a variety of cross-sector strategies:
- placing discovery acceleration tools in familiar Web environments,
- detailed indexing for highly relevant and precise search results, and
- seamless identification and fulfillment user experiences.
Accomplishing these means of improving user discovery results and experiences requires heightened cross-sector collaboration. In other words, discoverability and, relatedly, visibility require a holistic “ecosystem” approach among value chain contributors, because each sector is part of a dynamic, “whole,” interconnected system of information exchange and knowledge creation.
In this symbiotic ecosystem,
- Librarians manage systems for institutional collection, dissemination, and retrieval of the scholarly corpus;
- Publishers produce and promote authors’ work through formats findable on the open Web and in library catalogs;
- Publishers’ technology vendors supply e-publication platforms and strategic discoverability solutions; and
- Libraries’ technology vendors connect publishers’ digital content to online public access catalogs (OPACs) through electronic resource management (ERM) systems and Web-scale discovery services.
Therefore, Web-scale discovery and visibility tools depend on value-added, largely invisible contributions of authors, publishers, libraries, and vendors who compose the scholarly value chain. Traditionally, these content and service providers satisfied complementary roles. Publishers provided gatekeeper services, ensuring peer-reviewed content adjudicated by journal editorial boards. In turn, librarians served as access gatekeepers for the authoritative published resources. However, the Internet has disturbed those comfortable and conventional relationships, thereby necessitating reinvention — and recommitment — of centuries-old partnerships among publishers, scholars, and libraries.
Peer-reviewed journal literature is a primary source of insight, evidence, authority, and attribution in scholarly communication. Traditionally, libraries ensured discoverability and access through a combination of effective cataloging and classification, open and browsable stacks, A&I tools, reference assistance, research consultation, research education, and other services and programs that improved awareness and usage of authoritative information available in and through libraries. Now libraries must re-discover their role(s) amidst increasingly complex workflows, licensure restrictions, statistics analysis, and return-on-investment (ROI) expectations.
Amidst this considerable uncertainty, companies like OCLC, Serials Solutions, ExLibris, and EBSCO are partnering with growing numbers of publishers of primary and secondary content (scholarly corpus and A&I services) to produce simplified, centrally-indexed content. In turn, libraries are increasingly adopting these Web-scale discovery platforms as the preferred interface for library OPACs because they can facilitate local access through a single index that provides relevancy ranking and other facets for achieving precise search results querying content in all formats, whether licensed, owned, or free. SAGE White Paper interviewees recognized that furthering discovery, navigation, and fulfillment experiences requires purposeful conversations and heightened collaborations among all these value chain contributors.
Heightened collaboration among librarians, publishers, and vendors is critically important because, despite a disruptive (and disrupted) information landscape, we share a common purpose: to improve discoverability and visibility, access and delivery, and usage and creation of the scholarly corpus. Analysis of cross-sector expert interview data revealed some initial “conversation starters.” To begin, agreement is needed on common standards for metadata, information organization, and resource presentation. Therefore, an especially fruitful conversation would initiate cross-platform and cross-publisher investigations to identify best industry practices, further shared standards, and apply researcher behavior findings. In response, online product interfaces and publisher Website designs would conform to (yet-to-be-determined) standards and functionalities.
In addition, enhanced community collaboration would better ensure researcher navigation to the “best” version of scholarly content for which they have “rights” through academic affiliation validated by institutional authentication. This collaborative outcome would build on the OpenURL (link resolver) navigation technology that shows users their options for obtaining target content, whether from the primary publisher’s Website, an aggregated collection of content or other options (such as print holdings), interlibrary loan, or document delivery. This functionality is enabled through a combination of technologies, standards, and practices, including National Information Standards Organization (NISO) standards and the Knowledge Bases and Related Tools (KBART) recommended practice.
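To make the link resolver idea concrete, the following is a minimal sketch of how a citation might be packaged as an OpenURL 1.0 (ANSI/NISO Z39.88-2004) link in key/encoded-value form. The resolver address, the function name `build_openurl`, and the citation values are illustrative assumptions, not details taken from the White Paper; each library configures its own resolver, which then decides which delivery options to show.

```python
from urllib.parse import urlencode

# Hypothetical resolver address (assumption): every institution runs its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation: dict, base: str = RESOLVER_BASE) -> str:
    """Return an OpenURL 1.0 key/encoded-value link for a journal article."""
    # Fixed keys that identify the OpenURL version and the journal metadata format.
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    # Copy only the citation fields that were actually supplied.
    for key in ("atitle", "jtitle", "issn", "volume", "issue", "spage", "date"):
        if citation.get(key):
            params[f"rft.{key}"] = citation[key]
    # A DOI, when known, travels as a referent identifier.
    if citation.get("doi"):
        params["rft_id"] = f"info:doi/{citation['doi']}"
    return f"{base}?{urlencode(params)}"


# Made-up citation used purely for illustration.
example = {
    "atitle": "An Example Article",
    "jtitle": "Journal of Examples",
    "issn": "1234-5678",
    "volume": "12",
    "spage": "34",
    "date": "2012",
    "doi": "10.1234/example.5678",
}
link = build_openurl(example)
print(link)
```

The same citation metadata can be sent to any institution's resolver simply by changing the base address, which is what lets one link lead different users to the copy their own library has licensed.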
Other promising initiatives, such as Open Researcher and Contributor ID (ORCID), aim to provide researchers and other entities with unique identifiers to associate with their research outputs. Version of record is also being addressed to ensure that researchers can “see”/recognize the various incarnations of a journal article through its life cycle of publication and can locate the authoritative and most recent version of a given work. The National Information Standards Organization (NISO) has also recommended standard version terms, and CrossRef has released a new feature for version validation, CrossMark.
Meanwhile, Webmasters are increasingly adopting shared schemas, such as those of schema.org, to mark up HTML Web pages in ways recognized by major search engines, such as Google and Bing, to improve search engine optimization (SEO). When these search providers directly access databases structured by standardized schema, they can improve discovery of relevant Web pages. Building upon this capability, within the scholarship realm, ScholarlyArticle offers a structured data schema to enable improved discovery of appropriate content through consideration of a variety of unique properties, including publisher, editor, reviewer, genre, reviews, ratings, institution, location, creation date, and modification date, as well as author, title, and source, all of which are value-added signifiers of provenance and authority.
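As an illustration of such markup, the sketch below describes the SAGE White Paper itself as a schema.org ScholarlyArticle record and wraps it in a script element that a publisher page could embed. The JSON-LD serialization, the helper name `scholarly_article_jsonld`, and the choice of properties are assumptions made for this example; the bibliographic values come from reference 1.

```python
import json


def scholarly_article_jsonld(title, authors, publisher, date_published):
    """Build a schema.org ScholarlyArticle description as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        # Each author and the publisher are typed entities, not bare strings.
        "author": [{"@type": "Person", "name": name} for name in authors],
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": date_published,
    }


record = scholarly_article_jsonld(
    title=(
        "Improving the Discoverability of Scholarly Content in the "
        "Twenty-First Century: Collaboration Opportunities for "
        "Librarians, Publishers, and Vendors"
    ),
    authors=["M. M. Somerville", "B. J. Schader", "J. R. Sack"],
    publisher="SAGE",
    date_published="2012",
)

# Wrap the record in the script element that carries JSON-LD inside an HTML page.
markup = (
    '<script type="application/ld+json">\n'
    + json.dumps(record, indent=2)
    + "\n</script>"
)
print(markup)
```

Because the author, publisher, and date are labeled explicitly, a search engine does not have to infer them from page layout, which is the provenance benefit the paragraph above describes.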
Despite progress to date, further cross-industry standards are needed for content file formats, structured metadata quality, and online usage statistics to ensure interoperability among search engines, publisher platforms, and integrated library systems, especially as new models for scholarly communication emerge. With the aim of furthering exchange and creation of scholarship in the digital age, long-time value chain contributors have highly complementary roles to perform, which will surely extend and inevitably re-invent their traditional roles:
- Librarians understand the research and discovery needs of novice and expert researchers,
- Publishers and editors understand the curation, production, and dissemination of scholarly content, and
- Authors and other scholars understand the disciplinary knowledge aspirations and discourse practices of their fields of study.
In addition, newer value chain contributors — libraries’ vendors and publishers’ vendors — will certainly seek and find new applications for their expertise and products:
- Libraries’ vendors and publishers’ vendors understand technological infrastructure of platform, discovery, and organizational tools.
Currently, each of these value chain participants contributes significantly to the cycle of creation, discovery, access, and re-creation that catalyzes and informs production of the scholarly corpus which fuels research and learning. However, none yet sufficiently understands the perspectives, and potential value propositions, of the others.
After the SAGE White Paper went to press in October 2011, two complementary initiatives were launched, and one discussion paper was released, which promise to further real-world collaborations recommended in the SAGE White Paper. In late October, NISO announced a new Open Discovery Initiative which aims to develop formal standards and recommended best practices for “next generation” library discovery services using an aggregated index search of a wide range of resources, licensed and free, from multiple providers. Toward this end, a new NISO committee will convene open discovery libraries, information content providers, and discovery service providers to advance creation of consistent vocabulary and business practices. One anticipated outcome is clarification of exactly which resources are available in uniquely licensed and purchased electronic content and which are indexed in full text or by citations only, or both, and whether the metadata derives from aggregated databases or directly through the full text.2
In a highly complementary action in February 2012, the National Federation of Advanced Information Services (NFAIS) announced a draft Discovery Service Code of Practice for review and comment, in the belief that:
“discovery services have the potential to provide ease of information discovery, access, and use, benefitting not only its member organizations, but also the global community of information seekers. However, the relative newness of these services has generated questions and concerns among information providers and librarians as to how these services meet expectations with regard to issues related to traditional search and retrieval services; e.g., usage reports, ranking algorithms, content coverage, updates, product identification, etc. Accordingly, this document has been developed to assist those who choose to use this new distribution channel through the provision of guidelines that will help avoid the disruption of the delicate balance of interests involved.”3
Also to further the conversation, in December 2011, OCLC released a discussion document, Libraries at Webscale, which presented views of leading thinkers and writers in the fields of information, education, marketing, and technology, who responded to the question: “what next?” They concluded that:
“big collaboration in the information ecosystem will come not only from broader collaboration across libraries, library groups, consortia, and cooperatives, but increasingly through new, innovative alliances and partnerships across the broader knowledge community — across researchers, publishers, commercial vendors, and Web-scale providers such as Google, Amazon, and Facebook.”4
Noting that an ecosystem thrives through complex relationships and interactions among its members, the document offers several possibilities for building relationships and interactions within a Web-scale information ecosystem:
- Connect users with content regardless of format or where it is stored by creating new models of partnership with all types of content providers,
- Develop new forms of knowledge through dialogue and discourse that are easily distributed, reviewed, and added to the collective collection,
- Build creative “spaces” that encourage collaborations of pure exploration and invention among any ecosystem members, organizations, or groups, and
- Build bridges, links, and tunnels to wells of information that make it easy to find, connect, compare, mix, or mash up all content into any format.5
In concluding, the OCLC report notes that the network of organisms within an ecosystem contributes to its growth and expansion by facilitating adaption, change, and contribution. A critical balance between cooperation and competition generates energy and motivates the evolution of the ecosystem toward higher function, nourishing the entire community. In a Web-scale world, collaborations must both promote sharing and drive innovation.6 As demonstrated in the NISO and NFAIS instance, this will require establishment of shared values and principles that can support cooperation and commerce through partnerships that co-create a vision of the future with content publishers and their platform providers, libraries and their service providers, library consortia, and national and international standards initiatives. “A Web-scale world makes this conversation urgent — and exciting.”7
1. Somerville, M. M., Schader, B. J., & Sack, J. R. Improving the Discoverability of Scholarly Content in the Twenty-First Century: Collaboration Opportunities for Librarians, Publishers, and Vendors. A White Paper commissioned by SAGE. Thousand Oaks, CA: SAGE, 2012. http://surveys.sagepublications.com/Survey.aspx?surveyid=3431
2. Walker, Jenny. NISO launches new Open Discovery Initiative to develop standards and recommended practices for library discovery services based on indexed search. NISO Press Release, October 25, 2011. http://www.niso.org/news/pr/view?item_key=21d5364c586575fd5d4dd408f17c5dc062b1ef5f
3. Lawlor, B. email to list.niso.org, February 1, 2012. For full text, see: http://info.nfais.org/info/codedraftintroduction.pdf.
4. Libraries at Webscale: A Discussion Document. Dublin, Ohio: OCLC, 2011, p. 31. http://www.oclc.org/ca/en/reports/webscale/default.htm
5. Ibid., pp. 32-33.
6. Ibid., p. 33.