By Donald T. Hawkins (Freelance Conference Blogger and Editor)
The Society for Scholarly Publishing’s (SSP’s) 45th Annual Meeting in Portland, OR drew 712 attendees, 611 in person and 101 online (I attended online). According to a recent report in The Scholarly Kitchen, several of the “chefs” were in substantial agreement that the main concern in the industry today centers on research integrity and the issues surrounding it, such as:
- Individuals’ responsibility to improve integrity at the community level,
- The role that artificial intelligence (AI) can play,
- Insufficient responses to threats to trust in science, and
- The need to develop new business models to ensure equity.
The SSP keynote speaker, Dr. Elisabeth Bik, is an award-winning microbiologist who searches the literature for improperly duplicated images and plagiarized text. She has reported over 6,000 papers; her work has resulted in over 900 retractions and another 900 corrections. She received the 2021 John Maddox Prize, which is given to individuals who have shown courage and integrity in standing up for sound science. Her keynote address was entitled “Double Trouble: Inappropriate Image Duplications in Biomedical Publications”.
Bik defined the three elements of the conference theme:
- Transformation: how scientific publishing has changed recently.
- Trust: Can we still trust every paper?
- Transparency: How transparent are authors’ data?
Because of increasing pressure to publish, there are now 300,000 papers on COVID, and PubMed indexes about 2 million papers per year. Do we need all of them? Publications are the foundation of science because researchers build on each other’s work and report what they find. Science and its publications can be likened to bricks on which other papers rest; if a paper contains fraud, part of the wall will come tumbling down. Fraud has been increasing, and science is not immune. Cheating is not science, because science is about finding the truth.
Scientific misconduct takes three forms:
- Plagiarism: copying without giving credit,
- Falsification: changing measurements to fit a hypothesis or omitting outliers, and
- Fabrication: making up results without any measurements.
Why do scientists commit fraud? Papers usually have multiple authors, so it can be difficult to determine who is responsible for misconduct. Fraud is difficult to detect in text, and we tend to believe photos because they cannot be changed as easily as graphs can. The three types of image duplication are simple duplication (which can be an honest mistake), duplication with repositioning, and duplication with deliberate alteration.
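Simple duplication can, in principle, be caught automatically. A minimal sketch of one common family of techniques (not Bik’s manual method, and not any specific vendor’s tool) is a difference hash: compare each pixel of a tiny grayscale thumbnail to its neighbor and record the bits, so near-identical images yield near-identical hashes. Here the “thumbnails” are plain lists of lists so no imaging library is needed; real images would be downscaled first.

```python
def dhash(pixels):
    """Difference hash: for each row, record whether each pixel is
    brighter than its right-hand neighbor. Identical or near-identical
    images yield identical or near-identical bit strings."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; a small distance suggests duplication."""
    return sum(x != y for x, y in zip(a, b))

# Two 4x5 grayscale "thumbnails"; the second is a uniformly brightened
# copy of the first (as after a simple contrast tweak).
img1 = [[10, 20, 30, 25, 15],
        [40, 35, 30, 20, 10],
        [ 5, 15, 25, 35, 45],
        [50, 40, 30, 20, 10]]
img2 = [[p + 5 for p in row] for row in img1]

print(hamming(dhash(img1), dhash(img2)))  # 0: brightening preserves the hash
```

Because the hash depends only on relative brightness, trivial manipulations like brightening do not hide the duplication; repositioning and deliberate alteration, the harder cases Bik described, defeat this naive approach.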
Journals are slow to respond to reports of fraud: after 5 years, no action had been taken on 65% of the papers that Bik had reported to editors. She recommended using PubPeer for submitting reports on articles. Many journals do not publish email addresses for their editorial board members, which makes it hard to send reports to them. Some journals are starting to use AI image-detection systems, but AI can also create fake papers and fake images.
For a person whose first language is not English, AI can be a good tool for improving the grammar, but it cannot be used as an author!
Paper mills, another form of scientific misconduct, sell authorships or fake papers. They work with students who must write and publish articles to advance their careers. Many articles produced by paper mills have multiple authors and look very artificial. Bik called one operation the “Tadpole Paper Mill” because blots in some of its images resembled tadpoles. (Her article contains a wealth of detailed information on paper mills.) Articles are targeted to specific journals, especially once an editor has accepted some of them.
Bik said that with many very complex articles, science is getting out of hand, and many people have lost their faith in it. Requiring transparency is one way to address this difficulty; publishing open science articles and requiring authors to show their raw data and its source is another good step. She concluded her address with these thoughts and a wish list:
- Science is about discovering the truth.
- It takes a village: reviewers, journals, institutions, and funders.
- How can we distinguish fake from real? AI helps.
- The cost of scientific misconduct is tremendous.
- Use expressions of concern much more frequently than they are being used now.
- Publish contact information for editors.
- Have legal access to journals.
- As a society, we need to determine how we handle misconduct.
- We need more journals devoted to reproducibility, and reproduction studies should be allowed on CVs.
Charleston Trendspotting: Forecasting the Future of Trust and Transparency
Lisa Hinchliffe, Professor, University of Illinois Urbana-Champaign, and Leah Hinds, Executive Director, Charleston Hub and Charleston Library Conference, collaborated to present the sixth iteration of the Charleston Trendspotting Initiative, which is designed to offer attendees a chance to proactively examine trends and issues facing the library and scholarly communications world.
Leah opened the session with some background, context, and the session agenda, and then Lisa defined “Futures Thinking”:
- Futures thinking offers ways of addressing, even helping to shape the future. It is not about gazing into a crystal ball.
- It illuminates the ways that policy, strategies, and actions can promote desirable futures and help prevent those we consider undesirable.
- It stimulates strategic dialog, widens our understanding of the possible, strengthens leadership, and informs decision-making.
A “Futures Wheel” small group activity was conducted with attendees seated at 8 round tables with large sheets of paper, colorful Post-It notes, and colored markers. After Lisa explained the activity, the following trends, topics, or events were chosen for discussion by the participants of the groups:
- Open is the “Destination”,
- The increasingly challenging peer review process,
- An increasing amount of content submitted to journals,
- Degrading trust in research integrity,
- AI (2 tables),
- LLMs enabling AI authorship, and
- Increasing co-authorship over time.
The groups each selected one person to report back to the room. Attendees were favorably impressed with the session and were invited to attend the next Trendspotting session at the 2023 Charleston Conference.
Ed. note: I thank Leah Hinds for providing this summary of the Trendspotting event at the SSP conference.
The Evolving Knowledge Ecosystem
This panel discussion was moderated by Roger Schonfeld from ITHAKA S+R; the panelists were executives from MIT Press, Elsevier, Frontiers, and Clarivate. Roger began by noting that we are living through a second digital transformation as digital processes take hold across our organizations, and then he guided the discussion by asking questions of the panelists. Although it is easy to get caught up in day-to-day processes, we need to step back and examine general publishing issues.
What is the purpose of scholarly publishing?
It is core to the academic system and provides services to the academic community, thus providing a broad contribution to knowledge in general. A core feature is validating and disseminating knowledge. We need to make all science open. Disseminating research widely will help us deal with the crises facing us, and our success will depend on the widespread sharing of knowledge, which is a role for publishers because publishers are custodians of the scholarly record.
We have seen scholarly publishing emerge as a vehicle for academic fraud and misconduct. Research publishing has become a societal vector for misinformation. Many publishers are investing more in resolving these issues.
Beyond this, what more should our sector be doing?
Misinformation is a multi-stakeholder problem. We need better measures to identify and block it and must spend more time evaluating the information we publish. AI tools help by making us more productive. We also need to become more transparent; for example, when an article is retracted, it is not removed from the Web of Science database but is marked as retracted. The Journal Impact Factor (JIF) is widely misused.
We can build structure into our metadata, and build accountability and trust, by requiring authors to identify exactly what they contributed to an article. We are all our own editors, so spend time learning what you are reading and whether you can trust it.
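The idea of recording exactly what each author contributed is what the CRediT (Contributor Roles Taxonomy) role names standardize. A minimal sketch of what such structured contributor metadata might look like; the record, the DOI, the ORCID iDs, and the field names here are all illustrative placeholders, not any publisher’s actual schema:

```python
# Hypothetical article record using CRediT role names; all identifiers
# below are made-up placeholders for illustration only.
article = {
    "doi": "10.1234/example.5678",
    "contributors": [
        {"name": "A. Researcher",
         "orcid": "0000-0000-0000-0001",
         "roles": ["Conceptualization", "Writing - original draft"]},
        {"name": "B. Analyst",
         "orcid": "0000-0000-0000-0002",
         "roles": ["Formal analysis", "Data curation"]},
    ],
}

def contributors_with_role(record, role):
    """Names of contributors who claimed a given CRediT role."""
    return [c["name"] for c in record["contributors"] if role in c["roles"]]

print(contributors_with_role(article, "Formal analysis"))  # ['B. Analyst']
```

With roles recorded as structured data rather than free-text acknowledgments, accountability questions ("who did the analysis in this paper?") become machine-answerable.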
Although unethical behavior will spoil the good will that we have, current measures are a step in the right direction. Publishers must set and uphold the highest standards of quality in their publications. It must be backed up by the latest open science. Sharing ideas will create connections. Why Trust Science (Princeton University Press, 2019) is a useful reference.
We all should be critical thinkers in the world around us. Where does our responsibility fall?
Human beings are often gullible; we need external mechanisms for understanding problems with information. For example, traffic lights help pedestrians safely cross the street, but they are still responsible for checking that no cars are coming.
Should publishers be thinking at a network level? What are some of the consequences of consolidation?
Any organization with less than $50 to $100 million in funding will have trouble surviving. A big concern is that the current OA model is enriching large companies at the expense of quality. We must be careful about what is happening as we make as much content open as possible. The magnitude of some of the problems is so great that no single company can cope with them completely. All publishers must provide high quality and value to their readers. Competition will prevent organizations from becoming complacent. Value determination has shifted from the journal level to the article level.
We are living through a mania with AI. What is your organization’s strategy for enabling machine-to-machine communication?
We still have no idea how AI will affect our jobs in the future, but it will make detection of unethical behavior even harder than it is now. We are hardly ready for this yet and are still looking at many of the related issues. AI has the potential to be transformative; it is not just glorified search.
Metadata the Musical! The Tale of the Ant and the Grasshopper
This was one of the highlights of the conference. It featured music and lyrics based on Aesop’s fable of the ant and the grasshopper to illustrate the significance of metadata, persistent identifiers (PIDs), and standards, and to raise awareness of key issues that metadata can or should solve, such as author name disambiguation and consistent taxonomies for language. In the story, a determined ant researcher restores order to the chaos of research output through her re-discovery of ancient “hoo-man” information science.
One problem is the lack of metadata about human research activity (the “file drawer” problem, in which scientists publish data that conveniently supports a theory and file or discard the rest, or simply do not publish negative results). Things humans should have done about their metadata include:
- Use PIDs for each entity as soon as possible in the workflow,
- Track and record connections between PIDs, and
- Make connections openly available.
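The three recommendations above amount to building a small PID graph: every entity gets an identifier early, the typed connections between identifiers are recorded, and the graph is exposed openly. A toy sketch, with all identifiers made-up placeholders (a real graph would use registered DOIs, ORCID iDs, and ROR IDs):

```python
# Minimal PID graph: nodes are persistent identifiers, edges record
# typed connections between them. All PIDs below are placeholders.
edges = [
    ("doi:10.1234/dataset.1", "is-output-of", "orcid:0000-0000-0000-0001"),
    ("doi:10.1234/article.2", "cites",        "doi:10.1234/dataset.1"),
    ("orcid:0000-0000-0000-0001", "affiliated-with", "ror:012abc345"),
]

def connections(pid):
    """All recorded connections touching a given PID -- openly queryable."""
    return [(s, rel, o) for s, rel, o in edges if pid in (s, o)]

for triple in connections("doi:10.1234/dataset.1"):
    print(triple)
```

Once connections are recorded this way, questions like “which datasets underlie this article?” or “which outputs belong to this researcher?” become simple graph lookups instead of archaeology.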
What ants could teach humans about the importance of metadata:
- Collective memory,
- Chemical signatures,
- Environmental context,
- Social network analysis,
- Foraging efficiency,
- Genetic and chemical metadata, and
- Swarm intelligence.
This study highlights how different species and societies perceive and use metadata. The focus and value of metadata reflect the needs and objectives of our civilization, support our collective knowledge, and ensure the continuity of our research endeavors. Is metadata working for you? Put it to work. Publishers need to treat researchers as key stakeholders in their metadata; poor metadata can leave a publisher unable to manage its ethics obligations.
Issues to consider:
Data is a new currency, and bad data in means bad data out. As AI tools take on a bigger role in the data-processing world, processing the data becomes more difficult. Systems use data and pass it through their APIs. Are you checking your data regularly? Do you know where it is coming from?
Unpacking OA Usage Reporting: What Do Stakeholders Want?
Tim Lloyd, Founder and CEO of LibLynx, described how OA usage reporting differs from traditional reporting. OA content receives 7.5 times more usage than paywalled content. Usage comes from 189 countries; the US and China account for over 40% of it, while 172 countries account for less than 2%, the top 5 of those being Israel, Turkey, Malaysia, Belgium, and Thailand. The impacts for researchers include increasing complexity and a greater choice of where to publish. The components of usage reporting are:
- Data capture: distributed usage, diverse formats and metadata cause complexity. There is no coalescing around standards.
- Processing: new metadata and logic. In an OA world, we usually do not know anything about the user.
- Delivery: new reporting formats, increasing frequency. Usage data in real time enables frequent reports.
The future of OA usage reporting includes scalable, granular, real-time reporting; flexibility of inputs, processing, and outputs; and standards. Reliable and auditable systems and workflows are needed, and we should prepare for this future.
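The processing step above can be pictured as a simple roll-up: anonymous events (in an OA world we typically know the item and a coarse location, but nothing about the user) aggregated into a granular report. A minimal sketch; the DOIs and field names are illustrative and this is not a COUNTER schema:

```python
from collections import Counter

# Anonymous usage events: item plus coarse location, no user identity.
# All DOIs are made-up placeholders.
events = [
    {"doi": "10.1234/a", "country": "US"},
    {"doi": "10.1234/a", "country": "CN"},
    {"doi": "10.1234/b", "country": "US"},
    {"doi": "10.1234/a", "country": "US"},
]

def usage_by_country(evts):
    """Granular roll-up: (doi, country) -> event count."""
    return Counter((e["doi"], e["country"]) for e in evts)

report = usage_by_country(events)
print(report[("10.1234/a", "US")])  # 2
```

Because only aggregates leave the pipeline, a report like this can be shared with stakeholders without revealing anything about individual users, which matters for the library privacy concerns raised later in the conference.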
Tricia Miller, Marketing Manager at Annual Reviews, discussed OA from the publisher perspective. Data has changed and will continue to change. Audiences, contexts, and purposes are growing, so OA usage data must be examined, yet publishers are finding it difficult to report OA usage. The publisher considerations are audience, a collaborative framework, and trust. Usage now comes from beyond academic institutions, and non-OA models leave out data for some non-academic institutions. Users’ interests span a strikingly wide range of subjects.
Global usage has expanded and now comes from 187 countries, so we must understand the needs of a global audience. Stakeholders now want a mission-driven framework with trust in publishers, transparency, privacy, and relationships between stakeholders.
Christina Drummond, Data Trust Program Officer, OA eBook Usage Data Trust, described an NSF-funded workshop on usage and impact reporting, which included how to extend the FAIR (findable, accessible, interoperable, reusable) and CARE (collective benefit, authority to control, responsibility, ethics) principles that are necessary for ensuring trust. Trust value propositions are grounded in economies of scale. An Industry Data Space (IDS) framework, a networked approach to multi-party data exchange via data governance, will enhance the OA book usage data trust, which has been developing governance mechanisms for the last few years.
Other concerns raised in an audience poll included misleading others, leakage of sensitive data to competitors, the authenticity of data, and vendors’ protection of usage privacy.
Elliott Hibber from the Boston College Library noted that external and internal audiences want to know how often their work was downloaded, which may not be easy to answer because the data lives in different places. Download counts are not the best way to measure impact.
The library maintains an OA fund to provide APC funding to authors who lack it. The fund is also used internally for licensing resources, especially hybrid journals. Libraries value privacy, which means that usage reports cannot reveal much information about users (i.e., people).
STM Goes to Washington: Scholarly Publishers Can (or Can’t) Influence Policy
In the past three years, STM publishing and government decision-making have significantly overlapped, with issues such as COVID-19, the OSTP Nelson Memo, the Cancer Moonshot, and more. How decisions are reached, and how scholarly publishers try to influence them before and after implementation, is an opaque process.
This session featured representatives from scholarly publishing’s government relations and public affairs teams, who reviewed the possibilities, realities, and limits of advocacy work; the key individuals, agencies, and committees of the US federal government that oversee the sciences, arts, and humanities; and the policy positions and legislation affecting publishers, authors, and researchers. Our community’s voice is valuable, and the hottest issue today is the OSTP guidance.
The discussion was moderated by Tom Ciavarella, Head of Public Affairs and Advocacy for North America, Frontiers, who guided the discussion by asking the panelists questions after they introduced their viewpoints.
Miriam Quintal, Managing Principal, Lewis-Burke Associates, LLC: We see an ecosystem that is challenged by policies of a divided government for the first time in a long time. There is a huge conversation about AI and the regulatory environment.
Laura Patton, Head, Government Affairs, US, Springer Nature: We must talk to the policymakers that will make decisions affecting us. Think about who has oversight. You are most effective if you are talking to supporters on both sides of an issue.
Alison Denby, Vice President, Journals, Oxford University Press: There is not one policy implementation that works for all. Think about our objective of maintaining high quality in everything we do. How can we demonstrate the amazing work that we do? We are superb at coming up with standards and practices. People will pay for what works and what is valuable to them, so we need to show why what we do matters.
David Weindreich, Director, Policy and Government Affairs, Americas, International Association of STM Publishers: To influence policy and government discussions, you need three things: a policy you are interested in, politics on your side, and procedure. You must know what you want: to maintain a quality system of scholarly publishing that drives researchers to discovery. Politics is individual and interpersonal; we must understand our own values, what the people we meet with value, and what they care about in order to get them on our side. Procedure matters because you must know what is possible in order to influence it and reach your goals.
Are there any inherent advantages for smaller presses? How do you manage that?
You need to understand where you fit in the ecosystem. If you are a large business, people will be more receptive to what you say, and the scale is very different for different types of societies. The conflict between businesses and curators is complicated; we must be able to explain what we do in a slow but effective manner so that it is not a shock to our journals. University presses have tremendous influence even though they tend to be just one part of a large organization. Societies have national reach and public value in a specific area; they must think about how to leverage their grassroots advocacy and must talk about the value of the society to early-career researchers. Many times, staffers are young and do not have much experience with our issues, so it is important to approach them with the attitude of educating them with a short, sharp message. Think about how you can help the person you are meeting with and what you have to offer them.
Will there be hearings on the OSTP guidance?
If you see that the Nelson Memo will affect you, go to Congress and speak about it; if they don’t hear from anyone, they will think it does not matter. There is a limit to what Congress can or will do at this point, but the Nelson Memo will not be reversed. What policies will help us in the future? You must use all your resources to affect policy implementations and revisions. Requests for comments are good opportunities to raise concerns and make suggestions and recommendations. People in societies have relationships with government officials, and we should talk to them. Congress is unlikely to do anything major, but they are talking about holding hearings.
Do arts and humanities come up when you are talking with staffers?
The social sciences are at the forefront of policy making. For the first time, the National Endowment for the Humanities is making comments. Federal funding for the humanities is not nearly as comprehensive as it is for the physical sciences. Diversity, equity, and inclusion are very big subjects among government staffers.
Research Integrity #TRANSPARENCY: Learning from Overcoming Mass Retractions, Systematic Manipulation, and Research Misconduct
With research output rapidly growing and an increased pressure to publish more and do it faster, no publisher or society has been spared from research integrity and publishing ethics problems. While it is tempting to hide these challenges in the shadows, we propose an open discussion of what has gone wrong and what we did to overcome it to learn how we can do better going forward. By sharing our stories and collective experiences, we can better support industry-wide efforts with a broader community of practice across publishers, societies, authors and editors. There is power in sharing our experiences even if they are very different. Much learning comes from talking about what went wrong.
Yael Fitzpatrick, Editorial Ethics Manager, Proceedings of the National Academy of Sciences (PNAS), formerly worked at Science helping to detect image manipulations and now works on editorial ethics at PNAS. What a publisher can and cannot do is different from what it should and should not do, and the right thing to do is not always the easiest thing to do. The publisher may feel that an author’s institution should be informed in cases of manipulation. Reviewers are important and must be knowledgeable about ethical standards.
Luigi Longobardi, Director, Publishing Ethics & Conduct, IEEE, formerly worked at the American Institute of Physics (AIP) and the American Physical Society (APS). In a collaborative environment examining peer review and its failures, a shared background is an advantage. Volunteers can help with investigations, such as one into the “proceedings” of a conference that never happened. Sometimes submitted papers were out of the scope of the conference because the organizers did not know what was appropriate.
Michael Streeter, Director of Research Integrity and Publishing Ethics at Wiley, said that we must work with a variety of stakeholders and that transparency is at the heart of what we focus on; for example, Wiley retracted more than 500 papers last year because of paper mill activity. Announcing that volume of retractions shaped how similar investigations were conducted. Researchers lose because of the time these investigations take. The Committee on Publication Ethics (COPE) has issued guidelines on how to manage special issues, which has been helpful to publishers. Research integrity is critical to the value that publishers provide, and legal colleagues help maintain ethics by advising on the language of announcements, etc. Sharing information about reviewers is very important, and image integrity has become a massive problem over the years.
Previews: New and Noteworthy Products
The final day of the conference began with this popular session of “lightning round” 5-minute presentations by representatives of 13 companies describing their newest and most innovative products, platforms, or content. The audience then voted on the product likely to have the most positive impact on scholarly communications.
- Aries Systems: Transforming the article production process with a new, productive model. Users can interact directly with the XML. The LiXuid tool set can serve as an end-to-end production system, or users can choose only the components they need. Automatic XML text conversion and a content-editing interface are available.
- Cambridge University Press & Assessment: Traditional processes do not reflect the research lifecycle. Using Cambridge’s Research Directions process, each journal would tackle a fundamental question in the field, with researchers publishing incremental results and analyses that contribute to answering the basic question. They can then refer to the work of others and share ongoing work, thus increasing collaboration and knowledge sharing. Twelve titles produced this way will be available by 2025.
- Copyright Clearance Center (CCC): Interoperability depends on APIs that ensure data moves and is exposed at the right place in the life cycle; surfacing support only during the submission process is not good enough. OA decisions should be made at the time of manuscript submission, when the publisher and CCC can check for agreements between the publisher and the author’s institution. Publishers tend to wait for vendors to take advantage of APIs; it is better to think about where authors are publishing.
- Delta Think: A client wanted to determine the demographic makeup of its authors and contributors. Delta Think did the data analysis, helped form goals and policies for reporting and sharing the data, cleaned the data to remove duplicates, and performed the analysis. A quick automated solution has great value, providing the ability to visualize results, make comparisons, and report to the entire community. The full case study is available.
- Humanities Commons/MSU: A federated open commons infrastructure. Too much power concentrated in too few hands has created high barriers to participation, leading to a growing move toward decentralization. Humanities Commons is an academy-owned and -governed project designed to serve the needs of scholars, writers, researchers, and students as they engage in teaching and research projects and decide on customizations.
- Modio Information Group: A common problem is that researchers do not have enough time to read all the information relevant to them. Modio combines the substance of text with the convenience of audio, which can increase the value of the content. It is not a podcast; it is more like an audiobook. Text converted to audio is the fastest-growing segment of publishing. Students can read and narrate the content, which is optimized for multi-tasking consumption.
- Molecular Connections Pvt. Ltd. is a one-stop shop for relevant content. Content formats are increasing, and there is too much content available. OrthoSearch was developed as a unified portal for all orthopedic content in the UK. Updated daily, it contains about 400,000 records and includes a relevancy checker and interdisciplinary journals, all in a single place. Synonyms have been unified, it is FAIR compliant, and altmetric scores are included. Natural language search is also available. 29,000 users accessed the system in its first year.
- Morressier has built a 360-degree research integrity suite that can flag fraudulent behavior detected from patterns. The industry pressures: publish more OA, avoid retractions, and publish faster. The barriers: multiple vendors, varying types of invoices, etc., so detection is being done manually on legacy workflows, and scaling too much may lower quality. The solution was to create an API and an integrity dashboard.
- Virtusales: BiblioSuite was built to support publishers as they create content by sharing technology, supporting all formats, and managing content and contract data. It can be used by 2 or 2,000 users and makes publishers more productive, achieves efficiencies, develops cost estimates for acquisitions etc., and manages the entire process.
- Xpublisher is an intuitive online XML editor with a Word-like interface that publishes content automatically. Users can create custom schemas, produce a taxonomy, configure the SaaS, enforce best practices, and access content from anywhere for efficient collaboration.
Morressier’s integrity suite was voted by the audience to be the best product of those presented.
Working Together to Preserve the Integrity of the Scholarly Record in a Transparent and Trustworthy Way
With the proliferation of scholarly content, how do researchers know what to trust? There is no single source of truth, and many in the community are looking for support and answers. This session brought together different parts of the community to discuss how their organizations are working together to tackle this challenge, and asked the audience to highlight any remaining problems that we need to solve together.
Amanda Bartel, Head, Member Experience, Crossref, said that many organizations you might not think of as scholarly publishers use Crossref. Its Research Nexus vision captures relationships between players to preserve the integrity of the scholarly record. It takes a village of collaborators to make sure the data collected is appropriate; Crossref’s role is to enrich metadata to provide more and better trust signals and enable an inclusive scholarly record. Funders are key: they care about open and traceable works, and especially about retractions and corrections. Publishers can include data on the specific funders they work with, and Crossref members can help by starting to add context.
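The trust signals Crossref enriches, funder links and retraction or correction notices among them, are exposed through its public REST API (the `/works/{doi}` route). A sketch of pulling those signals out of a response; the response below is a simplified, made-up stand-in for the much richer real payload, and the DOIs are placeholders:

```python
# Simplified, hypothetical shape of a Crossref /works/{doi} response;
# the real JSON carries many more fields. All DOIs are placeholders.
sample = {
    "message": {
        "DOI": "10.1234/example",
        "funder": [{"name": "Example Funder", "award": ["ABC-123"]}],
        "update-to": [{"type": "retraction", "DOI": "10.1234/old"}],
    }
}

def trust_signals(msg):
    """Extract funder names and any retraction/correction update types."""
    funders = [f["name"] for f in msg.get("funder", [])]
    updates = [u["type"] for u in msg.get("update-to", [])]
    return funders, updates

print(trust_signals(sample["message"]))
# (['Example Funder'], ['retraction'])
```

Because these signals are open metadata rather than proprietary flags, any downstream tool, a discovery service, a reference manager, an integrity dashboard, can surface "this work was retracted" or "this work was funded by X" without a special agreement.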
Nandita Quaderi, Editor-in-Chief and VP, Editorial, Web of Science, Clarivate noted that when a measure becomes a target, it is no longer a good measure, and it can have unintended consequences. Misuse of bibliometrics in research assessment is driving fraudulent behavior; they should never be used as a replacement for peer review because research integrity will be compromised and the scholarly record will be polluted. The Web of Science has rigorous requirements for journals to enter its collection. Those selected are subject to periodic re-evaluation, and if they no longer meet quality criteria, they are delisted. Publishers are not penalized for being transparent or retracting articles because retractions are a healthy way to keep the scholarly record clean. As fraudulent behavior increases, aggregators must become more proactive to detect fraudulent journals. A new AI tool helps identify which journals should be evaluated.
Clarivate’s Master Journal List is freely available and searchable; it lists journals that have been delisted, accepted, changed their titles, etc. An upcoming change to the Journal Citation Reports will expand the Journal Impact Factor to all journals in the Web of Science Core Collection, including those in the arts and humanities. The Journal Citation Reports will then be an indicator of journals that can be trusted.
Hylke Koers, Chief Information Officer, STM Solutions, described the STM Integrity Hub, whose mission is to safeguard the integrity of science. Collaboration is key to research integrity and gives us the ability to look at problems holistically; about 75 people in 25 organizations are actively collaborating. The 3 pillars of the Hub are knowledge and intelligence, policies and frameworks, and enabling infrastructure. It is meant to be a tool for decision support: a recently launched paper mill detection tool will allow moving from detection to prevention, and the Hub provides alerts for editors to investigate further, but it does not make yes/no decisions.
Patrick Franzen, Director, Publications and Platform, SPIE, is an elected member of the COPE Council. The Committee on Publication Ethics (COPE) has 40 members globally and takes a broad view of the issues facing the industry. Its 3 core principles are support, leadership, and voice; it tries to be neutral and objective in providing impartial feedback to publishers. Recently, membership eligibility was expanded to universities that conduct and publish original research. COPE provides training courses on publication ethics, links research integrity and publication ethics, and identifies common ground. Together we have created a culture of publication integrity; how do we take these steps forward?
There is a villain in our village; who does the panel think it is?
There is no single villain, but the Journal Impact Factor might be one. Scholarly publication is a system of trust, and paper mills and the like will find ways to exploit that trust.
It is important to understand and differentiate all the layers of the publication process, including the misuse of bibliometric measures such as the Impact Factor and the h-index. People often misuse these measures unintentionally.
Our incentive systems are ripe for abuse. We are also remarkably silent: publishers do not necessarily talk about their problems but hope they will go away. We have allowed these problems to happen.
One of the problems is the stigma associated with retractions. Authors need to be willing to admit their mistakes and not be penalized for doing so.
There is tension between peer reviewed articles and preprints.
What do these issues do to the reputation of the scholarly community, and how do publishers address that perception while valuing the community’s reputation? We cannot hide problems; we must fix them by being transparent, owning them, and explaining how things work, for example, what a retraction is. Our audience has changed, but we continue to act as if all sources were equal.
Not every article is entirely correct; that is how science works. Our duty is to serve our communities and publish objective scientific research.
Licensing Privacy: What Librarians Want
Librarians are increasingly concerned about the ways in which users of library-licensed resources are being tracked by the third-party providers of those resources, and they question whether they can offer users any meaningful assurance of privacy. For librarians, this is deeply troubling because of the profession’s long-standing commitment to user privacy and confidentiality. The ALA Code of Ethics states, “We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted”. The Licensing Privacy Project, funded by the Mellon Foundation, has developed a set of resources to help librarians understand these issues, and it supports librarians and providers working together to better align platform and publisher practices with library values of privacy, confidentiality, and respect for user control over their own data.
Lisa Janicke Hinchliffe, Professor, University of Illinois at Urbana-Champaign, introduced this session: How can libraries and providers ensure the privacy they are looking for? How do we do the work we need to do and demonstrate value at the same time? Many activities have attempted to address this complex question. What future do we want? Are we satisfied with our direction of travel?
Cody Hanson, Director of Information Technology, University of Minnesota Libraries, said that when we told library users that their data was private, we were not telling the truth, and it can be dangerous to tell someone that something is private when it is not. Seminal documents include the ALA Library Bill of Rights and the ALA Code of Ethics. Free people read freely, which is why we are concerned about privacy; we want users to have the freedom to read and think freely. ALA always pairs privacy with confidentiality, but there is a difference between them: confidentiality is the library’s responsibility, and libraries must ensure that any third party maintains it. We need to reconcile the demands of the digital environment with the right to privacy, and user data must only be used to provide or improve services.
Library privacy guidelines also apply to vendors, who need data to provide their services. A licensing policy emerged from the National Forum on Web Privacy and Web Analytics. Privacy will not take precedence over access, and many services use personal rather than institutional accounts, so we need to be specific about what we want. Libraries must develop their own policies internally. Here are some common tactics they can use:
- Resist vendors’ non-disclosure agreements (NDAs) because libraries need to tell users what they are doing.
- Clarify what terms are being consented to.
- Assert who can consent to the terms.
- Clarify when and how user data can be shared.
- Require notification in the event of a data breach.
Assessing a vendor contract is a major undertaking that requires reviewing many documents.
Some emerging insights:
- Model language is useful, but it is not sufficient to ensure user privacy.
- Libraries must ensure that all employees have a basic understanding of authentication and authorization policies.
- Libraries should identify a Library Privacy Officer to provide leadership in protecting user privacy and information security.
- Libraries should develop campus partnerships and work collaboratively.
Athena Hoeppner, University of Central Florida Discovery Services Librarian, said that users have no idea what information about them is out there and how to find that out. Keeping data private and secure is important to a university that has policies about what data can be shared.
Cody Hanson, Director, Information Technology, University of Minnesota Libraries, noted that the challenge for us is to be able to provide resources needed to demonstrate to our partners what is authorized. How can we best manage the information? User identity does not need to travel throughout the whole system.
Amanda Ferrante, Principal Product Manager for Identity and Access Management, EBSCO Information Services, said that vendors must make sure that the needs and requirements of their customers are met. Authentication and authorization are closely coupled with privacy, and users want to know that EBSCO complies with the law. We need to have conversations about privacy: what information is collected, who it is shared with, who has control of it, and how that control is exercised. We need to give customers the ability to control their data while honoring our community’s deeply held value of privacy. It is often very difficult for vendors and users to have a conversation about privacy.
The closing keynote was a debate moderated by Rick Anderson, University Librarian, Brigham Young University. Debaters were Tim Vines, Founder and CEO, DataSeer, and Jessica Miles, Vice President, Strategy and Investments, Holtzbrinck Publishing Group. The topic of the debate was:
Resolved: Artificial Intelligence Will Fatally Undermine the Integrity of Scholarly Publishing
Before the debate, the audience vote on the resolution was 77% agree and 23% disagree. After the debate, the same vote was taken, and the debate winner was the person who convinced the most attendees to change their vote.
Tim Vines argued 3 reasons to be in favor of the resolution:
- It will soon become almost impossible to distinguish between articles written by humans and those written by AI. Humans are good at distinguishing faces, but we have no such facility with text, so we must rely on technology. Text is simpler than images, so it is easy for AI to get it right.
- The task of distinguishing human from AI writing will fall to editorial offices, which are already overstretched: ORCID is now 10 years old, and most journals still do not require authors to have IDs.
- Requiring authors to provide their data sets will be a significant difficulty for them.
Open science will not save us from the corrosive effects of AI-generated research. Much of the industry prefers not to ask awkward questions: every author is a publisher’s paying customer, so publishers have little incentive to spend effort finding fake articles. If the scholarly record is contaminated by many fake articles, how can authors do research that builds on them? Is scholarly publishing willing to do whatever it takes to find the truth?
Jessica Miles opposed the resolution, noting that scholarly publishing must constantly manage coping with technological innovations. Trust and transparency have been critical for a long time, and they will continue to be so in the age of AI. Scholarly communities have established trust using technologies such as DOIs, and they have invested heavily in technologies to find articles. Such communities provide a blueprint for how the industry will respond to technologies like AI.
People, not technology, work collaboratively. Automation has been instrumental in developing new processes, such as typesetting, that improve publication. AI is already ubiquitous in scholarly publishing; large language models (LLMs) are only one type of AI, and we should not lose sight of earlier forms such as machine learning. Elsevier and the American Chemical Society (ACS) have integrated AI tools into their processes to make publications more usable. We should learn from the past. There is a risk that small publishers will be excluded from these changes because of the cost of implementing the technologies.
- We must preserve trust by putting humans at the center of everything we do.
- We must ensure transparency of production in using AI, how it is employed, and how we make decisions.
- We must improve the data which AI systems are using: garbage in, garbage out. High quality training data must be used.
Tim: We are deploying the approaches we used to tackle yesterday’s problems. There are good uses for AI, but that does not mean there are no bad uses for it. We have not been successful in tackling these problems, and we are just waking up to their magnitude. Journals that do not want to certify their results as real are becoming repositories of junk. Scholarly publication is communication; if the sender changes what they are sending and the reader fails to understand it, communication fails. A new type of reader will understand the communication as well as a machine can. Can we spot the bad actors? AI will change things in many ways: most writing today is by people, but in the future much of it will be by machines. In 25 years, doing science will be profoundly different, and many labs will have their own in-house AI systems.
Jessica: Several of Tim’s statements are problematic; for example, he conflates concerns from editors of journals in widely differing subject areas. Publishers have significant incentives to detect fraudulent research quickly, and people, not technology, fuel the detection of research fraud. Transparency begets trust, and trust leads to transformation. Not all bricks in a wall will stand the test of time, and we now have the ability to train models to do specific things.
- Peer review is also technology. Maybe something will be developed that is more adversarial.
- We have done a lot of work, but it has not been properly implemented.
- Are we being shortsighted not talking about transformation? Are there other ways that AI could be used?
- If we see a proliferation of fake science and fake articles, will that not make the record of science infinitely more valuable because it has been examined by an editorial board, reviewers, etc.?
- AI offers a large increase in productivity to the scientific domain. Is there a future where peer review is just one aspect of this process? AI can enhance our ability as humans to accelerate reproducibility checks and detect fraud.
The final audience vote was 32% agree and 68% disagree, so the winner was Jessica, who had persuaded a substantial share of the audience to change their votes.
For Further Reading
Leah Hinds posted daily summaries of some of the conference sessions she attended on the Charleston Hub:
The 2024 SSP conference will be in Boston, MA on May 29-31.
Donald T. Hawkins is a conference blogger and information industry freelance writer. He blogs and writes about conferences for Information Today, Inc. (ITI) and The Charleston Information Group, LLC (publisher of Against The Grain), and he maintains the Conference Calendar on the ITI website. He contributed a chapter to the book Special Libraries: A Survival Guide (ABC-Clio, 2013), is the Editor of Personal Archiving: Preserving Our Digital Heritage (Information Today, 2013), and is Co-Editor of Public Knowledge: Access and Benefits (Information Today, 2016). He holds a Ph.D. degree from the University of California, Berkeley and has worked in the online information industry for over 50 years.