By Donald T. Hawkins, Freelance Conference Blogger and Editor
The National Information Standards Organization (NISO) convened its second NISOPlus conference on February 22-25, 2021 with the theme “Global Connections and Global Conversations”. It attracted 835 attendees, four times the attendance at last year’s meeting. The organizers wanted an event at which attendees could discuss and share their issues with others who had similar ones, not only at the conference but afterwards. Because one of the advantages of a virtual conference is that participants do not need to travel to attend it, the audience was global, and a map of attendee locations showed that NISO succeeded admirably in that aim.
The schedule was an experiment and differed from that of a traditional conference because the audience came from 26 countries. Daily sessions began in mid-morning Eastern Time and continued through mid-afternoon. After a break, sessions resumed in early evening and continued until between 10 and 11 PM (midday in Asia and Australia), which accommodated presentations by speakers from that part of the world.
Every session had time for discussion, and many of them were recorded in advance so that they could be accompanied by a transcript—a nice touch.
Well-known science fiction novelist Cory Doctorow began his keynote address by noting that big technology companies control our lives, which is a problem because unchecked power leads to bad outcomes. Our lives are better when we exercise self-determination; companies should not spy on us or manipulate us. Facebook and similar companies are conducting non-consensual experiments on millions of people. Taken at face value, such actions are extremely alarming and self-serving; they are leading to monopolies that subvert evidence-based practices. Today’s digital technology monopolies are not like those we faced in the past because they have additional technology available to them. Competitive compatibility (the ability to build new products that interoperate with existing ones) is at the heart of all technology, and monopolies distort public policy: only a small number of people now control our digital lives. Doctorow concluded by warning that monopolies are bad because of the harm they do to democracy and because they are taking away our free will, even though many people are thriving under the status quo.
Laurent Le Meur, CTO, European Digital Reading (EDR) Lab, and Marisa DeMeglio, Software Developer, DAISY Consortium, discussed strategies for ensuring that e-book accessibility is done well. Many users still face challenges when they simply want to acquire an e-book. Accessibility is the guiding principle for e-book work, and soon accessible formats will be the standard way we get e-books and other content. Only about 10% of the world’s production of e-books is accessible, and publishers are only now starting to create accessible e-books.
The Web Content Accessibility Guidelines (WCAG), developed by the World Wide Web Consortium (W3C), explain how to make web content more accessible to people with disabilities. About 30% of the guidelines can be checked automatically, and the remainder must be checked manually by experts. ACE, a free open source app developed by the DAISY Consortium, is an automated accessibility checker for e-books that can be integrated into a production workflow or used independently. The SMART system guides the user in the manual checking process.
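The split between automated and manual checking can be made concrete. One of the simplest checks that a tool like ACE automates is flagging images that lack a text alternative, a core WCAG requirement. The sketch below is hypothetical illustration code (not part of ACE or SMART) using only Python’s standard library:

```python
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    """Flag <img> tags that lack a non-empty alt attribute --
    the kind of WCAG check an automated tool can perform."""
    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            if not attr_map.get("alt"):
                # Record the offending image's source for the report
                self.missing.append(attr_map.get("src", "(no src)"))

checker = AltTextChecker()
checker.feed('<p><img src="fig1.png" alt="Map of attendees">'
             '<img src="fig2.png"></p>')
print(checker.missing)  # images that would fail the automated check
```

Whether a present `alt` text is actually *meaningful* is exactly the part that cannot be automated, which is why the remaining ~70% of the guidelines still require expert review.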
Publishers must certify that their products are fully accessible, which is a burden because it is costly. Production workflows are certified rather than individual e-books. The whole process must be made accessible—from production to accessing to purchasing. Booksellers and public libraries are asked to make their websites accessible.
Software systems to read e-books aloud are being developed. The Thorium Reader is a free app from the EDR Lab that just became available. It runs on Windows 10, Mac, and Linux PCs and will have a facility to generate accessible books.
Research Data: Describing, Sharing, Protecting, Saving
Standardizing and aligning journal and funder data policies
Iain Hrynaszkiewicz, Director, Open Research Solutions at PLoS, and Natasha Simons, Associate Director, Data and Services, Australian Research Data Commons, discussed policies to incentivize data sharing by researchers; when data sharing became mandatory at PLoS and BioMed Central, a large increase in sharing resulted. The research data policy landscape is evolving; more funding agencies (approximately 22% of them) now require data sharing, which publishers and journals are obliged to support. The unintended consequences of these policies were new initiatives from many publishers, multiple similar policies and terminologies, different levels of support, and confusion among researchers. The Research Data Alliance (RDA) was therefore engaged to help stakeholders, and it developed 14 principles to encourage data sharing and increase the adoption of standardized research data policies. Although stronger policies increase data sharing, they require more resources; sharing is an investment in the future. The RDA policy framework is being used by some Springer Nature journals and by all PLoS journals. Some questions for NISO to consider:
- What problems would be solved by creating formal standards for policies?
- Where would alignment of funders and policies have the greatest impact on data sharing?
- Where is more work needed to build out policy features or scope?
The Mystery of the Data and Software Citations…Why They Don’t Link to our Papers and Credit the Authors
Shelley Stall, Sr. Director, Data Leadership, American Geophysical Union (AGU), said that developing a map of citation linkages is done by building on the work of others and requires much effort. Many of the primary linkages come from repositories, not papers. The goal for journals is that all citations and software references be machine readable.
Patricia Feeney, Head of Metadata at Crossref, illustrated how citations become distributed through Crossref: members deposit metadata, including data or software citations; Crossref matches the data citations with DOIs; and the citations are then available via APIs. But these steps are difficult: members must collect metadata as part of their workflows and understand the best route for citations; identifying whether a citation is a reference to data or software is difficult; and matching citations to DOIs works well for journals but not so well for data. It is unlikely that data or software has a Crossref DOI, and many publishers do not collect data DOIs. Changes are being made to support identifiers beyond DOIs and to add data and software as citation types.
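One reason matching is hard is that a deposited citation is often just free text, and a DOI may or may not appear in it. A minimal, hypothetical sketch of the very first step (spotting a DOI-like string in a reference), using a pattern adapted from Crossref’s published guidance on DOI matching:

```python
import re

# Pattern adapted from Crossref's recommended DOI regex; real
# matching must also handle the many references with no DOI at all.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+')

def extract_doi(citation: str):
    """Return the first DOI-like string in a free-text citation, or None."""
    m = DOI_RE.search(citation)
    return m.group(0) if m else None

print(extract_doi(
    "Smith J (2020) Ocean data. Zenodo. https://doi.org/10.5281/zenodo.123456"))
print(extract_doi("Smith J (2020) Ocean data, v2"))  # no DOI present
```

When the second case occurs, as it frequently does for data and software, the citation has to be matched against bibliographic metadata instead, which is the step Feeney noted works far less reliably for data than for journal articles.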
Day 2 Keynote: Connecting the World Through Local Indigenous Knowledge
In her second-day keynote address, Margaret Sraku-Lartey, Principal Librarian, CSIR-Forestry Research Institute of Ghana, said that COVID-19 has virtually brought the world to its knees, affecting every sector of the economy and spawning changes in digital behavior. New trends like remote working and telemedicine are not likely to disappear any time soon. There has been a tremendous improvement in how information is created, transmitted, and used. Researchers build on published literature, which is made possible through their collaboration with publishers and information managers, so people to support the knowledge-sharing process are always needed. Such people have always been concerned with explicit knowledge, but little attention has been paid to tacit knowledge. Indigenous knowledge (IK) is knowledge that has been transferred orally and spans several generations. Today we risk losing the knowledge stored in people’s memories; when people die, nobody can benefit from it. According to UNESCO, IK is a basic component of a country’s knowledge system and forms the basis for local-level decision making.
Applications of IK include:
- Health. There is a need to find new cures for diseases, especially viruses that can cause epidemics, such as SARS, Ebola, and COVID-19. One of the best ways to find such cures is to talk to local people. Forest plants and products derived from them are used to treat various ailments, and many remedies based on them have been used for generations and have become generally accepted as viable treatments; for example, IK led to the concept of vaccinations, and in Ghana the “miracle berry” is used to sweeten sour foods. It has been proposed as a treatment for taste changes experienced by people undergoing chemotherapy treatments for cancer and also for people with diabetes.
- “Sacred groves” are small patches of the original habitats or forests of various dimensions. They are treasure troves of knowledge that have cultural, historical, and scientific benefits and provide valuable medicinal plants and herbs that can serve as a refuge for threatened species. Sacred groves are common in many developing countries but their impact may be diminishing in some places. Some of their benefits are:
- Preserving local water supplies,
- Conserving rare and threatened plants,
- Creating cool microclimates,
- Conserving nature for the benefit of society, and
- Managing local resources.
- Living libraries. Many local libraries do not have books; instead, people substitute for books. Living libraries have a distinct way of storing and retrieving knowledge and information. People carry history in their memories and can therefore be treated as libraries and custodians of knowledge; because much of what they hold is centuries old, the people themselves can be regarded as librarians.
IK is thus an important source of developmental information, and it is imperative for all information personnel to begin to learn proactively about IK and meet the needs of local populations. A comfort zone may be a beautiful place, but nothing grows there, so we must get involved, as some librarians did at the peak of the COVID pandemic when they began making face masks. We must recognize local people as contributors of knowledge.
To move forward, each player must be recognized as an equal partner, and there must be mutual respect for the knowledge and collaboration between them. Let us come together now and start a global conversation on IK to create global connections.
Innovative Forms of Scholarly Publishing
In this session 3 panelists were asked to respond to questions from the moderator, Robert Boissy, Director, Account Development, Springer Nature. Panelists were Stephanie Dawson, CEO, ScienceOpen; Sara Cohen, Sr. Acquiring Editor, University of Michigan Press; and Kath Burton, Portfolio Development, Humanities, Routledge Taylor & Francis.
What are some significant projects in scholarly publishing?
- Bridging the divide between theory and practice by collecting books that show the breadth of subjects being published and the impact of the work. Recognize the need for publishing to accommodate new types of research and represent the evolution of digital scholarship.
- Hip hop as scholarship has never been published before. But a book is in progress, and there is a need to work with the author to decide what to publish and how to handle peer review. The Fulcrum platform allowed us to publish digitally and see the lyrics as the book evolved.
- Can we speed up the research cycle by sharing work in progress? Transparent workflows are based on an open review of preprints in which editors decide whether an article is appropriate for the journal and it is then published OA. After at least 2 open reviews have been published, the editor decides whether to revise the article or accept it for publication to the website. Thus, early results can be made available during the review process. Preprints are here to stay as a means of getting research out faster. ScienceOpen is now working on related preprint processes.
The publishing process is moving toward openness. How is that working?
- The University of Michigan has a strong commitment to accessibility across its publishing program and is also strongly committed to diversity, equity, and inclusion. The more readers we can reach, the better it is for authors. OA is not occurring across all disciplines; the big challenge is finding funding to make works available OA.
- “Open” once applied at the article level, but now it is applied across the whole publishing process. Real collaboration requires sharing. Authors are comfortable with some parts of the OA process but not others; in the end they just want to publish their articles. Libraries need to think of new and attractive ways to showcase OA.
- Collaboration is particularly valuable for opening up the elements of research, for viewing openness broadly, and for giving authors insights into new forms of knowledge. Workflows need to be adapted to accommodate more OA processes.
What about the structure of publishing itself: what deserves to endure and what walls can come down?
- An essential part of the work that academic publishers do is to validate peer review. The walls of the black box of peer review can come down. The purpose of peer review is to identify and reward excellence; we need to see more innovation in business models and reward excellence in novel ways.
- Humans are what must endure in an increasingly digital publishing system. We can take advantage of technologies, but we need people to work them out. Barriers to collaboration can come down.
- Humans vet projects in their early stages; curation and vetting add value to publishing programs. There is a lot of room to think about what peer review looks like, such as rethinking who counts as an expert in peer review. Scholarship should move away from text as the primary medium and publish works that do not look like a monograph.
FAIR Data Principles and Why They Matter
Brian Cody, CEO and Co-Founder, Scholastica, presented some background on FAIR (Findable, Accessible, Interoperable, Reusable) data and findings from a report on journal production and access by society and university publishers that was issued in 2020. The data were taken from a survey asking about article formatting, layout processes and procedures, metadata tagging standards, OA journal development, and funding approaches.
Although FAIR can be thought of as low-hanging fruit for scholarly publishers, there can be significant problems with it, such as:
- Data are not findable because the metadata does not point to the data set,
- Variable names in the data are not familiar,
- Data are not in a usable format or are proprietary,
- Author name variations, and
- Few ORCID IDs.
In an UnFAIR world, research is hard to do and its pace is slow. Being FAIR is not easy and takes time and patience to do well. Brian recommended that 5% of a research budget be spent on a data management plan for FAIR data.
Many authors do not know anything about FAIR, so information about it should be put into a journal’s author instructions. Authors should be prepared to include a data availability statement in their manuscripts, and publishers should require authors to put their data in a FAIR repository. Metadata should be FAIR and cited correctly using the Journal Article Tag Suite (JATS) format. ORCID IDs for all authors should be verified with the JATS4R (JATS For Reuse) validation tool.
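Part of that verification can even be done locally: the final character of an ORCID iD is an ISO 7064 mod 11-2 check digit, so obviously malformed iDs can be caught before metadata is deposited. A minimal sketch (hypothetical helper, not part of the JATS4R tool):

```python
def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the ISO 7064 mod 11-2 check digit of an ORCID iD,
    e.g. '0000-0002-1825-0097' (a sample iD from ORCID's documentation)."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    total = 0
    for ch in digits[:15]:          # checksum covers the first 15 digits
        total = (total + int(ch)) * 2
    remainder = total % 11
    check = (12 - remainder) % 11
    expected = "X" if check == 10 else str(check)
    return digits[15].upper() == expected

print(orcid_checksum_ok("0000-0002-1825-0097"))  # True
print(orcid_checksum_ok("0000-0002-1825-0098"))  # False: bad check digit
```

A checksum pass only proves the iD is well formed, not that it belongs to the named author; confirming that still requires a lookup against the ORCID registry.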
Stephen Howe, Senior Product Manager, Copyright Clearance Center (CCC) discussed leveraging FAIR data principles to construct the CCC Author Graph. For example, the huge increase of articles about COVID is well known, and it has highlighted the need to find suitable peer reviewers quickly, so a visualization of the authors of the articles—the CCC Author Graph—was built.
Using FAIR data plus analytics gives us the knowledge to produce these graphs and transform data into actionable information. To reliably extract the authors from the data, we rely on it being FAIR. In reality, however, data quality issues are numerous and common, and they present obstacles to extracting knowledge from the data. The rush to publish in 2020 has resulted in many more errors.
Paul Stokes, Product Manager, JISC said that although most people think that FAIR is a good thing, it is not good enough. Just because something has a simple snappy acronym does not mean that it is right. Unless flaws are addressed, the concept of FAIR is doomed. Some things that are wrong with FAIR are:
- Costs are cumulative and ongoing, and cheaper storage leads to greater data use which leads to higher costs.
- Where’s the value? Do you know what your data is worth?
- Is the data reliable?
Perhaps we need a better name for FAIR: FAIRER = FAIR + Economically viable and Reliable, or FAIREST = FAIR + Environmentally friendly, Sustainable, and Trustworthy. But we do not know how to produce FAIREST data!
What Has 2020 Taught Us About the Information Ecosystem?
By any measure, 2020 was a most unusual year. This session featured 8 panelists discussing what they have learned. The speakers were librarians, publishers, and vendors from widely dispersed areas of the world; each gave a brief presentation in response to several questions. The result was a highly informative view of changes in a variety of places and fields.
C. K. Ramaiah, Professor, University of Pondicherry, India
People became more aware of the need for cleanliness and looked out for others, cooperating more with the authorities. Occurrences of loneliness and depression, particularly among students, increased. University budgets were cut, but there were no job losses. Activities like conferences and seminars were moved online, and the campus was closed. Fortunately, health care was maintained despite shortages of medical equipment. Faculty, staff, and students cooperated extensively with the WHO guidelines. Training programs and webinars were conducted online, and they are continuing. Many scholars started publishing their preliminary papers, and the number of downloads and virtual uses of the library has increased tremendously. Preprint sharing is here to stay.
Stephanie Dawson, CEO, ScienceOpen
Faster processes will be a significant result of 2020. There was a global effort to understand the science of the virus and to develop new vaccines and treatments in record time. Many researchers posted their results as OA preprints on systems like bioRxiv and medRxiv. The increased usage suggests that researchers had positive experiences, which indicates that posting results to a preprint server will become normal, at least in the biomedical sciences. Publishers will find ways to offer preprint services to this community.
Sandy Hirsh, Associate Dean for Academics, San Jose State University
Higher education has undergone massive changes in the way that courses are delivered and faculty and staff work because of the pandemic. Higher education will probably return to more face-to-face interaction in 2021, but remote work will still be allowed and will continue. However, it is unlikely that we will return completely to pre-pandemic practices. For example, formerly there was a concern about remote work and monitoring whether people were actually doing their jobs. Some people love the convenience of working from home, but others miss the interaction with colleagues. The sudden shift to remote work was a major adjustment for both faculty and students; many students do not have facilities at home to work remotely.
Offering a mix of hybrid courses will continue for a long time and will be a major shift in how universities conduct education, which will become more student-centered. Some courses require in-person activities, such as labs, etc. Economic uncertainty and lack of resources are significant challenges. The pandemic has forced higher education to try new ways of teaching, learning, and working that probably would not have happened at all, or would have taken a much longer time to happen.
A survey of 31 deans, directors, and program chairs who are members of the Association for Library and Information Science Education (ALISE) confirmed some of the points made in this presentation. Areas that decreased because of COVID included international student enrollment, study abroad programs, and course offerings. Areas needing reconfiguration were student facilities for children, effective communication between faculty and students, and camaraderie among faculty. Positive results of the COVID pandemic included streamlined operations, increased structure in faculty and staff meetings, use of Zoom to eliminate long commute times for faculty, moving services online, more online offerings and hybrid courses, and use of social media to stay connected with students.
Oya Rieger, Sr. Strategist, Ithaka S+R
2020 has taught us about the fragility and resilience of our information system and has had a major impact on preprints. The pandemic caused more urgency in research on vaccines and treatments for COVID, so researchers began sharing their results in more preliminary phases, which has raised concerns about publicizing unvetted research. Some journalists began reporting on preprints without indicating that they were about work in progress, and the level of attention to preprints has increased awareness of their virtue but also the risk of using the results before peer review has been completed. We can expect these effects to continue.
Mohamed Ba-Essa, Manager, Preservation and Digital Services, King Abdullah University of Science and Technology (KAUST)
Established in 2009, KAUST is a new university, so most of its library collections and services are online. In 2017, critical processes of library services were identified, so when the pandemic occurred, there was less impact on users. All resources are available online with a single sign-on. A virtual training class was launched in March and was very successful; everything is recorded and can be accessed as desired. A focus on well-being was appreciated. Dashboards were developed to monitor the status of library services in real time, which was very helpful to the staff. Finally, electronic records management and archives were also affected, and all physical records were digitized. The effects of the pandemic were thus mainly positive, and the only thing greatly missed was human interaction.
Clarissa West-White, Reference Instructor and Research Librarian, Bethune Cookman University
A primary concern was students’ access to computers and the internet. Many students live in areas affected by the digital divide, so the initial shutdown was difficult for them. The library extended the loan period, canceled fees, and provided information on places with free Wi-Fi. When the supply of laptops could not meet demand, the university purchased more. Libraries have been digital for a long time, so the transition to remote reference and support was virtually seamless. But not all librarians were proficient with platforms such as Zoom (scheduling meetings, inviting participants, etc.), and it was difficult to interact with students remotely and to teach without using body language and facial expressions to gauge comprehension.
Peter Simon, VP, Product Management, NewsBank
Our organization has become more focused on priorities for meeting the needs of various user groups, particularly college and university libraries and K-12 schools. In many cases, the transition was not so much a change but an acceleration of what was there all along. We need to focus on all of the virtual modes of delivering information, especially for learning environments and services to public libraries. Operations, meetings, and processes were already geared to being virtual at least some of the time, so our interaction with users, librarians, and other organizations has worked well and helped us to succeed.
Christopher Chan, Deputy University Librarian, Hong Kong Baptist University
2020 was overshadowed by events around the world, but in Hong Kong we were also dealing with the effects of the protests that occurred in 2019. Ultimately the protest movement led to the imposition of a national security law on Hong Kong, which included the banning of books. The biggest fear now is potential chilling effects, such as having to think carefully about what to say in professional settings; self-censorship is a concern, and it has brought home the fragility of the freedom we have in our information environment. What happens when our ideals of global cooperation and collaboration, such as we are enjoying at NISOPlus, run into political realities? How can we help each other?
The pandemic has been very disruptive, but it has been impressive how resilient our colleagues have been in adapting to online teaching and learning and providing access to information resources virtually. The pandemic has also been a driver of progress: we have implemented OpenAthens to make access as smooth as possible for our users, which would likely not have happened in normal circumstances.
Preservation of New Media: Roles and Responsibilities
Heather Staines, an Independent Consultant, and Mark Graham, Director of the Wayback Machine, Internet Archive, responded to a series of questions on this important topic.
What got you interested in digital preservation?
- Digital preservation is not looking back, it is looking forward. It makes the web more useful and reliable.
Digital preservation has been underway for some time now. Aren’t we done yet?
- People are looking for new ways to raise awareness of it. What can librarians preserve from their collections? Who is responsible for making preservation decisions? Preservation of local content and coverage gaps are the concerns of many librarians. We have just begun, and there is much more to do in making information more useful and helpful to people. For example, at the Internet Archive, many broken links in Wikimedia were discovered and corrected with a link to the archived version. Many newspapers have no formal digital preservation practice in place. Digitization opens opportunities that are not available with analog content, such as word frequency analyses, meta-analyses across papers and journals, etc. If we do not do this, we lose our ability to remember.
Who is responsible for preservation and how is that shifting?
- Publishers have been responsible from the outset. Now many of the journals that vanish are OA journals, and we see fewer preservation efforts among publishers. We need to make sure that the scholarly record remains intact. There is a rich variety of projects to preserve. Managers of preprint servers are still trying to find out whether there is a preprint business model, or whether there should be one.
What are some challenges and what are you most excited about?
- The Internet Archive has a catalog of over 15 million full-text published papers in the Wayback Machine. We have an exciting opportunity to work collaboratively with people who are passionate about preservation. Content is becoming dynamic and interactive, and authors can bring a lot of creativity to bear; how can we preserve that? Some content does not live in a single place, so we must preserve not only the content but also the connections among its parts. We also have supplemental material such as annotations, code, videos, etc. Interest in digital preservation is expanding worldwide.
What should attendees of this session take away?
- If you see it, save it. Web.archive.org is available to share and archive things from the public web. People should just try things, ask for help, and be of service to others. See articles from the NASIG Digital Preservation Committee website as well as the CLOCKSS and Portico websites.
Controlled Digital Lending (CDL) and New Models of Sharing
In her introduction to this session, moderator Oya Rieger, Senior Strategist, Ithaka S+R, noted that outdated copyright laws in the US are causing serious impediments to information distribution. They are a frequent topic at conferences, and the COVID pandemic has highlighted these challenges. Thus, CDL is being examined more closely.
Kyle Courtney, Copyright Advisor, Office of Scholarly Communication, Harvard University Library, presented an excellent tutorial review of US copyright law, which is established in the Constitution. The creator of a work has an economic monopoly on the work for life plus 70 years to reproduce the work, create derivative works, distribute copies of the work, and perform or display it publicly. Libraries serve the economic purpose of copyright by spending large amounts of money on building and circulating their collections. They have statutory exceptions to exercise some of the rights of the author without obtaining permission or paying a license fee. They can legally own and loan books because of Fair Use and First Sale exceptions. Fair Use means that teaching, scholarship, and research uses are not infringements of the author’s rights. First Sale means that the owner of a copy may loan or sell it without paying the author, and the author may not prevent subsequent sales. Entire industries depend on the Fair Use and First Sale exceptions to conduct their businesses: libraries may lend books; used bookstores sell copyrighted works; and eBay users may sell copyrighted works they own. The movement to digital books has interfered with many library rights because they are sold based on licenses (contracts) that can change terms of the copyright law and affect a library’s operations.
The pandemic greatly impacted libraries; when they suddenly closed, printed books were trapped inside and had no value. It has been estimated that as many as 650 million books sat on library shelves with no access to interlibrary loan, reserves, document delivery, or even reading books aloud in story times. Libraries did not get licenses to circulate e-books they had purchased because they were not required to do so under copyright law. CDL has emerged as a solution to these problems. It uses technology to replicate a library’s rights to loan digital works under controlled conditions: the number of digital copies in circulation at one time is limited to the number of physical copies the library owns. CDL is a combination of First Sale, Fair Use, and technology, and it is limited to books that are owned, not licensed. If a library digitizes a work for circulation under CDL, the physical copy is removed from circulation.
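The "controlled" part of CDL reduces to an owned-to-loaned ratio: concurrent digital loans may never exceed the number of physical copies the library owns and has withdrawn from the shelf. A toy sketch of that rule (a hypothetical class, not any real lending system):

```python
class CDLTitle:
    """Enforce CDL's owned-to-loaned ratio for a single title:
    concurrent digital checkouts never exceed the physical
    copies the library owns and has removed from circulation."""
    def __init__(self, owned_copies: int):
        self.owned = owned_copies
        self.loaned = 0

    def checkout(self) -> bool:
        if self.loaned < self.owned:
            self.loaned += 1
            return True
        return False  # every owned copy is already lent digitally

    def checkin(self):
        if self.loaned > 0:
            self.loaned -= 1

book = CDLTitle(owned_copies=2)
print(book.checkout(), book.checkout(), book.checkout())  # True True False
book.checkin()   # a digital return frees a copy for the next reader
print(book.checkout())  # True
```

The instant turnover in the `checkin`/`checkout` cycle is what makes the very short loan periods described below practical: no physical copy ever has to travel back to the building.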
Over 80 libraries have used CDL to loan digital copies of works they own. When the pandemic occurred, HathiTrust provided its Emergency Temporary Access Service (ETAS) to its customers. One advantage of CDL is that loan periods can be very short (as little as 1 hour), and once a user is finished with a book, it can be available instantly to the next requester: there is no waiting for the book to be physically returned to the library. Less restrictive licensing and technology, such as selling e-books to libraries without a license, would allow libraries to more easily maintain their vital functions in society.
Carlo Scollo Lavizzari, an intellectual property lawyer at Lenz Caemmerer in Switzerland, compared print lending and electronic lending. The concept of lending comes from the print world, so access and licensing models for e-books are generally compared to lending printed books, although there is often confusion between lending and document delivery services. E-lending is not lending in the traditional sense; it is more of a communication with the public.
Publishers are generally open to the concept of lending, but some library associations want to create more exceptions to copyright laws, which entails risks: jeopardizing existing publishing models, unfair competition and disruption of markets, legal exhaustion of rights granted by copyright law, endangering the ability of rights holders to develop new methods of accessing content, and market cannibalization when copies intended for lending get into the marketplace. Publishers would therefore prefer to license rather than sell e-books to libraries. Digital lending is different and has no geographic boundaries. The machines enabling it are very useful, but the social environment is also important.
This session featured representatives of 6 startup organizations discussing their products.
- Unsub: Libraries want to cut their Big Deals because many of the journals they receive are now available in OA. There has never been a better time to do so, but there is a lot of fear, uncertainty, and doubt. Unsub helps libraries evaluate the alternatives and answer questions like: What is our usage? What would cancellation look like? If we cancel, how much access would we retain, and at what cost? It is now used in more than 300 libraries.
- ReShare uses the ISO 18626 interlibrary loan standard to allow lending of digital books. It is open source and supports static and time-based media as well as content from commercial vendors, providing the same functionality as with physical books. Its next step is to develop the workflows that library staff will need.
- arXiv Labs for community innovation is for people who want to use arXiv data or enhance its features. There are 361,655 unique authors in arXiv, and their works have had about 1.5 million downloads.
- SciFlow is an online collaborative editor made for publishing. It streamlines the preparation of papers for final publication and has advanced grammar and spell checking, freeing researchers from the details of editing so that they can concentrate on the research.
- OA Books Toolkit is a single trusted resource for book authors to understand OA book publishing. Many authors are skeptical about OA books; the Toolkit is for them. It has multiple entry points and has had over 20,000 views in its first few months.
- OA Switchboard: Many people feel that the transition to OA is not going as fast as it could; it imposes a complex administrative burden, and the issues include transparency, prohibitive costs, and heated debates. The Switchboard provides a neutral communication platform: communication works better with an intermediary, and the Switchboard offers practical working solutions.
Miles Conrad Memorial Lecture
G. Miles Conrad was Director of Biological Abstracts (now BIOSIS Previews) in the 1950s. In 1957 he organized a meeting of 14 abstracting and indexing services to discuss the implications of government investments in scientific communications, which led to the formation of the National Federation of Abstracting and Indexing Services (NFAIS). After his death in 1964, the Miles Conrad Award and Memorial Lecture was established in his memory. Recipients have included leaders in the information industry, and the lecture became a highlight of NFAIS and now NISO meetings. This year’s lecturer was Heather Joseph, who has been Executive Director of SPARC (the Scholarly Publishing and Academic Resources Coalition) since 2005.
Since Heather joined SPARC, the landscape of openness has changed significantly, so it was not surprising that she entitled her lecture “In Pursuit of Open Knowledge” and began by reflecting that she has spent the last 32 years of her career working on scholarly communications and advancing a system of sharing that is open by default and equitable by design. All of her work is advocacy and rooted in a social justice context. Human rights can be manifested by people at every level of society. Access to knowledge is a fundamental human right and is highlighted by UNESCO, which said “Everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific achievement and its benefits…” Access to knowledge is embedded in the UN’s Sustainable Development Goals. OA is a priority that facilitates the free flow of knowledge across national borders and is rooted in the principles of social justice. To be truly effective, our actions must embrace the 4 principles of social justice: access, participation, equity, and rights.
To illustrate these principles, here are 3 examples from Heather’s career:
- The American Astronomical Society made her comfortable with technology and knowledgeable about the business processes of publishing even before the internet evolved from the ARPANet. TeX and SGML exposed the role that markup languages played in converting text to be discoverable, searchable, and readable on the web. She learned about scientific communication, which is to help researchers communicate what they want, when they want it, and where they want it. The internet is an opportunity to do this better; our job is to share the information (the building blocks of knowledge), but the community determines the appropriate package.
- The American Association for Cell Biology took a new approach to sharing scientific information by exploiting the power of the internet. Its director put forward an “e-biomed” proposal that would make works openly available to all. The proposal met with fiery opposition: editors (including Heather) were afraid of it, but scientists liked it. Heather learned to consider viewpoints carefully and to shape a proposal to accommodate as many of them as possible. The e-biomed system became PubMed Central, and the journal Heather managed was the first to host its content entirely on it. Many people had not heard of OA before that proposal. The OA environment is author-driven, subsidized by funders, and free to publish on and to access. Authors retain ownership of their intellectual works, and peer review occurs when the community chooses.
- SPARC optimized using the internet to share research articles. The Budapest Open Access Initiative declaration in 2002 launched the global OA movement and tied together the knowledge-sharing and justice aspects. OA is the convergence of an old tradition and new technology to make possible an unprecedented public good: the worldwide distribution of peer-reviewed journal literature with completely free and unrestricted access to it. It is important to recognize that the declaration does not promote openness for openness’s sake, but openness to do specific things.
We are no longer talking about whether or why to get open, but how to get there. OA publishing has become the fastest growing segment of scholarly publishing, and acceptance of OA as a growth strategy is increasing. The UN and UNESCO have embraced OA for their global mission, and research funders are now among the leading advocates of OA. The Open Research Funders Group, a partnership of philanthropic organizations committed to open sharing of research outputs, is adopting and promoting openness and funds research as a part of its core mission.
Heather is often asked what she would change if she could return to the beginning of the OA movement, and she mentioned 2 things:
- Tackle the need to change incentives for OA earlier on. In the middle of the pandemic one of the first actions was to facilitate OA for all COVID articles so that they were fully machine searchable and available for text analysis. That database of articles has been downloaded over 150 million times! We should not have to wait for an emergency to create a corpus of machine-readable OA papers. We need strategies and solutions that address the whole picture: get better and more deliberate at looking inward.
- The name “OA” implies that all we care about is getting the information. But there is much more: we must enhance the global participation in knowledge production and dissemination, and particularly the equity aspect.
We have a large imbalance in the source distribution of articles in the literature. We need to think about large scale changes in improving the way we are disseminating scientific knowledge. We also need to think about improvements in the way we share knowledge on a systems level. We need to center equity and inclusivity decisions when we are making decisions on business models, technology, rights, behaviors to be rewarded and incentivized, and leadership or governance bodies. We can only make these choices by recognizing that every decision is critical. Once we make a decision, we lose our leverage for change, and inequities will become entrenched in the system. Knowledge sharing and access is a fundamental human right and cannot be treated as an after-thought. We must stop, slow down, and think deliberately.
We need to look at the barriers that we are inadvertently throwing up, such as language, be more inclusive of forms of scholarship beyond articles, and build a system that reflects the inclusivity we want. Can we connect openness and recognition? We should stop treating incentives and knowledge sharing as a market.
Day 3 Keynote: Japan Science & Technology Agency’s (JST’s) Moonshot Goal 1
Dr. Norihiro Hagita, Chair and Professor, Osaka University of Arts, and Director of the Japan Science & Technology Agency’s Moonshot Goal 1, noted that the goal of the Moonshot project is the realization of a society in which humans can be free of the limitations of body, brain, space, and time by the year 2050.
Moonshot is a bold new program for creating disruptive innovation, tackling challenges facing future society, and going beyond limits of technology without fear of failure. It has 7 goals; JST focuses on Goals 1, 2, 3, and 6.
Moonshot Goal 1 was started to overcome the challenges of Japan’s aging population, declining birth rate, and labor shortages by allowing the elderly and those engaged in nursing and child care to participate actively in society. Cybernetic avatar technology will make work and play available to everyone. The following 6 scenes are examples:
- Disaster relief: Over 1,000 cybernetic avatars will perform large scale and complicated missions in disaster sites. Users can conduct quicker rescue operations by consulting with international experts in cyberspace meetings while working in a physical space.
- Sports: Enjoy sports together regardless of age, physical limitations, or where you live.
- Holidays: Have a full holiday. Everything can be done while relaxing on the beach. You could take lessons from a pianist in cyberspace or enjoy a live performance of your favorite popstar.
- Health and longevity are protected by avatars. Farming can be done from anywhere. If my grandchildren live far away, I can see them anytime, so every day is fulfilling. Healthy life expectancy is extended and elderly people can continue to play an active role in society because it is possible to prevent and treat diseases with avatars.
- Creativity is maximized. Large-scale artworks can be created by one artist alone using an avatar, or multiple avatars can be controlled by multiple simultaneous users.
- Travel: Go anywhere with avatars; users feel the same sensations as with their bodies, therefore expanding the possible range of human activities. So we can work in diverse places and enjoy spontaneous travel by renting a local avatar while staying at home. Large scale complex tasks can be completed in a short time by remotely operating 10 or more robots.
Three targets have been established to achieve Goal 1:
Target 1: Avatar infrastructure for diversity and inclusion
- Development of technologies and infrastructure to carry out large scale tasks: one person can operate up to 10 avatars at once.
- Virtual reality will let us move seamlessly back and forth between cyber and physical space, allowing us to enjoy new lifestyles and new experiences, reduce the time and money spent on travel, and minimize the risks associated with overcrowding.
- Augmenting body, cognition and perception for a fulfilling life. Capabilities that have diminished due to aging and illness will be augmented with cybernetic technology to promote more social activity.
Target 2: Cybernetic avatar life
- Development of technologies that will allow anyone who wishes to raise their physical, cognitive, and perceptual capabilities to the top level. By 2050, our lifestyles will have changed dramatically, and we will have greater freedom in our choice of location and how we spend our time.
- We must consider ethical, legal, social, and economic issues.
Similar projects are occurring all over the world. One of them is the ANA Avatar XPRIZE, which has studied overcoming the limitations of space, time, and the human body.
Identifiers, Metadata, and Connections
Hocus, Pocus. Mixing Open Identifiers
This presentation began the third day of NISOPlus 2021 and featured representatives of several organizations discussing how identifiers work in our ecosystem. Rachel Lammey, Head, Special Programs, Crossref, said that mixing open identifiers into metadata makes connections between research works. Persistent identifiers (PIDs) exist for researchers, organizations, and things such as outputs, preprints, and reviews. Sometimes these PIDs exist separately, but connecting everything reveals their true power, and that relies on community-driven open identifiers and metadata. Crossref has a collection of metadata for over 121 million items and makes it openly available. Humans do not do well in isolation; we thrive when we are connected to other people and communities. Metadata plus identifiers help put research in context, enhance its discoverability, surface the reviews it has received and any concerns raised, and give credit to the reviewers.
Gabriela Mejias, Engagement Manager for Europe, ORCID, said that ORCID iDs identify over 10 million individuals. Each identifier is connected to the person’s research activities, affiliations, and funders, and it improves transparency and trust in researchers. Many organizations are now incorporating ORCID iDs in their records to identify individuals. The data are interoperable: entered once and used often. Data provenance is very important, which is why the dates on which data were added and modified are displayed. Works can also be connected to the records; 43 types of contributions are in the database. The road to open research is never complete, and we need to work together as a community.
Maria Gould, Product Manager and Research Data Specialist, California Digital Library, discussed the Research Organization Registry (ROR) which contains identifiers for organizations. The scholarly infrastructure is a set of pieces that we are trying to fit together to show a complete picture of research activities. The three main pieces are the researchers, their outputs, and their affiliated institutions. Finding and tracking research done by an organization is difficult because of name inconsistencies. ROR is the first identifier designed to be integrated into an open research infrastructure. It has 99,000 identifiers for organizations around the world. Everything in ROR is open and freely available and supported by DataCite and ORCID.
Helena Cousijn, Director of Community Engagement at DataCite, said that DataCite is a DOI registration agency that exists to connect research in order to identify knowledge. Assigning a DOI allows research to enter the ecosystem and be discovered, tracked and reused. When an ROR member registers a DOI, they also register the associated metadata. Then a PID graph can be drawn to show more connections such as the contributors to the research output, funders, users of the research, etc. Everyone in the industry should use PIDs for all entities, track and record connections between the PIDs, and make connections openly available.
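The PID graph Cousijn describes, in which works, people, organizations, and funders become nodes connected by registered relations, can be sketched as a tiny in-memory graph. The identifiers and relation names below are invented for illustration; real graphs are built from DOI, ORCID, and ROR metadata registered with agencies such as DataCite:

```python
# A minimal PID graph: nodes are persistent identifiers (DOI, ORCID iD,
# ROR ID), and edges are typed relations carried in registered metadata.
from collections import defaultdict

class PIDGraph:
    def __init__(self):
        self.edges = defaultdict(list)  # pid -> [(relation, target_pid)]

    def connect(self, source, relation, target):
        self.edges[source].append((relation, target))

    def related(self, pid):
        """Return the (relation, target) pairs directly connected to a PID."""
        return list(self.edges[pid])

graph = PIDGraph()
# A research output, its author, the author's institution, and its funder.
# All identifiers here are fabricated examples.
graph.connect("doi:10.0000/example.1", "hasAuthor", "orcid:0000-0000-0000-0001")
graph.connect("doi:10.0000/example.1", "fundedBy", "doi:10.0000/grant.7")
graph.connect("orcid:0000-0000-0000-0001", "affiliatedWith", "ror:00example0")

for relation, target in graph.related("doi:10.0000/example.1"):
    print(relation, "->", target)
```

Registering metadata alongside each DOI is what makes edges like these recordable at all: once every entity has a PID and every connection is openly available, traversing the graph answers questions such as "what research did this organization fund?"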
Deborah Wyatt, Vice President, Global Academic & Society Relations, Impact Science, part of Cactus Communications, said that Cactus brings research to life in new ways and brings communities together around knowledge and ideas. There has been a major evolution in how we can see and share information with each other. The internet has enabled connections that were not available previously. We now have a shared responsibility to work together and ensure that trusted research can be accessible to those that need it. The current pandemic is an example of our need to bring global multidisciplinary teams together.
The first questions that authors, publishers, and editors often ask are “How do we make research accessible so it will be seen?” and “How do we share content reliably and accurately?” It is not enough for an article simply to be published. A video can provide a simple summary of a study and show the key data accurately so the work can be understood within minutes. Feature articles and podcasts are very powerful in encouraging greater diversity and are widely used. We need to start regarding these formats as core components of research publishing and rethink what research communication really means. The Open Scholarship Initiative (OSI) is a huge step in that direction.
We also need to help researchers make their work understandable to a non-academic audience while ensuring that the content is accurate and the facts are not diluted or distorted. When communication is done correctly, citations and altmetrics increase. We can determine how the information is being shared by looking closely at new formats. AI can match relevant research to the right audience, so Cactus recently acquired an AI research startup to create the necessary infrastructure. Research can change lives and solve global challenges, so it must be FAIR and open.
Harini Calamur, Head, Impact Science, said that Impact hopes to ensure that communication is discoverable and reusable. We need to involve the researchers in every step, which is done by tailoring the communication to their needs. And of course, we must make sure that the work is attributed to all the right people.
Dario Rodighiero, a researcher at Harvard University, demonstrated how to map a conference using NISOPlus as his example. His work combines visualization and natural language processing. When conferences went online, attendees felt a need for new ways to orient themselves. In his map, speakers are placed on an imaginary topographical terrain. Terms are determined by text analysis of the presentations. Here is Dario’s map of NISOPlus 2021.
We can see that the speakers are connected by many terms from information and library science, research, and publishing. When you zoom in, you can see the speakers. This map shows Dario and his connections and also their relevant subjects.
The map can function as an instrument for speakers to examine their lexical content, for attendees to find talks by keyword, and for the conference committee to arrange panel discussions. Open data makes it easier to create such a visual mapping, increase information precision, and build scientific awareness among scholars.
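The core of Dario's approach, extracting terms from each talk and connecting speakers who share vocabulary, can be sketched with a simple bag-of-words affinity measure. The talk texts and speaker names below are invented, and the real system uses much fuller natural language processing before projecting speakers onto the terrain:

```python
# Connect speakers by shared vocabulary: take each talk's most frequent
# content words, then link speakers whose term sets overlap.
from collections import Counter

STOPWORDS = {"the", "and", "of", "to", "a", "in", "is", "for", "on"}

def top_terms(text, top=5):
    """Most frequent non-stopword terms in a talk (a crude stand-in
    for real keyword extraction)."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return {w for w, _ in Counter(words).most_common(top)}

talks = {  # hypothetical mini-conference
    "Speaker A": "open access metadata identifiers metadata open access",
    "Speaker B": "identifiers metadata registries persistent identifiers",
    "Speaker C": "preprints retractions misinformation preprints trust",
}

speaker_terms = {s: top_terms(t) for s, t in talks.items()}

# An affinity edge exists where two speakers share at least one top term;
# these edges are what pull speakers near each other on the map.
speakers = list(talks)
for i, a in enumerate(speakers):
    for b in speakers[i + 1:]:
        shared = speaker_terms[a] & speaker_terms[b]
        if shared:
            print(a, "--", b, "via", sorted(shared))
```

Here Speakers A and B end up linked through "identifiers" and "metadata", while Speaker C, talking about preprints and retractions, lands in a different region of the terrain.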
Dario is working on a book to be published this year in French and English. The English version will be titled Mapping Affinities and will be open and accessible on the internet.
Misinformation and Truth: From Fake News to Retractions to Preprints
Can open access play a role to fight fake news?
Sylvain Massip, CEO of Opscidia, a French startup formed to help promote the reuse of research results in society as a whole, noted that research results are not well used outside of academia: people generally do not have access to information behind paywalls, results are difficult to verify (especially when only one article exists on a subject), and the results may not be discoverable. With 2 million articles published every year, finding the right article containing the desired information is very difficult.
Opscidia runs a Diamond OA publishing platform that is CC-BY licensed, APC-free, and open source. The company has built a prototype that performs scientific “fact checking” for R&D companies, government offices, and others, using text analysis of OA articles from the PubMed database to see whether claims in a research paper are confirmed or contradicted. There is usually no single study that settles a debate, so the conclusions of different studies must be compared. Indicators are then derived:
- Analysis of the sources, a standard way to fight fake news. It is generally easy to find where an article comes from, which may be enough to tell the whole story. Sometimes an article has been retracted, which is another indicator. A plot of the number of articles published each year can also be revealing; a sudden jump indicates that something significant happened.
- Semantic analysis: classifying articles with respect to the input statement to determine whether each one supports, is neutral toward, or contradicts the hypothesis being studied.
- Numerical data retrieval: helping people explore and visualize results, which provides an interface for exploring the literature on a specific topic.
So the conclusion is that OA can be useful in detecting fake news.
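The semantic-analysis indicator, classifying each retrieved article's stance toward a claim and then aggregating across studies, might be sketched as follows. The stance labels here are hand-assigned for illustration; in Opscidia's prototype they would come from text analysis of OA full text, and the thresholds are invented:

```python
# Aggregate per-article stances toward a claim into a simple indicator.
from collections import Counter

def claim_indicator(stances):
    """stances: list of 'supporting' / 'neutral' / 'contradicting',
    one per retrieved article. Returns a coarse verdict string."""
    counts = Counter(stances)
    total = len(stances)
    support = counts["supporting"] / total
    contradict = counts["contradicting"] / total
    # No single study settles a debate, so the verdict reflects the
    # balance across the whole retrieved literature.
    if support > 0.6:
        return "mostly confirmed"
    if contradict > 0.6:
        return "mostly contradicted"
    return "contested or unclear"

# Hypothetical literature retrieved for one claim:
stances = ["supporting", "supporting", "neutral", "contradicting",
           "supporting", "supporting"]
print(claim_indicator(stances))  # mostly confirmed
```

Combined with the source-analysis indicator (retraction status, publication-count trends), even a coarse aggregate like this gives a non-expert a quick read on whether a circulating claim has literature behind it.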
Reducing the Inadvertent Spread of Retracted Science
Jodi Schneider, Assistant Professor, University of Illinois Urbana-Champaign, described the RISRS2020 project (Reducing the Inadvertent Spread of Retracted Science), which addresses the problem of retracted articles continuing to receive significant exposure after retraction. A series of workshops of invited participants considered the problems caused by retractions and possible solutions.
Randy Townsend, Director, Publishing Operations, American Geophysical Union (AGU) noted that he is often contacted by people who think something is not right with a particular article published in an AGU journal, and they feel that systems of trust and professionalism have been jeopardized. Publishing has safeguards to protect the integrity of the content. Retractions were long believed to be career ending, and authors felt shamed; one author committed suicide after his article was retracted.
AGU has 22 peer reviewed journals in which nearly 38,000 authors publish articles that are reviewed by nearly 16,000 reviewers. The harms of retracted research are reputation damage, scientific dissonance, professional damage, and feelings of failure. Researchers build on previously published work, and when a link in that chain is broken, they move away from the truth. When an error is detected, authors are given the option of retracting or correcting the article, but even if it is retracted, it may continue to be cited.
Caitlin Bakker, Research Services Coordinator, University of Minnesota Health Sciences Libraries, said that from an information seeker’s perspective, the primary harm of retracted research is its potential use without knowledge of the retraction or the reason for it. One question that is frequently asked is “Aren’t retractions enough? Shouldn’t the retraction correct the scholarly record?” Unfortunately, the answer is “no”. For example, many doctors turn to review articles when they need information on a topic because they do not have time to check the primary literature. A study of one pharmacy review journal identified 1,396 retracted articles, of which 283 were cited 1,096 times; 384 of those citations occurred after the article was retracted and the retraction notice was published. Of course, citing a paper does not signify agreement; to refute an article, one must cite it. But frequently only a few citing articles refute retracted papers, which indicates that people are using retracted articles without knowledge of their retracted status. In a study of 144 articles in 7 databases, 40% of the references did not indicate that they were retracted; although 83% of the entries in the databases had an associated retraction notice, it was not linked to the original article or indexed with the subject headings. As librarians, we can account for retractions in knowledge production workflows, address inconsistencies in data display and transfer between systems, and educate end users about identifying and using retracted information.
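The librarian-side mitigation Bakker describes, checking references against retraction data before reuse, can be sketched as a simple lookup. The DOIs and the tiny retraction database below are fabricated; a real workflow would query a maintained source such as the Retraction Watch data or database retraction notices:

```python
# Flag retracted items in a reference list by checking each DOI against
# a retraction database. All identifiers and dates are invented examples.
retraction_db = {
    "doi:10.0000/study.42": "2020-05-01",  # DOI -> ISO retraction date
}

references = [
    {"doi": "doi:10.0000/study.42", "cited_on": "2021-03-15"},
    {"doi": "doi:10.0000/study.99", "cited_on": "2021-03-15"},
]

for ref in references:
    retracted_on = retraction_db.get(ref["doi"])
    # ISO-format date strings compare correctly as plain strings.
    if retracted_on and ref["cited_on"] > retracted_on:
        # The citation postdates the retraction notice: exactly the
        # "use without knowledge of retracted status" case above.
        print(ref["doi"], "was retracted on", retracted_on)
```

A check like this, run over a manuscript's bibliography at submission or over a systematic review's included studies, is one way to build retraction awareness into knowledge-production workflows.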
Hannah Heckner, Product Strategist, Silverchair, said that the job of a platform is to act as a container for its products, disseminate information to users (human or machine) with clarity, and allow publishers to communicate their brand while cultivating loyalty and trust. There is a lot of inconsistency in how publisher sites communicate the retracted status of an article, and a lot of opportunity to use metadata and added tags to communicate retractions. Impediments to dissemination can be reduced or eliminated when platforms collaborate with abstracting and indexing services, libraries, and publishers to maintain safety around content while still allowing research to be read and disseminated.
Michelle Avisar-Whiting, Operations Director and Editor-in-Chief of the Research Square preprint platform, remarked that it is fascinating to see how people use information and to think about how we have been hosting early outputs, i.e. preprints. Most preprint servers are not “anything goes”—they do not host pseudoscientific submissions or those that are ethically dubious or potentially dangerous, but they do not block papers with methodological flaws, poor or opaque reporting, or specious conclusions. Here are examples of 3 types of preprints.
- The misunderstood. An article was read as suggesting that because we get common colds, we are all immune to COVID, and so the reaction to the pandemic is a hoax. The article accumulated many downloads and views because it was widely repeated on Twitter.
- The overinterpreted. An article stated that requiring children to wear masks was child abuse, and it was seized upon to promote anti-mask narratives.
- The convenient truth. A preprint claimed that vitamin D is all we need to prevent COVID, that the government is hiding this from us, and that this is the reason minorities are disproportionately affected by the virus. Both the preprint and the published article were used to promote some strange conspiracy theories.
The lesson of these 3 examples is that the spotlight is on preprints now because they were the first articles to appear during the pandemic. We must screen out the bad articles and put disclaimers on the others. This diagram shows the 5 components of rigor used to determine trust, and we can apply them at the preprint stage.
Some articles are withdrawn quickly after being published, but we are still living with the awful Wakefield MMR autism study, published in The Lancet over 20 years ago, which caused significant fear and concern but was not retracted for more than a decade.
Preprints are not without their problems, but the problems are not intractable. We have a great, resourceful community that will find solutions.
Closing Keynote: What Does the Pandemic Teach Us About Trust, Reliability, and Information?
Dr. Zeynep Tufekci, Associate Professor, School of Information and Library Science at the University of North Carolina at Chapel Hill, and author of Twitter and Tear Gas (Yale University Press, 2017), began her keynote address by observing that we have a public sphere very much geared toward attention, and it has become individualized. The digital public sphere has increased the demand for information, which has affected our response to the COVID pandemic. Group and social dynamics have mixed with our attention. For example, in February 2020 Wuhan, China had been shut down for a month and COVID cases were popping up all over the world, but here the attitude was “What’s the big deal? Don’t worry. Keep calm and wait for the evidence. Don’t panic.” We were lectured for being panic-prone and told that COVID was no worse than the flu. The Trump Administration played a key role in this denial.
That is how groupthink works: people have an identity that confirms their beliefs. Some public health officials tried to warn us and were silenced. It was the same thing with masks: we were told it was harmful to wear one. Finally, ranks were broken, evidence arrived, and the information landscape switched to taking the pandemic seriously. Individual actions became crucial: wearing a mask, staying home, even prohibitions on outdoor activities like going to a park or the beach, where there was little evidence of the virus spreading.
We now have a different groupthink, and it has become hard to communicate positive things like the vaccine; small setbacks are treated as if they spell doom. We may think this only affects other people, but these are general human phenomena. The pandemic has been useful in one respect: it has given us a stress test that lets us see these human dynamics. Many people fell prey to them, even people we trust. Mere fact checking is not a solution. What happens when the authorities you rely on turn out to be unreliable, like the WHO? Was it misinformation to contradict the WHO? What is outside the bounds of reasonable scientific disagreement? Even people with excellent credentials like a Ph.D. are saying nonsense. How do we navigate such an information environment? No single scientist is right all the time, and no matter what position you take, there is a contrarian; science is a process. We cannot fit people into boxes but must find ways to iterate, get better, and consider the conditions under which we can convince people. Some things require different approaches.
SARS became significant because it came out that China was suppressing information. When Wuhan was shut down, that was a big sign: they were telling the whole world that this was a big deal. Right after that medical journals were flooded with papers from China which indicated that this was going to be a pandemic. This information was available to everyone, but many countries had different responses.
What can be done if there is a future epidemic? What did we do that caused some of the bad outcomes this time? It’s not confined to the US; some places in Europe had terrible responses. We have no monopoly on science. Many of the responses were due to blind spots and failings. We should have a task force and not blame scientists but study the responses to find ways to fix problems.
Could social media learn from scholarly publishing? Preprints and rapid peer review have been excellent. Comments can be very helpful if they are specific. Fraudulent papers can make us suspicious of legitimate ones, so we must be careful. There has never been a better time to be informed or misinformed!
Can we find ways to increase the density of good information? We need to know what to follow and look for, and find a healthier way to be open to challenges. Some disagreements have become vicious fights on social media; we need a platform that will encourage structured debate so we can clarify and get to the heart of the disagreements.
Plans for NISOPlus 2022 are beginning and will be announced later.
Donald T. Hawkins is an information industry freelance writer based in Pennsylvania. In addition to blogging and writing about conferences for Against the Grain, he blogs the Computers in Libraries and Internet Librarian conferences for Information Today, Inc. (ITI) and maintains the Conference Calendar on the ITI website (http://www.infotoday.com/calendar.asp). He is the Editor of Personal Archiving: Preserving Our Digital Heritage, (Information Today, 2013) and Co-Editor of Public Knowledge: Access and Benefits (Information Today, 2016). He holds a Ph.D. degree from the University of California, Berkeley and has worked in the online information industry for over 50 years.