By: Nancy K. Herther, writer, consultant and former librarian
Citation analysis research has been progressing at an incredible speed. Certainly the web, internet access and digitization have played a major role. However, much of the energy and development is happening at the level of the Internet Generation – folks who have never had to work through the pre-existing Web of Science and are, perhaps, able to provide fresh insights and approaches that traditional companies have not been able to perceive. Many of these ideas and companies are likely to be absorbed into larger, more established companies as time goes on – but maybe not. There is energy and intelligence emerging from the bottom up today that is changing our information world.
Mason Hayes is one amazing example of how this new generation is taking a fresh look at information and applying data science to make research assessment and access more readily available and malleable than ever before. He received his B.A. in Economics and International Studies last May and is currently in the Master's program in Economics at the Toulouse School of Economics in France.
Mason’s academic research projects include: “capital concentration in Argentina at the beginning of the 20th century, populism and economic inequality in Turkey, the optimization of pay-what-you-want pricing models, and the relationship between increasing industrial concentration and inequality in the United States.” Additionally, he is now “working part-time as a Marketing & Business Consultant at scite.”
ATG contacted him to get his perspectives on scite, citation analysis and the future role of scite in this ecosystem.
NKH: In your recent research paper on scite, you wrote “researchers and even entire journals have been found to engage in citation manipulation, or purposely and artificially increasing the citation count for a given article, journal, or author,” and that “manipulating one’s own h-index has been shown to be quite easy.” How easy is it to police these efforts today? Are journals, institutions and professional societies doing an adequate job?
MH: Regulating this type of manipulation is challenging, of course. In regard to how journals, institutions or publishers are currently attempting to solve this, I do not know; it is not something I have studied, so I will defer judgment here.
However, I can say that I think that citation manipulation is not at all the central problem with how citations currently work in academia/science. The main issue is that citation indices are based on the number of citations, and we lose all context or meaning of those citations. For example, a study could be cited 100 times with each citing article failing to support the study’s findings, and we would know just as much as if it was supported 100 times. Clearly, there is a lot of information that we are missing out on if we analyze citations in this way. This is exactly the problem that scite is trying to solve in scientific research by providing a quick and powerful way to analyze citation statements in their full context.
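To make that point concrete, here is a minimal Python sketch. It is not scite's code or API: the label names, the `tally` helper, and the two papers are assumptions made up for illustration. It shows two hypothetical papers that look identical by raw citation count but are opposites once each citation carries a classification.

```python
from collections import Counter

# Hypothetical labels, loosely modeled on the supporting / disputing /
# mentioning classes described in the interview. This is an assumption,
# not scite's actual data format.
LABELS = ("supporting", "mentioning", "disputing")


def tally(citations):
    """Count citation statements per classification label."""
    counts = Counter(citations)
    return {label: counts.get(label, 0) for label in LABELS}


# Two made-up papers, each cited 100 times.
paper_a = ["supporting"] * 100
paper_b = ["disputing"] * 100

# A count-based index cannot tell the two papers apart...
print(len(paper_a), len(paper_b))

# ...while the per-label tallies are completely different.
print(tally(paper_a))
print(tally(paper_b))
```

Any metric built only on `len(...)` treats the two papers the same; the tallies keep the context that the raw count throws away.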
NKH: You note in your bio that you “have created an R package, sciteR, to gather journal-level citation data from scite, a Shiny web app to track the spread of COVID-19, and a simple R script to quickly visualize cryptocurrency price trends.” Did you find this an ‘easy’ adaptation process? Were the folks at scite able to provide any support/advice/guidance?
MH: (Here I’ll refer only to the R package, sciteR, since the others are simply personal projects unrelated to scite.) In my final semester of undergrad, where I was studying Economics, I took a Data Science course. At the time I was struggling to come up with a topic for the final research project. I spent some time searching for papers that might inspire me to dig deeper into a specific topic; funnily enough, I would check some of those papers on scite to see how others had cited them. But it wasn’t until later that the thought occurred to me – how about using scite’s data for the Data Science project?
The scite team was super helpful from the start. They allow free use of their data for non-commercial research purposes, and after I sent an email asking for permission to use their data, they quickly helped me get set up with the scite API, and that was that. Before the project I had no idea how to make an R package, and I don’t even think I knew what “API” stood for. But the scite team was super helpful and sent over instructions on everything I would need to get started!
From there I wrote an R package for my own personal use, to make it easy to gather the data I wanted for the project. To be clear, the R package is unaffiliated with scite – I made it just for myself before I was working with scite; at the time I was just an undergraduate student interested in what scite was doing to make science more reliable.
NKH: Since the article I’m working on is looking at scite, would you be willing to comment on the idea and contribution of scite to scientific credibility and progress? The system continues to develop and evolve. Your thoughts on the potential for scite-based systems and their future evolutions?
MH: The main, central idea of scite is simple: make science more reliable. And the key to doing this is Smart Citations, which show how a paper has been cited by providing the context of the citation and a classification describing whether it provides supporting or disputing evidence for the cited claim.
For any published article, without scite you can see data such as, “This article has been cited 50 times,” and the values of some citation indices, but that’s about all the information you can get without reading every citing paper, which could take weeks. With scite, you can see within seconds that among those 50 citing papers there are, for example, 20 supporting citation statements, 10 disputing citation statements, and the remaining citations simply mention the article. And you can see each citation statement in its entirety, so you know exactly which claims are supported or disputed in an article. Of course, an article can be cited in more than one way: maybe others have supported one finding of the article, but they dispute another finding. This makes it very easy even for non-experts to digest an article piece-by-piece, and in particular it allows researchers to analyze the quality of an article much faster.
That’s one thing I love about scite; despite how powerful of a tool it is, the barriers to using it are really low. Anyone from high school students to doctoral students to Nobel Prize winners can benefit from using scite in their research.
I think that scite is still in the very early phase of its contributions to scientific research. Because of the nature of what scite does – indexing full-text articles and classifying each citation statement – it has very large network effects. That is, the more full-text articles that scite indexes, the more citation statements there are in the database and the more valuable it becomes for academic publishers, researchers, students, and anyone who is interested in science. scite has some big indexing agreements lined up this year, and I think it is safe to say we’re on track to pass 1 billion citation statements by the end of 2021.
NKH: I know you are now a Master’s student at the Toulouse School of Economics, and you state that you have worked with scite folks. Could you comment on your perceptions of the company, its leadership and staff?
MH: In November 2020 I started as a Marketing and Business Intern at scite. For me, starting to work with scite has been amazing. This is a tool that I have been using and recommending to people for over a year now, and I never thought I would be working here! So when I saw that scite was looking for an intern I applied the next day. The first three months have been challenging but exciting. I still have a lot to learn as I go.
It is really an incredible company and a great team. The entire team is working remotely, so I haven’t met any of them in person yet, but everyone has been very welcoming and supportive. It will be nice to finally meet the whole team in person when the situation allows it.
Since it’s a small team, it is easy to get involved in different projects. It’s a great feeling that even though I’m just an intern, any suggestions or ideas I have are considered by the company. I’m very excited to be a part of scite, and to see how the company grows and evolves over the coming years.
NKH: Thank you for your time and energy! Keep up the great work!
There are so many amazing developments in research metrics today and, clearly, the next generation has things well in hand to move citation analysis into the 21st century! I think Gene Garfield would be mighty pleased!
Nancy K. Herther, writer, consultant and former librarian with the University of Minnesota Libraries