On exploding ‘evaluation machines’ and the construction of alt-metrics

The emergence of web-based ways to create and communicate new knowledge is affecting long-established scientific and scholarly research practices (cf. Borgman 2007; Wouters, Beaulieu, Scharnhorst, & Wyatt 2013). This move to the web is spawning a need for tools to track and measure a wide range of online communication forms and outputs. By now, there is a large differentiation in the kinds of social web tools (e.g. Mendeley, F1000, Impact Story) and in the outputs they track (e.g. code, datasets, nanopublications, blogs). The expectations surrounding the explosion of tools and big ‘alt-metric’ data (Priem et al. 2010; Wouters & Costas 2012) marshal resources at various scales and gather highly diverse groups in pursuing new projects (cf. Brown & Michael 2003; Borup et al. 2006 in Beaulieu, de Rijcke & Van Heur 2013).

Today we submitted an abstract for a contribution to Big Data? Qualitative approaches to digital research (edited by Martin Hand & Sam Hillyard and contracted with Emerald). In the abstract we propose to zoom in on a specific set of expectations around altmetrics: Their alleged usefulness for research evaluation. Of particular interest to this volume is how altmetrics information is expected to enable a more comprehensive assessment of 1. social scientific outputs (under-represented in citation databases) and 2. wider types of output associated with societal relevance (not covered in citation analysis and allegedly more prevalent in the social sciences).

In our chapter we address a number of these expectations by analyzing 1) the discourse in the “altmetrics movement”, the expectations and promises formulated by key actors involved in “big data” (including commercial entities); and 2) the construction of these altmetric data and their alleged validity for research evaluation purposes. We will combine discourse analysis with bibliometric, webometric and altmetric methods, in such a way that both approaches also interrogate each other’s assumptions (Hicks & Potter 1991).

Our contribution will show, first of all, that altmetric data do not simply ‘represent’ other types of outputs; they also actively create a need for these types of information. These needs will have to be aligned with existing accountability regimes. Secondly, we will argue that researchers will develop forms of regulation that will partly be shaped by these new types of altmetric information. They are not passive recipients of research evaluation but play an active role in assessment contexts (cf. Aksnes & Rip 2009; Van Noorden 2010). Thirdly, we will show that the emergence of altmetric data for evaluation is another instance (following the creation of the citation indexes and the use of web data in assessments) of transposing traces of communication into a framework of evaluation and assessment (Dahler-Larsen 2012, 2013; Wouters 2014).

By making explicit what the implications are of the transfer of altmetric data from the framework of the communication of science to the framework of research evaluation, we aim to contribute to a better understanding of the complex dynamics in which a new generation of researchers will have to work and be creative.

Aksnes, D. W., & Rip, A. (2009). Researchers’ perceptions of citations. Research Policy, 38(6), 895–905.

Beaulieu, A., van Heur, B. & de Rijcke, S. (2013). Authority and Expertise in New Sites of Knowledge Production. In A. Beaulieu, A. Scharnhorst, P. Wouters and S. Wyatt (Eds.), Virtual Knowledge – Experimenting in the Humanities and the Social Sciences (pp. 25–56). MIT Press.

Borup, M., Brown, N., Konrad, K. & van Lente, H. (2006). “The sociology of expectations in science and technology.” Technology Analysis & Strategic Management 18 (3/4), 285–98.

Brown, N. & Michael, M. (2003). “A sociology of expectations: Retrospecting prospects and prospecting retrospects.” Technology Analysis & Strategic Management 15 (1), 3-18.

Costas, R., Zahedi, Z. & Wouters, P. (n.d.). Do ‘altmetrics’ correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective.

Dahler-Larsen, P. (2012). The Evaluation Society. Stanford University Press.

Dahler-Larsen, P. (2013). Constitutive Effects of Performance Indicators. Public Management Review, (May), 1–18.

Galligan, F., & Dyas-Correia, S. (2013). Altmetrics: Rethinking the Way We Measure. Serials Review, 39(1), 56–61.

Hicks, D., & Potter, J. (1991). Sociology of Scientific Knowledge: A Reflexive Citation Analysis of Science Disciplines and Disciplining Science. Social Studies of Science, 21(3), 459–501.

Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). Altmetrics: a manifesto. http://altmetrics.org/manifesto/

Van Noorden, R. (2010) “Metrics: A Profusion of Measures.” Nature, 465, 864–866.

Wouters, P., & Costas, R. (2012). Users, narcissism and control: Tracking the impact of scholarly publications in the 21st century. Utrecht: SURF foundation.

Wouters, P. (2014). The Citation: From Culture to Infrastructure. In B. Cronin & C. R. Sugimoto (Eds.), Next Generation Metrics: Harnessing Multidimensional Indicators Of Scholarly Performance (Vol. 22, pp. 48–66). MIT Press.

Wouters, P., Beaulieu, A., Scharnhorst, A., & Wyatt, S. (eds.) (2013). Virtual Knowledge – Experimenting in the Humanities and the Social Sciences. MIT Press.

Vice Rector University of Vienna calls for a new scientometrics

At the opening of the biennial conference of the International Society for Scientometrics and Informetrics (ISSI) in Vienna on July 16, Susanne Weigelin-Schwiedrzik, the Vice Rector of the University of Vienna, called upon the participants to reorient the field of scientometrics in order to better meet the need for research performance data. She explained that the Austrian universities nowadays are obliged by law to base all their decisions regarding promotion, personnel, research funding and allocation of research funds to departments on formal external evaluation reports. “You are hosted by one of the oldest universities in Europe, it was founded in 1365. In the last couple of years, this prestigious institute has been reorganized using your scientometric data. This puts a tremendous responsibility on your field. You are no longer in the Kindergarten stage. Without your data, we cannot take decisions. We use your data to allocate research funds. We have to think twice before using your data. But you have the responsibility to realize your role in a more fundamental way. You also have to address the criticism of scientometric data. And what they represent.”

Weigelin’s passionate call for a more reflexive and critical type of scientometrics is motivated by the strong shift in Austrian university policy with respect to human resource management and research funding. In the past, the system was basically a closed shop with many university staff members staying within their original university. The system was not very open to exchanges among universities, let alone international exchange. Nowadays, the university managers need to explicitly base their decisions on external evaluations, in order to make clear that their decisions meet international quality standards. As a consequence, the systems of control at Austrian universities have exploded. To support this decision making machinery, the University of Vienna has created a specific quality management department and a bibliometric department. The university has an annual budget of 380 million euros and needs to meet annual targets that are included in target agreements with the government.

On the second day of the ISSI conference, Weigelin repeated her plea in a plenary session on the merits of altmetrics. After a couple of presentations by Elsevier and Mendeley researchers, she said she was “not impressed”. “I do not see how altmetrics, such as download and usage data, can help solve our problem. We need to take decisions on the basis of data on impact. We look at published articles and at Impact Factors. As a researcher, I know that this is incorrect since these indicators do not directly reflect quality. But as a manager, I do not know what to do else. We are supposed to simplify the world of science. That is why we rely on your data and on the misconception that impact is equal to quality. I do not see a solution in altmetrics.” She told the audience, which was listening intently, that she has a constant flow of evaluation reports and the average quality of these reports is declining. “And I must say that a fair amount of the reports that are pretty useless are based on scientometric data.” Nowadays, Weigelin no longer accepts recommendations for promotion of scientific staff that mention only bibliometric performance measures without a substantive interpretation of what the staff member is actually contributing to her scientific field.

In other words, at the opening of this important scientometric conference, the leadership of the University of Vienna has formulated a clear mission for the field of scientometrics. The task is to be more critical with respect to the interpretation of indicators and to develop new forms of strategically relevant statistical information. This mission resonates strongly with the new research program we have developed at CWTS. Happily, the resonance among the participants of the conference was strong as well. The program of the conference shows many presentations and discussions that promise to at least contribute, albeit sometimes in a modest way, to solving Weigelin’s problems. It seems therefore clear that many scientometricians are eager to meet the challenge and indeed develop a new type of scientometrics for the 21st century.

Prospects of humanities bibliometrics? – Part 2

In the empirical chapters of his PhD thesis Following the Footnotes, Björn Hammarfelt tries out several methodological innovations. In a particularly interesting chapter, Hammarfelt traces the citations to one book, Illuminations by Walter Benjamin, and he tries to map the disciplines that cite the book. But he goes further and even teases out the specific parts of the book that are cited. He calls it Page Citation Analysis, which basically is a form of Citation Context Analysis as proposed by Susan Cozzens (1985). Alternatively, one could see this as going back to the very old philological tradition which was also engaged with the precise location of particular textual phenomena. Historically, bibliometrics is humanities research. In a preceding chapter, Hammarfelt analyzes the intellectual base of literary studies by analyzing the references from 34 literature journals. The way he collected these journals is interesting because he departed from the point of view of the literary researcher (and not the available database, a mistake often made in bibliometrics). He advocates the mixed use of citation analysis and library classifications, also an issue underdeveloped in the field of bibliometrics. Another interesting innovation is in the last empirical chapter where Hammarfelt analyzes research grants and compares them with journal publications. He argues that research grants are increasingly important in the life of a scholar and hence it would make sense to use them much more as a data source. He also notes the partly different role of references in grants compared to journal articles.

The various analyses in Hammarfelt’s thesis confirm what we generally know about the humanities as a set of disciplines. He tries to characterize the nature of the humanities by drawing upon theories from the sociology of science, in particular the theories that try to explain the social structure as well as the intellectual structure of working practices and communication patterns in these fields. His theoretical exposé focuses on the theories proposed by Richard Whitley and Becher and Trowler, but he also uses various theories that can be captured under the umbrella of “mode 2 knowledge production”.

Hammarfelt’s thesis confirms that the humanities can be characterized as divergent (rather than convergent), as very wide ranging and interdisciplinary (proven by the reference analysis performed by Hammarfelt), as rural in the sense of sparsely populated villages of researchers focusing on a particular topic (rather than busy laboratories that are more like bustling cities), and as fragmented (scholars are quite independent of each other which gives them a lot of freedom, but also makes it more difficult to speak with a common voice regarding resources). The interdisciplinary nature of literary studies seems to be rising, and Hammarfelt even detects a turn to the social, because he sees more connections between the humanities and specific fields in the social sciences such as gender studies, and post-colonial studies. So there might be a social turn after the linguistic turn some decades ago. Overall, the thesis concludes that three features are most important to understand referencing and citations in the humanities: the strong independence of humanities scholars; the rural organization; and the diverse audiences of the fields. The latter goes together with a rather low codification of the literature because it needs to be understood by a wide range of people.

What does this mean for the ways humanities scholars are being assessed in evaluation protocols? This topic is discussed in a variety of ways in the thesis. In the last part, Hammarfelt brings it together with a sketch of how bibliometrics could further develop to be of use in this more political sense. He advocates a role here, because bibliometrics can help unsettle existing power structures, like the lack of diversity in the universities. He specifically mentions the gender bias, the need to support interdisciplinary research, and the problems with the current peer review systems. But bibliometrics needs to change in order to play this role.

Reference:

Cozzens, S. E. (1985). Comparing the Sciences: Citation Context Analysis of Papers from Neuropharmacology and the Sociology of Science. Social Studies of Science, 15, 127-153.

Prospects of humanities bibliometrics? – Part 1

Humanities scholars are often confronted with the limitations of citation analysis in their fields. Often, the number of articles is too low to do any meaningful statistical analysis of publication or citation patterns. Moreover, many forms of publishing in the humanities (books, national journals, movies, dance performances etc.) are not covered in the bibliometric databases. As a result, both humanities scholars and bibliometricians often advise against using citation analysis in most fields in the humanities, in particular for evaluation.

So does this mean that a form of bibliometrics better fitted to the humanities is impossible? Not so, says Björn Hammarfelt, who recently received his PhD degree at Uppsala University for his thesis “Following the Footnotes”.

In this well-written and innovative thesis, Hammarfelt combines three different intellectual, and I would say also political, interests: literary studies, bibliometrics, and the sociology of science. His title “Following the Footnotes” has three different connotations. First, he tries to trace the creation, role and institutionalization of the reference, a specific element of a scholarly publication. Examples of references are footnotes, but one can also mention a document by putting the author’s name and the publication year in brackets and then list these documents at the end of the article, and one can also have references as endnotes. Hammarfelt argues that these seemingly technical differences have an important meaning and that we should pay careful attention to their format and position in the text. He also claims that reference practices in literary studies and in the humanities more generally have a partly different character than these practices in the natural sciences because the act of writing has a different role. To put it a bit bluntly, in the natural sciences an author is supposed to report “facts of nature” and obliterate herself from the text; in the humanities a scholar is expressing herself as a creative persona, and very visibly so. Both are rhetorical strategies; neither of them is a simple reflection of the reality of scholarly practice.

In the second meaning of the title, Hammarfelt tries to give us a glimpse of what will follow the footnote now that the landscape of scholarly publishing is developing so fast. Although his thesis does not focus on new forms of referencing (such as in Facebook, Twitter, or Spotify), it does give us insight into the current practices on which the future will build. The third meaning of the title is more fully developed: what follows the footnotes if these are translated into citations and into academic reputation for the author. Here the thesis deals with the very important topic of research evaluation and how the humanities are currently subjected to regimes of evaluation and evaluation cultures that do not always seem to be aware of the specific epistemic and social characteristics of the humanities. In the concluding chapter of the thesis, Hammarfelt comes back to this with a number of suggestions regarding the application of bibliometrics in the evaluation of the quality and impact of humanities research. His thesis is an exercise in what a humanities bibliometrics might look like and what methodological and theoretical issues are important to develop a humanist-oriented bibliometrics.

“Looking-glass upon the wall, Who is fairest of us all?” (Part 4)

In our last post, we discussed four arguments in favour of alternative metrics (more details can be found in our recent report on altmetrics “Users, narcissism, and control”). To recapitulate, the four arguments are: openness, speed, scholarly output diversity, and the measurement of more impact dimensions. How do these arguments relate to the available empirical evidence?

Speed is probably the weakest argument. Of course, it is seductive to feel able to monitor “in real time” how a publication reverberates in the communication system. The Altmetrics Manifesto (Priem, Taraborelli, Groth, & Neylon, 2010) even advocates the use of “real-time recommendation and collaborative filtering systems” in funding and promotion decisions. But how wise is this? To really know what a particular publication has contributed takes time, if only because the publication must be read by enough people. Faster is not always better. It may even be the other way around, as the sociologist Dick Pels has argued in his book celebrating “slow science” (Pels, 2003).

Moreover – this relates to the fourth argument – we do not yet know enough about scholarly communication to see what all the measurable data might mean. For example, it does not make much sense to be happy about one instance of correlation between number of tweets and citations, if we do not fully understand what a tweet might mean (Davis, 2012). The role of early signalling of possibly interesting research may be very different from a later-stage scholarly citation. And different modalities of communication may also represent different dimensions of research quality. For example, a recent study compared research blogging in the area of chemistry with journal publications. It was found that blogging is more oriented towards the social implications of research, tends to focus on high-impact journals, is more immediate than scientific publishing, and provides more context of the research (Groth & Gurney, 2010). We need many more of these studies before we jump to conclusions about the value of measuring blogs, web sites, tweets etc. In other words, the fourth argument for alternative metrics is an important research agenda in itself.
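To make concrete what such an "instance of correlation" amounts to, here is a minimal sketch of a Spearman rank correlation between tweet counts and citation counts. The numbers are invented toy values, not data from any of the studies cited here, and the function names are hypothetical helpers written for this illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only (assumption): five papers with tweet and citation counts.
tweets = [0, 3, 1, 12, 5]
citations = [1, 4, 2, 30, 9]
rho = spearman(tweets, citations)
print(round(rho, 3))
```

In this toy set the two orderings coincide, so the coefficient is at its maximum. That tells us only that the rankings agree; it says nothing about why a paper was tweeted or what the tweet signified, which is exactly the open question.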

This also holds for the third argument: diversity. Researchers write blogs, update databases, build instruments, do field work, conduct applied research to solve societal problems, train future generations of researchers, develop prototypes and contribute their expertise to countless panels and newspaper columns. All this is not well represented in international peer reviewed journals (albeit sometimes it is reflected indirectly). Traditional citation analysis captures an important slice of scholarly and scientific output, provided the field is well represented in the Web of Science (which is not the case in most humanities). Yet, however valuable, it is still only a thin slice of the diverse scientific production. Perhaps alternative metrics will be able to reflect this diversity in a more satisfactory way than citation analysis. Before we can affirm that this is the case indeed, we need much more case study research.

This brings me to the last argument, openness. The two most popular citation indexes (Web of Science and Scopus) are both proprietary. Together with their relatively narrow focus, this has brought many scholars to look for open, freely accessible alternatives. And some think they found one in Google Scholar, the most popular search engine for scholarly work. I think it is indisputable that the publication system is moving towards a future with more open access media as default options. But there is a snag. Although Google Scholar is freely available, its database is certainly not open. On the contrary, how it is created and presented to the users of the search engine is one of the better kept secrets of the for-profit company Google. In fact, for the purpose of evaluation, it is less rather than more transparent than the Web of Science or Scopus. In the framework of research evaluation, transparency and consistency of data and indicators may actually be more important than free availability.

References:

Davis, P. M. (2012). Tweets, and Our Obsession with Alt Metrics. The Scholarly Kitchen. Retrieved January 8, 2012, from http://scholarlykitchen.sspnet.org/2012/01/04/tweets-and-our-obsession-with-alt-metrics/

Groth, P., & Gurney, T. (2010). Studying Scientific Discourse on the Web using Bibliometrics: A Chemistry Blogging Case Study. Retrieved from http://journal.webscience.org/308/2/websci10_submission_48.pdf

Pels, D. (2003). Unhastening science: Autonomy and reflexivity in the social theory of knowledge. Routledge.

Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). altmetrics: a manifesto – altmetrics.org. Retrieved January 8, 2012, from http://altmetrics.org/manifesto/

“Looking-glass upon the wall, Who is fairest of us all?” (Part 3)

How do the conclusions in our recent report on altmetrics “Users, narcissism, and control” relate to the discussions about altmetrics? We found that four arguments are regularly mentioned in favor of new methods instead of the more traditional citation analysis.

Perhaps one of the best representatives of this body of work is the Altmetrics Manifesto (Priem, Taraborelli, Groth, & Neylon, 2010). The manifesto notes that traditional forms of publication in the current system of journals and books are increasingly supplemented by other forms of science communication. These include: the sharing of ‘raw science’ like datasets, code, and experimental designs; new publication formats such as the ‘nanopublication’, basically a format for the publication of data elements (Groth, Gibson, & Velterop, 2010); and widespread self-publishing via blogging, microblogging, and comments or annotations on existing work (Priem et al., 2010).

The first argument in favor of new impact metrics is diversity and filtering. Because web based publishing and communication has become so diverse, we need an equally diverse set of tools to act upon these traces of communication. The altmetrics tools build on their use as information filters to also start measuring some forms of impact (often defined differently from citation impact).

The second argument is speed. It takes time for traditional publications to pick up citations and citation analysis is only reliable after some initial period (which varies by field). The promise of altmetrics is an almost instant measurement window. ‘The speed of altmetrics presents the opportunity to create real-time recommendation and collaborative filtering systems: instead of subscribing to dozens of tables-of-contents, a researcher could get a feed of this week’s most significant work in her field. This becomes especially powerful when combined with quick “alt-publications” like blogs or preprint servers, shrinking the communication cycle from years to weeks or days. Faster, broader impact metrics could also play a role in funding and promotion decisions.’ (Priem et al., 2010).

The third argument is openness. Because the data can be collected through Application Programming Interfaces (APIs), the data coverage is completely transparent to the user. This also holds for the algorithms and code used to calculate the indicators. An important advantage discussed in the literature is also the possibility to end the dependency on commercial databases such as Thomson Reuters’ Web of Science or Elsevier’s Scopus. The difficulties that are entailed in the bottom-up creation of a completely new usage, impact, or citation index are however usually not mentioned. Still, this promise of a non-commercial index that can be used to measure impact or other dimensions of scientific performance should not be disregarded. In the long term, this may be the direction in which the publication system is moving.

The fourth argument is that many web based traces of scientific communication activity can be used to measure aspects of scientific performance that are not captured by citation analysis or peer review. For example, download data could be used to measure actual use of one’s work. The number of hyperlinks to one’s website might also be an indication of some form of impact. Indeed, since the 1990s the fields of internet research, webometrics and scientometrics have developed a body of work comparing the roles of citations and hyperlinks and the possibility of building impact measurements on these analogies (Bar-Ilan & Peritz, 2002; Björneborn & Ingwersen, 2001; Hewson, 2003; Hine, 2005; Rousseau, 1998; Thelwall, 2005).

So, do these four arguments stand up in confrontation with the empirical results we had?

References:

Bar-Ilan, J., & Peritz, B. C. (2002). Informetric Theories and Methods for Exploring the Internet: An Analytical Survey of Recent Research Literature. Library Trends, 50(3), 371-392.

Björneborn, L., & Ingwersen, P. (2001). Perspectives of webometrics. Scientometrics, 50(1), 65-82.

Groth, P., Gibson, A., & Velterop, J. (2010). The anatomy of a nanopublication. Information Services & Use, 30, 51-56. doi:10.3233/ISU-2010-0613

Hewson, C. (2003). Internet research methods: a practical guide for the social and behavioural sciences. London etc.: Sage.

Hine, C. (2005). Virtual Methods: Issues in Social Research on the Internet. Berg.

Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). altmetrics: a manifesto – altmetrics.org. Retrieved January 8, 2012, from http://altmetrics.org/manifesto/

Rousseau, R. (1998). Sitations: an exploratory study. Cybermetrics, 1(1), 1. Retrieved from http://www.cindoc.csic.es/cybermetrics/articles/v1i1p1.html

Thelwall, M. (2005). Link Analysis: An Information Science Approach. San Diego: Academic Press.

“Looking-glass upon the wall, Who is fairest of us all?” (Part 2)

As indicated in the last post about our recent report on alternative impact metrics “Users, narcissism, and control”, we have tried to give an overview of 16 novel impact measurement tools and present their strengths and weaknesses as thoroughly as we could. Many of the tools have an attractive user interface and are able to present impact results fairly quickly. Moreover, almost all of them are freely available, albeit some need some form of gratis registration. All of them provide metrics at the level of the article, manuscript or book. Taken together, these three characteristics make these tools attractive to individual researchers and scholars. They enable them to quickly see statistical evidence regarding impact, usage, or influence without too much effort.

At the same time, the impact monitors still suffer from some crucial disadvantages. An important problem has to do with the underlying data. Most of the tools do not (yet?) enable the user to inspect the data on criteria such as completeness and accuracy. This means that these web based tools may create statistics and indicators on incorrect data. The second problem relates to field differences. Scientific fields differ considerably in their communication characteristics. For example, the numbers of citations in clinical research are very high because a very large number of researchers is active, the lists of references per article are relatively long, and there are many co-authored articles, sometimes with tens of authors per paper. As a result the average clinical researcher has a higher citation frequency than the average mathematician. The latter operates in much smaller communities with relatively short lists of references and many solitary articles. As a consequence, it would be irresponsible to compare the raw citation data as a proxy measure of scientific impact among units with production from very different fields.

In many evaluation contexts, it is therefore desirable to be able to normalise impact indicators. Most tools do not accommodate this. The third problem is that the data coverage is sometimes rather limited (some of the tools only look at the biomedical fields for example). The tools have some more limitations. There are almost no tools that provide metrics at other levels of aggregation such as research institutes, journals, etc. Most tools also do not provide easy ways for data downloads and data management. Although less severe than the crucial requirements, these limitations also diminish the usability of many of these tools in the more formal research assessments.
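As an illustration of what normalising an impact indicator can mean, the sketch below divides each paper's citation count by the average citation count of its field. This is a deliberately simplified version of the idea: real-world normalisation typically also controls for publication year and document type, and the field labels, paper identifiers and citation numbers here are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def field_normalized_scores(papers):
    """Attach to each paper its citations divided by the mean of its field."""
    by_field = defaultdict(list)
    for paper in papers:
        by_field[paper["field"]].append(paper["citations"])
    field_mean = {field: mean(counts) for field, counts in by_field.items()}

    result = []
    for paper in papers:
        baseline = field_mean[paper["field"]]
        # A field in which nothing is cited offers no baseline; report 0.0 there.
        score = paper["citations"] / baseline if baseline else 0.0
        result.append({**paper, "normalized": score})
    return result


# Hypothetical toy data: raw counts differ tenfold between the two fields.
papers = [
    {"id": "clin-1", "field": "clinical medicine", "citations": 40},
    {"id": "clin-2", "field": "clinical medicine", "citations": 20},
    {"id": "math-1", "field": "mathematics", "citations": 4},
    {"id": "math-2", "field": "mathematics", "citations": 2},
]
scores = {p["id"]: p["normalized"] for p in field_normalized_scores(papers)}
for paper_id, score in scores.items():
    print(paper_id, round(score, 2))
```

On raw counts the clinical paper with 40 citations dwarfs the mathematics paper with 4, yet relative to their own fields both sit at the same distance above average. A score of 1.0 means "cited as often as the field average", which is what makes units from different fields comparable at all.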

 

 
