Indicator-considerations eclipse other judgments on the shop-floor | Keynote Sarah de Rijcke ESA Prague, 26 August 2015

This invited lecture at the ESA conference in Prague drew on insights from the Leiden Manifesto and from two recent research projects of the Evaluation Practices in Context research group at our institute. These projects show how indicators influence knowledge production in the life sciences and social sciences, and how inclusion and exclusion mechanisms get built into the scientific system through certain uses of evaluative metrics. Our findings point to a rather self-referential focus on metrics and a lack of space for responsible, relevant research in the scientific practices under study. On the basis of these findings I argued in the talk that we need an alternative moral discourse in research assessment, centered on the need to address growing inequalities in the science system. The talk then considered the issues from the Leiden Manifesto for research metrics (Hicks, Wouters, Waltman, De Rijcke & Rafols, Nature, 23 April 2015) that are most pertinent to the community of sociologists.

http://www.slideshare.net/sarahderijcke/slideshelf

See also:

Rushforth, A.D. & De Rijcke, S. (2015). Accounting for Impact? The Journal Impact Factor and the making of biomedical research in the Netherlands. Minerva, 53(2), 117-139.

De Rijcke, S. & Rushforth, A.D. (2015). To intervene, or not to intervene, is that the question? On the role of scientometrics in research evaluation. Journal of the Association for Information Science and Technology, 66 (9), 1954-1958.

Hicks, D., Wouters, P.F., Rafols, I., De Rijcke, S. & Waltman, L. (2015). The Leiden Manifesto for Research Metrics. Nature, 23 April 2015.

Hammarfelt, B. & De Rijcke, S. (2015). Accountability in Context: Effects of research evaluation systems on publication practices, disciplinary norms, and individual working routines in the faculty of Arts at Uppsala University. Research Evaluation, 24(1), 63-77.


Quality in the age of the impact factor

ISIS, the most prestigious journal in the history of science, moved house last September: its central office is now located at the Descartes Centre for the History and Philosophy of the Sciences and Humanities at Utrecht University. The Dutch historian of science H. Floris Cohen took up the position of editor-in-chief of the journal. No doubt this underlines the international reputation of the community of historians of science in the Netherlands. Being the editor of the central journal in one's field surely is a mark of esteem and quality.

The opening of the editorial office in Utrecht was celebrated with a symposium entitled “Quality in the age of the impact factor”. Since the quality of research in history is intimately intertwined with the quality of writing, it seemed particularly apt to call attention to the role of impact factors in the humanities. I used the occasion to pose the question of how we actually define scientific and scholarly quality. How do we recognize quality in our daily practices? How can this variety of practices be understood theoretically? And which approaches in the field of science and technology studies are most relevant?

In the same month, Pleun van Arensbergen defended a very interesting PhD dissertation dealing with some of these issues, “Talent Proof. Selection Processes in Research Funding and Careers”. Van Arensbergen did her thesis work at the Rathenau Institute in The Hague. The quality of research is increasingly seen as mainly the result of the quality of the people involved. Hence, universities “have openly made it one of their main goals to attract scientific talent” (van Arensbergen, 2014, p. 121). A specific characteristic of this “war for talent” in the academic world is that there is an oversupply of talents and a relative lack of career opportunities, leading to a “war between talents”. The dissertation is a thorough analysis of success factors in academic careers: an empirical study of how the Dutch science foundation NWO selects early career talent in its Innovational Research Incentives Scheme. The study surveyed researchers about their definitions of quality and talent, and combines this with an analysis of both the outcome and the process of talent selection. Van Arensbergen paid specific attention to the gender distribution and to the differences between successful and unsuccessful applicants.

Her results point to a discrepancy between the common notion among researchers that talent is immediately recognizable (“you know it when you see it”) and the fact that the differences between candidates who get funded and those who do not are very small. The top and the bottom of the distribution of quality among proposals and candidates are relatively easy to detect, but the group of “good” and “very good” proposals is still too large to be funded. Van Arensbergen and her colleagues did not find a “natural threshold” above which the successful talents can be placed. On the contrary, in one of her chapters they find that researchers who leave the academic system due to lack of career possibilities regularly score higher on a number of quality indicators than those who are able to continue a research career: “This study does not confirm that the university system always preserves the highly productive researchers, as leavers were even found to outperform the stayers in the final career phase” (van Arensbergen, 2014, p. 125).
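To make the lack of a natural threshold concrete, here is a minimal toy simulation of my own (not Van Arensbergen's data or method; all parameters are arbitrary). It shows that when panels rank applicants on noisy scores, the gap at the funding line is negligible, and the "last funded" and "first rejected" barely differ in underlying quality:

```python
import random

random.seed(42)

# Toy model: 200 applicants; panels observe "true quality" plus review noise.
true_quality = [random.gauss(0, 1) for _ in range(200)]
observed = [q + random.gauss(0, 0.5) for q in true_quality]

# Rank by observed score and fund the top 30 (~15%), as in a talent scheme.
ranked = sorted(range(200), key=lambda i: observed[i], reverse=True)
cutoff = 30

# The score gap at the funding line is tiny compared to the full spread.
gap = observed[ranked[cutoff - 1]] - observed[ranked[cutoff]]
spread = observed[ranked[0]] - observed[ranked[-1]]
print(f"gap at the funding line: {gap:.3f} (full spread: {spread:.2f})")

# In true quality, the last funded and first rejected barely differ.
last_funded = sum(true_quality[i] for i in ranked[cutoff - 10:cutoff]) / 10
first_rejected = sum(true_quality[i] for i in ranked[cutoff:cutoff + 10]) / 10
print(f"mean true quality: last 10 funded {last_funded:.2f}, "
      f"first 10 rejected {first_rejected:.2f}")
```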

Based on the survey, her case studies and her interviews, Van Arensbergen also concludes that productivity and publication records have become rather important for academic careers. “Quality nowadays seems to a large extent to be defined as productivity. Universities seem to have internalized the performance culture and rhetoric to such an extent that academics even define and regulate themselves in terms of dominant performance indicators like numbers of publications, citations or the H-index. (…) Publishing seems to have become the goal of academic labour.” (van Arensbergen, 2014, p. 125). This does not mean, however, that these indicators determine the success of a career. The study questions “the overpowering significance assigned to these performance measures in the debate, as they were not found to be entirely decisive.” (van Arensbergen, 2014, p. 126) An extensive publication record is a condition but not a guarantee for success.

This relates to another finding: the group dynamics of panel discussions are also very important. With a variety of examples, Van Arensbergen shows how the organization of the selection process shapes the outcome. The face-to-face interview of the candidate with the panel, for example, is crucial for the final decision. The influence of the external peer reports, by contrast, was found to be modest.

A third finding in the talent dissertation is that success in obtaining grants feeds back into one's scientific and scholarly career. This creates a self-reinforcing mechanism, for which the sociologist of science Robert Merton coined the term “Matthew effect”, after the Bible verse: “For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken even that which he hath.” (Merton, 1968). Van Arensbergen concludes that differences between scholars may therefore initially be small but will increase in the course of time as a result of funding decisions: “Panel decisions convert minor differences in quality into enlarged differences in recognition.”
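The mechanism is easy to illustrate with a minimal toy model (my own sketch, not an analysis from Merton or Van Arensbergen; the parameters are arbitrary). Two researchers start with almost identical recognition; each year one grant is awarded with probability proportional to accumulated recognition, and every grant adds symbolic capital:

```python
import random

random.seed(1)

# Two researchers whose accumulated "recognition" differs only slightly.
recognition = {"A": 1.00, "B": 0.95}

for year in range(1, 11):
    # One grant per year; the chance of winning is proportional to
    # accumulated recognition -- the self-reinforcing Matthew effect.
    total = sum(recognition.values())
    winner = "A" if random.random() < recognition["A"] / total else "B"
    recognition[winner] += 0.5  # a grant adds symbolic capital
    print(f"year {year:2d}: A={recognition['A']:.2f}  B={recognition['B']:.2f}")
```

Runs of this toy model show an initially negligible difference growing into a substantial gap over the years, which is precisely the point of Van Arensbergen's conclusion.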

Combining these three findings leads to some interesting conclusions about how we actually define and shape quality in academia. Although panel decisions about whom to fund are strongly shaped by the organization of the selection process as well as by a host of other contextual factors (including chance), and although all researchers are aware of the uncertainties in these decisions, this does not mean that these decisions are given less weight. On the contrary, obtaining external grants has become a cornerstone of successful academic careers. Universities even devote considerable resources to making their researchers better able to acquire prestigious grants, and external funding in general. Although this is clearly instrumental for the organization, Van Arensbergen argues that grants have become part of the symbolic capital of a researcher and research group, and she refers to Pierre Bourdieu’s theory of symbolic capital to better understand the implications.

This brings me to my short lecture at the opening of the editorial office of ISIS in Utrecht. Although experts on bibliometric indicators generally do not see the Journal Impact Factor as an indicator of quality, socially it seems to function partly as one. But indicators are not alone in shaping how we in practice identify, and thereby define, talent and quality. They flow together with the way quality assurance and measurement processes are organized, the social psychology of panel discussions, the extent to which researchers are visible in their networks, and so on. In these complex contextual interactions, indicators do not determine outcomes; they are ascribed meaning depending on the situation in which researchers find themselves. A good way to think about this, in my view, has been developed in the field of material semiotics. This approach, which has its roots in the French actor-network theory of Bruno Latour and Michel Callon, does not accept a fundamental rupture in reality between the material and the symbolic: reality as such is the result of complex and interacting translation processes. This is an excellent philosophical basis for understanding how scientific and scholarly quality emerges. I see quality not as an attribute of an academic persona or of a particular piece of work, but as the result of the interaction between a researcher (or a manuscript) and the already existing scientific or scholarly infrastructure (e.g. the body of published studies). If this interaction creates a productive friction (meaning that there is enough novelty in the contribution, but not so much that it is incompatible with the already existing body of work), we see the work or scholar as of high quality. In other words, quality simply does not (yet) exist outside of the systems of quality measurement. The implication is that quality itself is a historical category: not an invariant, but a culturally and historically specific concept that changes and morphs over time. In fact, the history of science is the history of quality. I hope historians of science will take up the challenge to map this history with more empirical and theoretical sophistication than has been done so far.

Literature:

Merton, R. K. (1968). The Matthew Effect in Science. Science, 159, 56–62.

Van Arensbergen, P. (2014). Talent Proof: Selection Processes in Research Funding and Careers. The Hague, Netherlands: Rathenau Institute. Retrieved from http://www.worldcat.org/title/talent-proof-selection-processes-in-research-funding-and-careers/oclc/890766139&referer=brief_results

 

On exploding ‘evaluation machines’ and the construction of alt-metrics

The emergence of web-based ways to create and communicate new knowledge is affecting long-established scientific and scholarly research practices (cf. Borgman 2007; Wouters, Beaulieu, Scharnhorst, & Wyatt 2013). This move to the web is spawning a need for tools to track and measure a wide range of online communication forms and outputs. By now, there is a large differentiation in the kinds of social web tools (e.g. Mendeley, F1000, Impact Story) and in the outputs they track (e.g. code, datasets, nanopublications, blogs). The expectations surrounding the explosion of tools and big ‘alt-metric’ data (Priem et al. 2010; Wouters & Costas 2012) marshal resources at various scales and gather highly diverse groups in pursuing new projects (cf. Brown & Michael 2003; Borup et al. 2006 in Beaulieu, de Rijcke & Van Heur 2013).

Today we submitted an abstract for a contribution to Big Data? Qualitative approaches to digital research (edited by Martin Hand & Sam Hillyard and contracted with Emerald). In the abstract we propose to zoom in on a specific set of expectations around altmetrics: their alleged usefulness for research evaluation. Of particular interest to this volume is how altmetric information is expected to enable a more comprehensive assessment of 1) social scientific outputs (under-represented in citation databases) and 2) wider types of output associated with societal relevance (not covered in citation analysis and allegedly more prevalent in the social sciences).

In our chapter we address a number of these expectations by analyzing 1) the discourse in the “altmetrics movement” and the expectations and promises formulated by key actors involved in “big data” (including commercial entities); and 2) the construction of these altmetric data and their alleged validity for research evaluation purposes. We will combine discourse analysis with bibliometric, webometric and altmetric methods, in which both approaches will also interrogate each other's assumptions (Hicks & Potter 1991).
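One building block of the bibliometric side of such an analysis can be sketched as follows: comparing citation counts with altmetric counts for the same set of publications, in the spirit of Costas, Zahedi & Wouters (n.d.). The counts below are invented for illustration; a real study would use exports from citation databases and altmetric aggregators:

```python
from scipy.stats import spearmanr

# Hypothetical per-publication counts (illustrative only).
citations = [12, 0, 45, 3, 8, 1, 22, 5, 0, 17]
tweets = [4, 1, 2, 9, 0, 3, 15, 2, 1, 6]
readers = [30, 2, 80, 7, 1, 5, 40, 9, 3, 25]  # e.g. reference-manager saves

# Rank correlations are standard here because such counts are highly skewed.
for name, counts in [("tweets", tweets), ("readers", readers)]:
    rho, p = spearmanr(citations, counts)
    print(f"citations vs {name}: Spearman rho = {rho:.2f} (p = {p:.2f})")
```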

Our contribution will show, first of all, that altmetric data do not simply ‘represent’ other types of outputs; they also actively create a need for these types of information. These needs will have to be aligned with existing accountability regimes. Secondly, we will argue that researchers will develop forms of regulation that will partly be shaped by these new types of altmetric information. They are not passive recipients of research evaluation but play an active role in assessment contexts (cf. Aksnes & Rip 2009; Van Noorden 2010). Thirdly, we will show that the emergence of altmetric data for evaluation is another instance (following the creation of the citation indexes and the use of web data in assessments) of transposing traces of communication into a framework of evaluation and assessment (Dahler-Larsen 2012, 2013; Wouters 2014).

By making explicit the implications of transferring altmetric data from the framework of the communication of science to the framework of research evaluation, we aim to contribute to a better understanding of the complex dynamics in which new generations of researchers will have to work and be creative.

Aksnes, D. W., & Rip, A. (2009). Researchers’ perceptions of citations. Research Policy, 38(6), 895–905.

Beaulieu, A., van Heur, B. & de Rijcke, S. (2013). Authority and Expertise in New Sites of Knowledge Production. In P. Wouters, A. Beaulieu, A. Scharnhorst & S. Wyatt (Eds.), Virtual Knowledge: Experimenting in the Humanities and the Social Sciences (pp. 25-56). MIT Press.

Borup, M., Brown, N., Konrad, K. & van Lente, H. (2006). The sociology of expectations in science and technology. Technology Analysis & Strategic Management, 18(3/4), 285-298.

Brown, N. & Michael, M. (2003). A sociology of expectations: Retrospecting prospects and prospecting retrospects. Technology Analysis & Strategic Management, 15(1), 3-18.

Costas, R., Zahedi, Z. & Wouters, P. (n.d.). Do ‘altmetrics’ correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective.

Dahler-Larsen, P. (2012). The Evaluation Society. Stanford University Press.

Dahler-Larsen, P. (2013). Constitutive Effects of Performance Indicators. Public Management Review, (May), 1–18.

Galligan, F., & Dyas-Correia, S. (2013). Altmetrics: Rethinking the Way We Measure. Serials Review, 39(1), 56–61.

Hicks, D., & Potter, J. (1991). Sociology of Scientific Knowledge: A Reflexive Citation Analysis of Science Disciplines and Disciplining Science. Social Studies of Science, 21(3), 459 –501.

Priem, J., Taraborelli, D., Groth, P. & Neylon, C. (2010). Altmetrics: A manifesto. http://altmetrics.org/manifesto/

Van Noorden, R. (2010). Metrics: A profusion of measures. Nature, 465, 864-866.

Wouters, P. & Costas, R. (2012). Users, narcissism and control: Tracking the impact of scholarly publications in the 21st century. Utrecht: SURF foundation.

Wouters, P. (2014). The Citation: From Culture to Infrastructure. In B. Cronin & C. R. Sugimoto (Eds.), Beyond Bibliometrics: Harnessing Multidimensional Indicators of Scholarly Impact (pp. 48-66). MIT Press.

Wouters, P., Beaulieu, A., Scharnhorst, A. & Wyatt, S. (Eds.) (2013). Virtual Knowledge: Experimenting in the Humanities and the Social Sciences. MIT Press.

Book release

Today we are witnessing dramatic changes in the way scientific and scholarly knowledge is created, codified, and communicated. This transformation is connected to the use of digital technologies and the virtualization of knowledge. In this book, scholars from a range of disciplines consider just what, if anything, is new when knowledge is produced in new ways. Does knowledge itself change when the tools of knowledge acquisition, representation, and distribution become digital? Issues of knowledge creation and dissemination go beyond the development and use of new computational tools. The book, which draws on work from the Virtual Knowledge Studio, brings together research on scientific practice, infrastructure, and technology. Focusing on issues of digital scholarship in the humanities and social sciences, the contributors discuss who can be considered legitimate knowledge creators, the value of “invisible” labor, the role of data visualization in policy making, the visualization of uncertainty, the conceptualization of openness in scholarly communication, data floods in the social sciences, and how expectations about future research shape research practices. The contributors combine an appreciation of the transformative power of the virtual with a commitment to the empirical study of practice and use.

Edited by Paul Wouters, Anne Beaulieu, Andrea Scharnhorst and Sally Wyatt.

Why do neoliberal universities play the numbers game?

Performance measurement has brought on a crisis in academia. At least, that’s what Roger Burrows (Goldsmiths, University of London) claims in a recent article for The Sociological Review. According to Burrows, academics are at great risk of becoming overwhelmed by a ‘deep, affective, somatic crisis’. This crisis is brought on by the ‘cultural flattening of market economic imperatives’ that fires up increasingly convoluted systems of measure. Burrows places this emergence of quantified control in academia within the broader context of neoliberalism. Though this has been argued before, Burrows gives the discussion a theoretical twist. He does so by drawing on Gane’s (2012) analysis of Foucault’s (1978-1979) lectures on the relation between market and state under neoliberalism. According to Foucault, neoliberal states can only guarantee the freedom of markets when they apply the same ‘market logic’ on themselves. In this view, the standard depiction of neoliberalism as passive statecraft is not correct. This type of management is not ‘laissez-faire’, but actively stimulates competition and privatization strategies.

In the UK, Burrows contends, the simulation of neoliberal markets in academia has largely been channelled through the introduction of audit and of performance measures. He argues that these control mechanisms become autonomous entities that are increasingly used outside the original context of evaluations, and get a much more active role in shaping the everyday work of academics. According to Burrows, neoliberal universities provide fertile ground for a “co-construction of statistical metrics and social practices within the academy.” Among other things, this leads to a reification of individual performance measures such as the H-index. Burrows:

“[I]t is not the conceptualization, reliability, validity or any other set of methodological concerns that really matter. The index has become reified; (…) a number that has become a rhetorical device with which the neoliberal academy has come to enact ‘academic value’.” (p. 361)
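For readers who have not met the measure Burrows singles out: the h-index is defined as the largest number h such that a researcher has h publications with at least h citations each. A minimal sketch of the computation:

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Five papers cited 10, 8, 5, 2 and 1 times give h = 3.
print(h_index([10, 8, 5, 2, 1]))  # -> 3
```

The very simplicity of the computation is arguably part of what makes the number so easy to reify: it compresses a whole career into a single integer.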

Interestingly, Burrows’ line of reasoning can in some respects itself be seen as a product of a broader neoliberal context. Neoliberal policies applaud personal autonomy and the individual’s responsibility for their own well-being and professional success. Burrows directly addresses fellow academics (‘we need to obtain critical distance’; ‘we need to understand ourselves as academics’; ‘why do we feel the way we do?’) and concludes that we are all implicated in the ‘autonomization of metric assemblages’ in the academy. Arguably, it is exactly this neoliberal political climate that justifies Burrows’ focus on individual academics’ affective states. With it comes a delegation of responsibility to the level of the individual researcher. It is our own choice whether we comply with the metricization of academia. It is our own choice whether we work long hours, spending our weekends writing grant proposals and articles and grading students’ exams. According to Gill (2010), academics tend to justify working so hard because they possess a passionate drive for self-expression and pleasure in intellectual work. Paradoxically, Gill argues, it is this drive that feeds a whole range of disciplinary mechanisms and that lets academics internalize a neoliberal subjectivity. We play ‘the numbers game’, as Burrows calls it, because of “a deep love for the ‘myth’ of what we thought being an intellectual would be like.” (p. 15)

Though Burrows raises concerns that are shared by many academics, it is unfortunate that he does not substantiate his claims with empirical data. Apart from his own experience and anecdotal evidence, how do we know that today’s researchers experience the metricization of academia as a ‘deep, affective somatic crisis’? Does it apply to all researchers, is it the same everywhere, and does it hold for all disciplines? These are empirical questions that Burrows does not answer. That said, there is a great need for the types of analyses Burrows and Gill provide: analyses that assess, situate and historicize academic audit cultures. It is not a coincidence that Burrows’ polemical piece emerges from the field of sociology. The social sciences and humanities are increasingly confronted with what Burrows calls the ‘rhetoric of accountability’. It has become commonplace to argue that they, too, should be held accountable for the taxpayers’ money that is being spent on them, and that they, too, should be made auditable by way of standardized, transparent performance measures. I agree with Burrows that this rhetoric should be problematized. In large parts of these fields it is not at all clear how performance should be ‘measured’ in the first place, for example because of differences in publication cultures within these fields and as compared to the natural sciences. And it is precisely because the discussion is ongoing that we are allowed a clear view of the performative effects of a very specific and increasingly dominant evaluation culture that is not modelled by and on these disciplines. What are the consequences? And are there more constructive alternatives?

Prospects of humanities bibliometrics? – Part 2

In the empirical chapters of his PhD thesis Following the Footnotes, Björn Hammarfelt tries out several methodological innovations. In a particularly interesting chapter, Hammarfelt traces the citations to one book, Illuminations by Walter Benjamin, and maps the disciplines that cite the book. But he goes further and even teases out the specific parts of the book that are cited. He calls this Page Citation Analysis, which is basically a form of Citation Context Analysis as proposed by Susan Cozzens (1985). Alternatively, one could see it as a return to the very old philological tradition, which was also concerned with the precise location of particular textual phenomena. Historically, bibliometrics is humanities research. In a preceding chapter, Hammarfelt analyzes the intellectual base of literary studies through the references from 34 literature journals. The way he collected these journals is interesting, because he departed from the point of view of the literary researcher (rather than from the available database, a mistake often made in bibliometrics). He advocates the mixed use of citation analysis and library classifications, an issue that is also underdeveloped in the field of bibliometrics. Another interesting innovation comes in the last empirical chapter, where Hammarfelt analyzes research grants and compares them with journal publications. He argues that research grants are increasingly important in the life of a scholar, and that it would hence make sense to use them much more as a data source. He also notes the partly different role of references in grants compared to journal articles.
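The basic idea of Page Citation Analysis can be sketched as follows: extract the page locator from each citing reference and tally which parts of the book attract the most citations. This is a hypothetical illustration, not Hammarfelt's code (the thesis does not publish any), and the reference strings below are invented:

```python
import re
from collections import Counter

# Invented citing references to Benjamin's Illuminations.
references = [
    "Benjamin, W. Illuminations, p. 217",
    "Benjamin, W. Illuminations, pp. 217-218",
    "Benjamin, W. Illuminations, p. 83",
    "Benjamin, W. Illuminations, p. 217",
]

pages = Counter()
for ref in references:
    match = re.search(r"pp?\.\s*(\d+)", ref)  # first page of the locator
    if match:
        pages[int(match.group(1))] += 1

# Which parts of the book are cited most often?
for page, count in pages.most_common():
    print(f"page {page}: cited {count} times")
```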

The various analyses in Hammarfelt's thesis confirm what we generally know about the humanities as a set of disciplines. He characterizes the nature of the humanities by drawing on theories from the sociology of science, in particular theories that try to explain both the social and the intellectual structure of working practices and communication patterns in these fields. His theoretical exposé focuses on the theories proposed by Richard Whitley and by Becher and Trowler, but he also uses various theories that can be captured under the umbrella of “mode 2 knowledge production”.

Hammarfelt’s thesis confirms that the humanities can be characterized as divergent (rather than convergent); as very wide-ranging and interdisciplinary (as shown by his reference analysis); as rural, in the sense of sparsely populated villages of researchers focusing on a particular topic (rather than busy laboratories that are more like bustling cities); and as fragmented (scholars are quite independent of each other, which gives them a lot of freedom but also makes it more difficult to speak with a common voice regarding resources). The interdisciplinary nature of literary studies seems to be rising, and Hammarfelt even detects a turn to the social, because he sees more connections between the humanities and specific fields in the social sciences such as gender studies and post-colonial studies. So there might be a social turn after the linguistic turn of some decades ago. Overall, the thesis concludes that three features are most important for understanding referencing and citations in the humanities: the strong independence of humanities scholars; the rural organization; and the diverse audiences of the fields. The latter goes together with a rather low codification of the literature, because it needs to be understood by a wide range of people.

What does this mean for the ways humanities scholars are assessed in evaluation protocols? This topic is discussed in a variety of ways in the thesis. In the last part, Hammarfelt brings it together with a sketch of how bibliometrics could further develop to be of use in this more political sense. He advocates such a role because bibliometrics can help unsettle existing power structures, such as the lack of diversity in universities. He specifically mentions gender bias, the need to support interdisciplinary research, and the problems with current peer review systems. But bibliometrics needs to change in order to play this role.

Reference:

Cozzens, S. E. (1985). Comparing the Sciences: Citation Context Analysis of Papers from Neuropharmacology and the Sociology of Science. Social Studies of Science, 15, 127-153.

Prospects of humanities bibliometrics? – Part 1

Humanities scholars are often confronted with the limitations of citation analysis in their fields. Often, the number of articles is too low for any meaningful statistical analysis of publication or citation patterns. Moreover, many forms of publishing in the humanities (books, national journals, movies, dance performances, etc.) are not covered in the bibliometric databases. As a result, both humanities scholars and bibliometricians often advise against using citation analysis in most fields in the humanities, in particular for evaluation.

So does this mean that a form of bibliometrics better fitted to the humanities is impossible? Not so, says Björn Hammarfelt, who recently received his PhD degree at Uppsala University for his thesis “Following the Footnotes”.

In this well-written and innovative thesis, Hammarfelt combines three different intellectual, and I would say also political, interests: literary studies, bibliometrics, and the sociology of science. His title, “Following the Footnotes”, has three different connotations. First, he traces the creation, role and institutionalization of the reference, a specific element of a scholarly publication. Footnotes are one example of references, but one can also mention a document by putting the author’s name and the publication year in brackets and listing these documents at the end of the article, or one can place references as endnotes. Hammarfelt argues that these seemingly technical differences carry important meaning and that we should pay careful attention to their format and position in the text. He also claims that referencing practices in literary studies, and in the humanities more generally, have a partly different character than in the natural sciences, because the act of writing plays a different role. To put it a bit bluntly: in the natural sciences an author is supposed to report “facts of nature” and obliterate herself from the text, while in the humanities a scholar expresses herself as a creative persona, and very visibly so. Both are rhetorical strategies; neither is a simple reflection of the reality of scholarly practice.

In the second meaning of the title, Hammarfelt tries to give us a glimpse of what will follow the footnote now that the landscape of scholarly publishing is developing so fast. Although his thesis does not focus on new forms of referencing (such as in Facebook, Twitter, or Spotify), it does give us insight into the current practices on which the future will build. The third meaning of the title is more fully developed: what follows the footnote when footnotes are translated into citations and into academic reputation for the author. Here the thesis deals with the very important topic of research evaluation, and with how the humanities are currently subjected to evaluation regimes and evaluation cultures that do not always seem to be aware of the specific epistemic and social characteristics of the humanities. In the concluding chapter, Hammarfelt comes back to this with a number of suggestions regarding the application of bibliometrics in the evaluation of the quality and impact of humanities research. His thesis is an exercise in what a humanities bibliometrics might look like, and in which methodological and theoretical issues are important for developing a humanities-oriented bibliometrics.
