Tales from the field: On the (not so) secret life of performance indicators

* Guest blog post by Alex Rushforth *

Sarah De Rijcke and I have been accepted to present at conferences in Valencia and Rotterdam in the coming months, on research from CWTS’s nascent EPIC working group. We very much look forward to drawing on collaborative work from our ongoing ‘Impact of indicators’ project on biomedical research in University Medical Centers (UMCs) in the Netherlands. One of our motivations behind the project is that there has been a wealth of social science literature in recent times about the effects of formal evaluation in public sector organisations, including universities. Yet too few studies have taken seriously the presence of indicators in the context of one of the university’s core missions: knowledge creation. Fewer still have applied an ethnographic lens to the dynamics of indicators in the day-to-day work context of academic knowledge production. These are deficits we hope to begin addressing through these conferences and beyond.

The puzzle we will be addressing here appears, at least at first glance, straightforward enough: what is the role of bibliometric performance indicators in the biomedical knowledge production process? Yet when we compare provisional findings from two contrasting case studies of research groups from the same UMC, one a molecular biology group and the other a statistics group, it quickly becomes apparent that there can be no general answer to this question. As such we aim not only to provide an inventory of the different ‘roles’ of indicators in these two cases, but also to pose the more interesting analytical question of what conditions and mechanisms explain the observed variations in the roles indicators come to perform.

Owing to their persistent recurrence in the data so far, the indicators we will analyze are the journal impact factor, the H-index, and ‘advanced’ citation-based bibliometric indicators. It should be stressed that our focus on these particular indicators has emerged inductively from observing first-hand the metrics that research groups attended to in their knowledge-making activities. So what have we found so far?

Dutch UMCs constitute particularly apt sites through which to explore this problem, given how central bibliometric assessments have been to the formal evaluations carried out since their inception in the early 2000s. On one level we argue that researchers in both cases encounter such metrics as ‘governance/managerial devices’, that is, as forms of information required of them by external agencies on whom they are reliant for resources and legitimacy. Examples can be seen when funding applications, annual performance appraisals, or job descriptions demand such information about an individual’s or group’s past performance. As the findings will show, the information needed by the two groups to produce their work effectively and the types of demands made on them by ‘external’ agencies vary considerably, despite their common location in the same UMC. This is one important reason why the role of indicators differs between cases.

However, this coercive ‘power over’ account is but one dimension of a satisfying answer to our question about the role of indicators. Emerging analysis also reveals the surprising finding that in fields characterized by particularly integrated forms of coordination and standardization (Whitley, 2000), like that of our molecular biologists, indicators in fact have the propensity to function as a core feature of the knowledge-making process. For instance, a performance indicator like the journal impact factor was routinely mobilized informally in researchers’ decision-making as an ad hoc standard against which to evaluate the likely uses of information and resources, and to decide whether time and resources should be spent pursuing them. By contrast, in the less centralized and integrated field of statistical research such an indicator was not so indispensable to the routines of knowledge-making activities. In the case of the statisticians it is possible to speculate that indicators are more likely to emerge intermittently as conditions to be met for gaining social and cultural acceptance by external agencies, but are less likely to inform day-to-day decisions. Through our ongoing analysis we aim to unpack further how disciplinary practices interact with the organisation of Dutch UMCs to produce quite varying engagements with indicators.

The extent to which indicators play central or peripheral roles in research production processes across academic contexts is an important sociological problem to pose in order to enhance understanding of the complex role of performance indicators in academic life. We feel much of the existing literature on the evaluation of public organisations has tended to paint an exaggerated picture of formal evaluation and research metrics as synonymous with empty ritual and legitimacy (e.g. Dahler-Larsen, 2012). Emerging results here show that, at least in the realm of knowledge production, the picture is more subtle. This theoretical insight prompts us to suggest that further empirical studies are needed of scholarly fields with different patterns of work organisation, in order to compare our results and develop middle-range theorizing on the mechanisms through which metrics infiltrate knowledge production processes to fundamental or peripheral degrees. In future this could mean venturing into fields far outside of biomedicine, such as history, literature, or sociology. For now, though, we look forward to expanding the biomedical project by conducting analogous case studies from a second UMC.

Indeed it is through such theoretical developments that we can not only consider the appropriateness of one-size-fits-all models of performance evaluation, but also unpack and problematize discourses about what constitutes ‘misuse’ of metrics. And how convinced should we be that academic life is now saturated and dominated by deleterious metric indicators?


DAHLER-LARSEN, P. 2012. The evaluation society, Stanford, California, Stanford Business Books, an imprint of Stanford University Press.

WHITLEY, R. 2000. The intellectual and social organization of the sciences, Oxford, England; New York, Oxford University Press.

“Looking-glass upon the wall, Who is fairest of us all?” (Part 2)

As indicated in the last post about our recent report on alternative impact metrics, “Users, narcissism, and control”, we have tried to give an overview of 16 novel impact measurement tools and to present their strengths and weaknesses as thoroughly as we could. Many of the tools have an attractive user interface and are able to present impact results fairly quickly. Moreover, almost all of them are freely available, although some require a form of gratis registration. All of them provide metrics at the level of the article, manuscript or book. Taken together, these three characteristics make these tools attractive to individual researchers and scholars. The tools enable them to see statistical evidence regarding impact, usage, or influence quickly and without too much effort.

At the same time, the impact monitors still suffer from some crucial disadvantages. The first problem has to do with the underlying data. Most of the tools do not (yet?) enable the user to inspect the data on criteria such as completeness and accuracy. This means that these web-based tools may create statistics and indicators based on incorrect data. The second problem relates to field differences. Scientific fields differ considerably in their communication characteristics. For example, the numbers of citations in clinical research are very high because a very large number of researchers are active, the lists of references per article are relatively long, and there are many co-authored articles, sometimes with tens of authors per paper. As a result the average clinical researcher has a higher citation frequency than the average mathematician. The latter operates in much smaller communities with relatively short lists of references and many single-authored articles. As a consequence, it would be irresponsible to compare raw citation data as a proxy measure of scientific impact across units whose output comes from very different fields.

In many evaluation contexts, it is therefore desirable to be able to normalise impact indicators. Most tools do not accommodate this. The third problem is that the data coverage is sometimes rather limited (some of the tools only look at the biomedical fields, for example). The tools have some further limitations. Almost none of them provide metrics at other levels of aggregation, such as research institutes or journals. Most tools also do not provide easy ways to download and manage data. Although less severe than the three crucial problems above, these limitations also diminish the usability of many of these tools in more formal research assessments.
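
To make the idea of normalisation concrete, here is a minimal sketch of one simple approach: dividing a publication's citation count by the average citation count in its field. The citation numbers, field names and function names below are invented purely for illustration; they are not taken from the report or from any of the tools discussed, and real normalisation procedures typically also account for publication year and document type.

```python
from statistics import mean

# Hypothetical citation counts per field. Illustrative numbers only,
# not real data: they merely mimic a high-citation and a low-citation field.
FIELD_BASELINES = {
    "clinical": [40, 25, 60, 35],
    "mathematics": [4, 2, 6, 4],
}


def field_mean(field):
    """Average number of citations per publication in the given field."""
    return mean(FIELD_BASELINES[field])


def normalised_score(citations, field):
    """Citations of one publication relative to its field average.

    A score of 1.0 means 'cited as often as the average publication
    in the same field'.
    """
    return citations / field_mean(field)


def mean_normalised_score(papers):
    """Average normalised score over (citations, field) pairs."""
    return mean(normalised_score(c, f) for c, f in papers)


# Raw counts suggest the clinical paper has far more impact (40 vs 8).
clinical_paper = normalised_score(40, "clinical")
maths_paper = normalised_score(8, "mathematics")

# After normalisation the mathematics paper stands out within its own field.
print(clinical_paper, maths_paper)
```

Under these assumed numbers, the clinical paper with 40 citations is exactly average for its field, while the mathematics paper with only 8 citations is cited twice as often as its field average. This is the kind of comparison that raw counts obscure.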


