The new Dutch research evaluation protocol

From 2015 onwards, the societal impact of research will be a more prominent measure of success in the evaluation of research in the Netherlands. Less emphasis will be put on the number of publications, while the vigilance about research integrity will be increased. These are the main elements of the new Dutch Standard Evaluation Protocol which was published a few weeks ago.

The new protocol aims to guarantee, improve, and make visible the quality and relevance of scientific research at Dutch universities and institutes. Three aspects are central: scientific quality; societal relevance; and feasibility of the research strategy of the research groups involved. As is already the case in the current protocol, research assessments are organized by institution, and the institutional board is responsible. Nationwide comparative evaluations by discipline are possible, but the institutions involved have to agree explicitly to organize their assessments in a coordinated way to realize this. In contrast to performance-based funding systems, the Dutch system does not have a tight coupling between assessment outcomes and funding for research.

This does not mean that research assessments in the Netherlands have no consequences. On the contrary, the consequences may be quite severe, but they will usually be implemented by university management, which has considerable leeway in interpreting the assessment results. The main channel through which Dutch research assessments have implications is the reputation gained or lost by the research leaders involved. The effectiveness of an assessment often depends on how the international committee that performs the evaluation approaches its work. If its members see it as their main mission to celebrate their nice Dutch colleagues (as has happened in the recent past), the results will be complimentary but not necessarily very informative. On the other hand, they may also punish groups by using criteria that are not valid for those specific groups, although they may be standard for the discipline as a whole (this has also happened, for example when book-oriented groups work in a journal-oriented discipline).

The protocol does not include a uniform set of requirements or indicators. The specific mission of the research institute or university department under assessment is the guiding principle. As a result, a group that mainly aims at practical impact may be evaluated with different criteria than a group that aims to work on the international frontier of basic research. The protocol is not unified around substance but around procedure. Each group has to be evaluated every six years. Another new element is that the assessment scale has been changed from a five-point to a four-point scale, ranging from “unsatisfactory”, via “good” and “very good”, to “excellent”. This scale will be applied to all three dimensions: scientific quality, societal relevance, and feasibility.

The considerable freedom that the peer committees have in evaluating Dutch research has been maintained in the new protocol. Therefore, it remains to be seen what the effects will be of the novel elements in the protocol. In assessing the societal relevance of research, the Dutch are following their British peers. Research groups will have to construct “narratives” which explain the impact their research has had on society, understood broadly. It is not yet clear how these narratives will be judged according to the scale. The criteria for feasibility are even less clear: according to the protocol a group has an “excellent” feasibility if it is “excellently equipped for the future”. Well, we’ll see how this works out.

With less emphasis on the number of publications in the new protocol, the Dutch universities, the funding agency NWO and the academy of sciences KNAW (which are collectively responsible for the protocol) have also responded to the increased anxiety about “perverse effects” in the research system triggered by the ‘Science in Transition’ group and to recent cases of scientific fraud. The Dutch minister of education, culture and science, Jet Bussemaker, welcomed this change. “Productivity and speed should not be leading considerations for researchers”, she said on receiving the new protocol. I fully agree with this statement, yet this aspect of the protocol will also have to stand the test of practice. In many ways, the number of publications is still a basic building block of scientific or scholarly careers. For example, the h-index is very popular in the medical sciences (Tijdink, Rijcke, Vinkers, Smulders, & Wouters, 2014). This index combines the number of publications of a researcher with the citation impact of these articles in such a way that the h-index can never be higher than the total number of publications. This means that if researchers are compared according to the h-index, the most productive ones will tend to prevail. We will have to wait and see whether the new evaluation protocol will be able to withstand this type of reward for high levels of article production.
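To make the point about the h-index concrete, here is a minimal sketch in Python. It uses the usual definition of the index (the largest number h such that h of a researcher's publications each have at least h citations). The function name and the two citation lists are made up for illustration; they are not data from the study cited above.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts, invented for this illustration.
few_highly_cited = [250, 180, 90]                                # 3 papers
many_moderately_cited = [12, 11, 10, 9, 9, 8, 8, 7, 7, 6, 5, 3]  # 12 papers

print(h_index(few_highly_cited))       # 3: capped by the number of papers
print(h_index(many_moderately_cited))  # 7
```

In this invented example the researcher with three very highly cited papers cannot score higher than 3, while the researcher with twelve moderately cited papers reaches 7. That is the sense in which a comparison by h-index rewards sheer output.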

Reference: Tijdink, J. K., Rijcke, S. De, Vinkers, C. H., Smulders, Y. M., & Wouters, P. (2014). Publicatiedrang en citatiestress. Nederlands Tijdschrift Voor Geneeskunde, 158, A7147.

Metrics in research assessment under review

This week the Higher Education Funding Council for England (HEFCE) published a call to gather “views and evidence relating to the use of metrics in research assessment and management”. The council has established an international steering group which will perform an independent review of the role of metrics in research assessment. The review is supposed to contribute to the next installment of the Research Excellence Framework (REF) and will be completed in spring 2015.

Interestingly, two members of the European ACUMEN project are members of the 12-person steering group: Mike Thelwall (professor of cybermetrics at Wolverhampton University) and myself. The group is led by James Wilsdon, professor of Science and Democracy at the Science Policy Research Unit (SPRU) at the University of Sussex. The London School of Economics scholar Jane Tinkler, co-author of the book The Impact of the Social Sciences, is also a member and has put together some reading material on their blog. So there will be ample input from the social sciences to analyze both the promises and the pitfalls of using metrics in the British research assessment procedures. The British clearly see this as an important issue. The creation of the steering group was announced by the British minister for universities and science, David Willetts, at the Universities UK conference on April 3. In addition to science & technology studies experts, the steering group consists of scientists from the most important stakeholders in the British science system.

At CWTS, we responded enthusiastically to the invitation by HEFCE to contribute to this work, because this approach resonates so well with the CWTS research programme. The review will focus on: identifying useful metrics for research assessment; how metrics should be used in research assessment; ‘gaming’ and strategic use of metrics; and the international perspective.

All the important questions about metrics have been put on the table by the steering group, among others:

–       What empirical evidence (qualitative or quantitative) is needed for the evaluation of research, research outputs and career decisions?

–       What metric indicators are useful for the assessment of research outputs, research impacts and research environments?

–       What are the implications of the disciplinary differences in practices and norms of research culture for the use of metrics?

–       What evidence supports the use of metrics as good indicators of research quality?

–       Is there evidence for the move to more open access to the research literature to enable new metrics to be used or enhance the usefulness of existing metrics?

–       What evidence exists around the strategic behaviour of researchers, research managers and publishers responding to specific metrics?

–       Has strategic behaviour invalidated the use of metrics and/or led to unacceptable effects?

–       What are the risks that some groups within the academic community might be disproportionately disadvantaged by the use of metrics for research assessment and management?

–       What can be done to minimise ‘gaming’ and ensure the use of metrics is as objective and fit-for-purpose as possible?

The steering group also calls for evidence on these issues from other countries. If you wish to contribute evidence to the HEFCE review, please make it clear in your response whether you are responding as an individual or on behalf of a group or organisation. Responses should be sent by noon on Monday 30 June 2014. The steering group will consider all responses received by this deadline.


