The Leiden manifesto in the making: proposal of a set of principles on the use of assessment metrics in the S&T indicators conference

Summary

A set of guiding principles (a manifesto) on the use of quantitative metrics in research assessment was proposed by Diana Hicks (Georgia Tech) during a panel session on quality standards for S&T indicators at the STI conference in Leiden last week. Various participants in the debate agreed on the responsibility of the scientometric community to better support the use of scientometrics. Finding the choice of specific indicators too constraining, many voices supported the idea of jointly publishing a set of principles that should guide the responsible use of quantitative metrics. The session also included calls for scientometricians to take a more proactive role as engaged and responsible stakeholders in the development and monitoring of metrics for research assessment, as well as in wider debates on data governance, such as infrastructure and ownership.

At the close of the conference, the association of scientometric institutes ENID (European Network of Indicators Designers), with Ton van Raan as president, offered to play a coordinating role in writing up and publishing a consensus version of the manifesto.

Full report of the plenary session at the 2014 STI conference in Leiden on Quality standards for evaluation: Any chance of a dream come true?

The need to debate these issues has come to the forefront in light of reports that the use of certain easy-to-use and potentially misleading metrics for evaluative purposes has become a routine part of academic life, despite misgivings within the profession itself about their validity. A central aim of the special session was to discuss the need for a concerted response from the scientometric community to produce more explicit guidelines and expert advice on good scientometric practices. The session continued from the 2013 ISSI and STI conferences in Vienna and Berlin, where full plenary sessions were convened on the need for standards in evaluative bibliometrics, and on the ethical and policy implications of individual-level bibliometrics.

This year’s plenary session started with a summary by Ludo Waltman (CWTS) of the pre-conference workshop on technical aspects of advanced bibliometric indicators. The workshop, co-organised by Ludo, was attended by some 25 participants. Topics addressed included 1. Advanced bibliometric indicators (strengths and weaknesses of different types of indicators; field normalization; country-level and institutional-level comparisons); 2. Statistical inference in bibliometric analysis; and 3. Journal impact metrics (strengths and weaknesses of different journal impact metrics; use of the metrics in the assessment of individual researchers). The workshop discussions were very fruitful and some common ground was found, but significant differences in opinion also remained. Topics that need further discussion include the technical and mathematical properties of indicators (e.g., ranking consistency); strong correlations between indicators; the need to distinguish between technical issues and usage issues; purely descriptive approaches vs. statistical approaches; and the importance of user perspectives for technical aspects of indicator production. There was a clear interest in continuing these discussions at a future conference. The slides of the workshop are available on request.

Ludo’s summary was followed by a short talk by Sarah de Rijcke (CWTS), to set the scene for the ensuing panel discussion. Sarah provided a historical explanation for why previous responses by the scientometric community about misuses of performance metrics and the need for standards have fallen on deaf ears. Evoking Paul Wouters’ and Peter Dahler-Larsen’s introductory and keynote lectures, she argued that the preferred normative position of scientometrics (‘We measure, you decide’) and the tendency to provide upstream solutions no longer serve the double role of the field very well. As an academic as well as a regulatory discipline, scientometrics not only creates reliable knowledge on metrics, but also produces social technologies for research governance. As such, evaluative metrics attain meaning in a certain context, and they also help shape that context. Though parts of the community now acknowledge that there is indeed a ‘social’ problem, ethical issues are often either conveniently bracketed off or ascribed to ‘users lacking knowledge’. This reveals an unease with taking any other-than-technical responsibility. Sarah plugged the idea of a short joint statement on proper uses of evaluative metrics, proposed at the international workshop at OST in Paris (12 May 2014). She concluded with a plea for a more long-term reconsideration of the field’s normative position. If the world of research governance is indeed a collective responsibility, then scientometrics should step up and accept its part. This would put the community in a much better position to engage productively with stakeholders in the process of developing good practices.

In the ensuing panel discussion, Stephen Curry (professor of Structural Biology at Imperial College, London, and member of the HEFCE steering group) expressed deep concern about the seductive power of metrics in research assessment and saw a shared, collective responsibility for the creation and use of metrics on the part of bibliometricians, researchers and publishers alike. Thus, according to him, the technical and usage aspects of indicators should not be artificially separated.

Lisa Colledge (representing Elsevier as Snowballmetrics project director) talked about the Snowballmetrics initiative, and presented it as a bottom-up and practical approach with the goal of meeting the needs of funding organizations and senior university management. According to Lisa, while it primarily addresses research officers, feedback from the academic bibliometrics community is highly appreciated, as it contributes to empowering indicator users.

Stephanie Haustein (University of Montreal) was not convinced that social media metrics (a.k.a. altmetrics) lend themselves to standardization, due to the heterogeneity of data sources (tweets, views, downloads) and their constantly changing nature. She stated that the meaning of altmetrics data is highly ambiguous (attention vs. significance) and that a quality control mechanism similar to the peer review system in scientific publishing does not yet exist.

Jonathan Adams (Chief Scientist at Digital Science) endorsed the idea of drawing up a statement but emphasized that it would have to be short, precise and clear in order to catch the attention of government bodies, funding agencies and senior university management, who are uninterested in technical details. Standards will also have to keep pace with rapid change (data availability, technological innovation). He was critical of any fixed set of indicators, since this would not accommodate the strategic interests of every organization.

Diana Hicks (Georgia Institute of Technology) presented a first draft of a set of statements (the “Leiden Manifesto”), which she proposed should be published in a top-tier journal like Nature or Science. The statements are general principles on how scientometric indicators should be used, for example, ‘Metrics properly used support assessments; they do not substitute for judgment’ or ‘Metrics should align with strategic goals’.

In the ensuing debate, many participants in the audience proposed initiatives and identified problems that need to be solved. These were partially summarized by Paul Wouters, who identified four issues around which the debate revolved. First, he proposed that a central issue is the connection between assessment procedures and the primary process of knowledge creation. If this connection is severed, assessments lose part of their usefulness for researchers and scholars.

The second question is what kind of standards are desirable. Who sets them? How open are they to new developments and different stakeholders? How comprehensive and transparent are, or should, standards be? What interests and assumptions are embedded within them? In the debate it became clear that scientometricians do not want to determine the standards themselves. Yet standards are being developed by database providers and universities, which are now busy building new research information systems. Wouters proposed that the scientometric community set as its goal to monitor and analyze evolving standards. This could help to better understand problems and pitfalls, and also provide technical documentation.

The third issue highlighted by Wouters is the question of who is responsible. While the scientometric community cannot assume full responsibility for all evaluations in which scientometric data and indicators play a role, it can certainly broaden its agenda. Perhaps an even more fundamental question is how public stakeholders can remain in control of the responsibility for publicly funded science when more and more meta-data is being privatized. Wouters pleaded for strengthening the public nature of the meta-data infrastructure, including current research information systems, publication databases and citation indexes. This view does not deny the important role of for-profit companies, which are often more innovative. Fourth, Wouters suggested that taking these issues together provides an inspiring collective research agenda for the scientometrics community.

Diana Hicks’ suggestion of a manifesto or set of principles was followed up on the second day of the STI conference at the annual meeting of ENID (European Network of Indicators Designers). The ENID assembly, with Ton van Raan as president, offered to play a coordinating role in writing up the statement. Diana Hicks’ draft will serve as a basis, and it will also be informed by opinions from the community, important stakeholders and intermediary organisations, as well as those affected by evaluations. The debate on standardization and use will be continued in upcoming science policy conferences, with a session confirmed for the AAAS (San José, February) and sessions expected at the STI and ISSI conferences in 2015.

(Thanks to Sabrina Petersohn for sharing her notes of the debate.)

Ismael Rafols (Ingenio (CSIC-UPV) & SPRU (Sussex); Session chair); Sarah de Rijcke (CWTS, Leiden University); Paul Wouters (CWTS, Leiden University)

A key challenge: the evaluation gap

What are the best ways to evaluate research? This question has received renewed interest in both the United Kingdom and the Netherlands. The dynamics of research, the increased availability of data about research, and the rise of the web as an infrastructure for research lead to this question being revisited regularly. The UK funding agency HEFCE has established a steering group to evaluate the evidence on the potential role of performance metrics in the next instalment of the Research Excellence Framework. The British ministry of science suspects that metrics may help to alleviate the pressure of the large-scale assessment exercise on the research community. In the Netherlands, a new Standard Evaluation Protocol has been published in which the number of publications is no longer an independent criterion for the performance of research groups. Like the British, the Dutch are putting much more emphasis on societal relevance than in previous assessment protocols. However, whereas the British are exploring new ways to make the best use of metrics, the Dutch pressure group Science in Transition is calling for an end to the bibliometric measurement of research performance.

In 2010 (three years before Science in Transition was launched), we started to formulate our new CWTS research programme and took a different perspective on research evaluation. We defined the key issue not as a problem of too many or too few indicators. Neither do we think that using the wrong indicators (either bibliometric or peer review based) is the fundamental problem, although misinterpretation or misuse of indicators certainly does happen. The most important issue is the emergence of a more fundamental gap between, on the one hand, the dominant criteria in scientific quality control (in peer review as well as in metrics approaches), and, on the other hand, the new roles of research in society. This was the main point of departure for the ACUMEN project, which aimed to address this gap (and has recently delivered its report to the European Commission).

This “evaluation gap” results in discrepancies at two levels. First, research has a variety of missions: to produce knowledge for its own sake; to help define and solve economic and social problems; to create the knowledge base for further technological and social innovation; and to give meaning to current cultural and social developments. These different missions are strongly interrelated and can often be served within one research project. Yet they do require different forms of communication and articulation work. The work needed to accomplish these missions is certainly not limited to the publication of articles in specialized scientific journals. Yet it is this type of work that figures most prominently in research evaluations. This has the paradoxical effect that the requirement to be more active in “valorization” and other forms of society-oriented scientific work is piled on top of the requirement to excel in publishing high-impact articles and books. No wonder a lot of Dutch researchers regularly show signs of burnout (Tijdink, Rijcke, Vinkers, Smulders, & Wouters, 2014; Tijdink, Vergouwen, & Smulders, 2012). Hence, there is a need for diversification of quality criteria and a more refined set of evaluation criteria that take into account the real research mission of the group or institute being evaluated (instead of an ideal-typical research mission that is actually not much more than a pipe dream).

Second, research has become a huge enterprise, with enormous amounts of research results and an increased complexity of interdisciplinary connections between fields. The current routines in peer review cannot keep up with this vast increase in scale and complexity. Sometimes there is a lack of sufficient numbers of peers to check the quality of new research. In addition, new forms of peer review of data quality are in increasing demand. A number of experiments with new forms of review have been developed in response to these challenges. A common solution in massive review exercises (such as the REF in the UK or the judgement of large EU programmes) is the bureaucratization of peer review. This effectively turns the substantive orientation of expert peer judgment into a procedure in which the main role of experts is ticking boxes and checking whether researchers have fulfilled their procedural requirements. Will this in the long run undermine the nature of peer review in science? We do not really know.

A possible way forward would be to re-assess the balance between qualitative and quantitative judgement of quality and impact. The fact that the management of large scientific organizations requires lots of empirical evidence, and therefore also quantitative indicators, does not mean that these indicators should inevitably take the lead. The fact that the increased social significance of scientific and scholarly research means that researchers should be evaluated does not mean that evaluation should always be a formalized procedure in which researchers participate whether they like it or not. According to the Danish scholar Peter Dahler-Larsen, the key characteristic of “the evaluation society” is that evaluation has become a profession in itself and has become detached from the primary process that it evaluates (Dahler-Larsen, 2012). We are dealing with “evaluation machines”, he argues. The main operation of these machines is to make everything “fit to be evaluated”. Because of this social technology, individual researchers or individual research groups are not able to evade evaluation without jeopardizing their careers. At the same time, there is also a good non-Foucauldian reason for evaluation: evaluation is part of the democratic accountability of science.

This may be the key point in re-thinking our research evaluation systems. We must solve a dilemma: on the one hand we need evaluation machines because science has become too important and too complex to do without them, and on the other hand evaluation machines tend to take on a life of their own and reshape the dynamics of research in potentially harmful ways. Therefore, we need to re-establish the connection between evaluation machines in science and the expert evaluation that is already driving the primary process of knowledge creation. In other words, it is fine that research evaluation has become so professionalized that we have specialized experts in addition to the researchers involved. But this evaluation machine should be organized in such a way that the evaluation process becomes a valuable component of the very process of knowledge creation that it wants to evaluate. This, I think, is the key challenge for both the new Dutch evaluation protocol and the British REF.

Would it be possible to enjoy this type of evaluation?

References:

Dahler-Larsen, P. (2012). The Evaluation Society. Stanford, CA: Stanford University Press.

Tijdink, J. K., Rijcke, S. De, Vinkers, C. H., Smulders, Y. M., & Wouters, P. (2014). Publicatiedrang en citatiestress. Nederlands Tijdschrift Voor Geneeskunde, 158, A7147.

Tijdink, J. K., Vergouwen, A. C. M., & Smulders, Y. M. (2012). De gelukkige wetenschapper. Nederlands Tijdschrift Voor Geneeskunde, 156, 1–5.

The new Dutch research evaluation protocol

From 2015 onwards, the societal impact of research will be a more prominent measure of success in the evaluation of research in the Netherlands. Less emphasis will be put on the number of publications, while vigilance about research integrity will be increased. These are the main elements of the new Dutch Standard Evaluation Protocol, which was published a few weeks ago.

The new protocol aims to guarantee, improve, and make visible the quality and relevance of scientific research at Dutch universities and institutes. Three aspects are central: scientific quality; societal relevance; and the feasibility of the research strategy of the research groups involved. As is already the case in the current protocol, research assessments are organized by institution, and the institutional board is responsible for them. Nationwide comparative evaluations by discipline are possible, but to realize this the institutions involved have to agree explicitly to organize their assessments in a coordinated way. In contrast to performance-based funding systems, the Dutch system does not have a tight coupling between assessment outcomes and research funding.

This does not mean that research assessments in the Netherlands have no consequences. On the contrary, these may be quite severe, but they will usually be implemented by the university management with considerable leeway for interpreting the assessment results. The main channel through which Dutch research assessments have implications is the reputation gained or lost by the research leaders involved. The effectiveness of the assessments is often decided by the way the international committee performing the evaluation works. If its members see it as their main mission to celebrate their nice Dutch colleagues (as has happened in the recent past), the results will be complimentary but not necessarily very informative. On the other hand, they may also punish groups by using criteria that are not actually valid for those specific groups, although they may be standard for the discipline as a whole (this has also happened, for example when book-oriented groups work in a journal-oriented discipline).

The protocol does not include a uniform set of requirements or indicators. The specific mission of the research institutes or university departments under assessment is the guiding element. As a result, research that is mainly aimed at having practical impact may be evaluated with different criteria than a group that aims to work at the international frontier of basic research. The protocol is unified not around substance but around procedure. Each group has to be evaluated every six years. Another new element in the protocol is that the assessment scale has been changed from a five-point to a four-point scale, ranging from “unsatisfactory” via “good” and “very good” to “excellent”. This scale will be applied to all three dimensions: scientific quality, societal relevance, and feasibility.

The considerable freedom that the peer committees have in evaluating Dutch research has been maintained in the new protocol. It therefore remains to be seen what the effects of the novel elements in the protocol will be. In assessing the societal relevance of research, the Dutch are following their British peers. Research groups will have to construct “narratives” which explain the impact their research has had on society, understood broadly. It is not yet clear how these narratives will be judged on the scale. The criteria for feasibility are even less clear: according to the protocol, a group has an “excellent” feasibility if it is “excellently equipped for the future”. Well, we’ll see how this works out.

With less emphasis on the number of publications in the new protocol, the Dutch universities, the funding agency NWO and the academy of science KNAW (who are collectively responsible for the protocol) have also responded to the increased anxiety about “perverse effects” in the research system triggered by the ‘Science in Transition’ group, and to recent cases of scientific fraud. The Dutch minister of education, culture and the sciences, Jet Bussemaker, welcomed this change. “Productivity and speed should not be leading considerations for researchers”, she said at the reception of the new protocol. I fully agree with this statement, yet this aspect of the protocol will also have to stand the test of practice. In many ways, the number of publications is still a basic building block of scientific or scholarly careers. For example, the h-index is very popular in the medical sciences (Tijdink, Rijcke, Vinkers, Smulders, & Wouters, 2014). This index combines the number of publications of a researcher with the citation impact of those publications, in such a way that the h-index can never be higher than the total number of publications. This means that if researchers are compared according to the h-index, the most productive ones will prevail. We will have to wait and see whether the new evaluation protocol will be able to withstand this type of reward for high levels of article production.
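To make this productivity ceiling concrete: a researcher’s h-index is the largest number h such that h of their papers have each been cited at least h times. A minimal sketch in Python (the function name and the sample citation counts are illustrative, not drawn from the protocol):

```python
def h_index(citations):
    """Return the largest h such that at least h papers
    have at least h citations each."""
    h = 0
    # Rank papers from most to least cited; h is the last rank
    # at which the citation count still reaches the rank.
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# A researcher with 5 papers cited [10, 8, 5, 4, 3] times has h = 4:
# four papers have at least 4 citations, but not five with at least 5.
print(h_index([10, 8, 5, 4, 3]))  # → 4
```

Since h can never exceed the length of the citation list, two researchers with identical citation rates but different publication counts will generally differ in h; this is the mechanism by which the index rewards sheer output.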

Reference: Tijdink, J. K., Rijcke, S. De, Vinkers, C. H., Smulders, Y. M., & Wouters, P. (2014). Publicatiedrang en citatiestress. Nederlands Tijdschrift Voor Geneeskunde, 158, A7147.

Metrics in research assessment under review

This week the Higher Education Funding Council for England (HEFCE) published a call to gather “views and evidence relating to the use of metrics in research assessment and management” http://www.hefce.ac.uk/news/newsarchive/2014/news87111.html. The council has established an international steering group which will perform an independent review of the role of metrics in research assessment. The review is expected to contribute to the next instalment of the Research Excellence Framework (REF) and will be completed in spring 2015.

Interestingly, two members of the European ACUMEN project http://research-acumen.eu/ are members of the 12-person steering group – Mike Thelwall (professor of cybermetrics at Wolverhampton University http://cybermetrics.wlv.ac.uk/index.html) and myself – and it is led by James Wilsdon, professor of Science and Democracy at the Science Policy Research Unit (SPRU) at the University of Sussex. The London School of Economics scholar Jane Tinkler, co-author of the book The Impact of the Social Sciences, is also a member and has put together some reading material on their blog http://blogs.lse.ac.uk/impactofsocialsciences/2014/04/03/reading-list-for-hefcemetrics/. So there will be ample input from the social sciences to analyze both the promises and the pitfalls of using metrics in the British research assessment procedures. The British clearly see this as an important issue. The creation of the steering group was announced by the British minister for universities and science, David Willetts, at the Universities UK conference on April 3 https://www.gov.uk/government/speeches/contribution-of-uk-universities-to-national-and-local-economic-growth. In addition to science & technology studies experts, the steering group consists of scientists from the most important stakeholders in the British science system.

At CWTS, we responded enthusiastically to the invitation by HEFCE to contribute to this work, because this approach resonates so well with the CWTS research programme http://www.cwts.nl/pdf/cwts_research_programme_2012-2015.pdf. The review will focus on: identifying useful metrics for research assessment; how metrics should be used in research assessment; ‘gaming’ and strategic use of metrics; and the international perspective.

All the important questions about metrics have been put on the table by the steering group, among them:

- What empirical evidence (qualitative or quantitative) is needed for the evaluation of research, research outputs and career decisions?

- What metric indicators are useful for the assessment of research outputs, research impacts and research environments?

- What are the implications of the disciplinary differences in practices and norms of research culture for the use of metrics?

- What evidence supports the use of metrics as good indicators of research quality?

- Is there evidence that the move to more open access to the research literature enables new metrics to be used or enhances the usefulness of existing metrics?

- What evidence exists around the strategic behaviour of researchers, research managers and publishers responding to specific metrics?

- Has strategic behaviour invalidated the use of metrics and/or led to unacceptable effects?

- What are the risks that some groups within the academic community might be disproportionately disadvantaged by the use of metrics for research assessment and management?

- What can be done to minimise ‘gaming’ and ensure the use of metrics is as objective and fit-for-purpose as possible?

The steering group also calls for evidence on these issues from other countries. If you wish to contribute evidence to the HEFCE review, please make it clear in your response whether you are responding as an individual or on behalf of a group or organisation. Responses should be sent to metrics@hefce.ac.uk by noon on Monday 30 June 2014. The steering group will consider all responses received by this deadline.


On citation stress and publication pressure

Our article on citation stress and publication pressure in biomedicine went online this week – co-authored with colleagues from the Free University and University Medical Centre Utrecht:

Tijdink, J.K., S. de Rijcke, C.H. Vinkers, Y.M. Smulders, P.F. Wouters, 2014. Publicatiedrang en citatiestress: De invloed van prestatie-indicatoren op wetenschapsbeoefening. Nederlands Tijdschrift voor Geneeskunde 158: A7147.

* Dutch only *

Tales from the field: On the (not so) secret life of performance indicators

* Guest blog post by Alex Rushforth *

In the coming months Sarah de Rijcke and I will present at conferences in Valencia and Rotterdam on research from CWTS’s nascent EPIC working group. We very much look forward to drawing on collaborative work from our ongoing ‘Impact of indicators’ project on biomedical research in University Medical Centers (UMCs) in the Netherlands. One of our motivations behind the project is that there has been a wealth of social science literature in recent times about the effects of formal evaluation in public sector organisations, including universities. Yet too few studies have taken seriously the presence of indicators in the context of one of the university’s core missions: knowledge creation. Fewer still have taken an ethnographic lens to the dynamics of indicators in the day-to-day context of academic knowledge work. These are deficits we hope to begin addressing through these conferences and beyond.

The puzzle we will be addressing appears – at least at first glance – straightforward enough: what is the role of bibliometric performance indicators in the biomedical knowledge production process? Yet, comparing provisional findings from two contrasting case studies of research groups from the same UMC – one a molecular biology group and the other a statistics group – it quickly becomes apparent that there can be no general answer to this question. We therefore aim not only to provide an inventory of the different ‘roles’ of indicators in these two cases, but also to pose the more interesting analytical question of what conditions and mechanisms explain the observed variations in the roles indicators come to perform.

Owing to their persistent recurrence in the data so far, the indicators we will analyze are the journal impact factor, the h-index, and ‘advanced’ citation-based bibliometric indicators. It should be stressed that our focus on these particular indicators has emerged inductively from observing first-hand the metrics that research groups attend to in their knowledge-making activities. So what have we found so far?

Dutch UMCs constitute particularly apt sites through which to explore this problem, given how central bibliometric assessments have been to the formal evaluations carried out since the UMCs’ inception in the early 2000s. On one level, it can be argued that researchers in both cases encounter such metrics as ‘governance/managerial devices’, that is, as forms of information required of them by external agencies on whom they rely for resources and legitimacy. Examples can be seen when funding applications, annual performance appraisals, or job descriptions demand such information about an individual’s or group’s past performance. As the findings will show, the information the two groups need to produce their work effectively, and the types of demands made on them by ‘external’ agencies, vary considerably despite their common location in the same UMC. This is one important reason why the role of indicators differs between the cases.

However, this coercive ‘power over’ account is but one dimension of a satisfying answer to our question about the role of indicators. Emerging analysis also reveals the surprising discovery that in fields characterized by particularly integrated forms of coordination and standardization (Whitley, 2000) – like our molecular biologists – indicators in fact have the propensity to function as a core feature of the knowledge-making process. For instance, a performance indicator like the journal impact factor was routinely mobilized informally in researchers’ decision-making, as an ad hoc standard against which to evaluate the likely usefulness of information and resources, and in deciding whether time and resources should be spent pursuing them. By contrast, in the less centralized and integrated field of statistical research, such an indicator was not so indispensable to the routines of knowledge-making activities. In the case of the statisticians it is possible to speculate that indicators are more likely to emerge intermittently, as conditions to be met for gaining social and cultural acceptance by external agencies, but are less likely to inform day-to-day decisions. Through our ongoing analysis we aim to unpack further how disciplinary practices interact with the organisation of Dutch UMCs to produce quite varied engagements with indicators.

The extent to which indicators play central or peripheral roles in research production processes across academic contexts is an important sociological problem, one that must be posed in order to enhance understanding of the complex role of performance indicators in academic life. We feel much of the existing literature on the evaluation of public organisations has tended to paint an exaggerated picture of formal evaluation and research metrics as synonymous with empty ritual and legitimacy (e.g. Dahler-Larsen, 2012). Our emerging results show that, at least in the realm of knowledge production, the picture is more subtle. This theoretical insight prompts us to suggest that further empirical studies are needed of scholarly fields with different patterns of work organisation, in order to compare our results and develop middle-range theorizing on the mechanisms through which metrics infiltrate knowledge production processes to fundamental or peripheral degrees. In the future this could mean venturing into fields far outside biomedicine, such as history, literature, or sociology. For now, though, we look forward to expanding the biomedical project by conducting analogous case studies in a second UMC.

It is through such theoretical developments that we can consider not only the appropriateness of one-size-fits-all models of performance evaluation, but also unpack and problematize discourses about what constitutes ‘misuse’ of metrics. Indeed, how convinced should we be that academic life is now saturated and dominated by deleterious metric indicators?

References

Dahler-Larsen, P. (2012). The Evaluation Society. Stanford, CA: Stanford Business Books, an imprint of Stanford University Press.

Whitley, R. (2000). The Intellectual and Social Organization of the Sciences. Oxford: Oxford University Press.

How does science go wrong?

We are happy to announce that our abstract has been accepted for the 2014 Conference of the European Consortium for Political Research (ECPR), which will be held in Glasgow from 3–6 September. Our paper has been selected for a panel on ‘The role of ideas and indicators in science policies and research management’, organised by Luis Sanz-Menéndez and Laura Cruz-Castro (both at CSIC-IPP).

Title of our paper: How does science go wrong?

“Science is in need of fundamental reform.” In 2013, five Dutch researchers took the lead in what they hope will become a strong movement for change in the governance of science and scholarship: Science in Transition. SiT appears to voice concerns heard beyond national borders about the need for change in the governance of science (cf. The Economist 19 October 2013; THE 23 Jan. 2014; Nature 16 Oct. 2013; Die Zeit 5 Jan. 2014). One of the most hotly debated concerns is quality control, and it encompasses the implications of a perceived increasing publication pressure, purported flaws in the peer review system, impact factor manipulation, irreproducibility of results, and the need for new forms of data quality management.

One could argue that SiT landed on fertile ground. In recent years, a number of severe fraud cases drew attention to possible ‘perverse effects’ in the management system of science and scholarship. Partly due to the juicy aspects of most cases of misconduct, these debates tend to focus on ‘bad apples’ and shy away from more fundamental problems in the governance of science and scholarship.

Our paper articulates how key actors construct the notion of ‘quality’ in these debates, and how they respond to each other’s position. By making these constructions explicit, we shift focus back to the self-reinforcing ‘performance loops’ that most researchers are caught up in at present. Our methodology is a combination of the mapping of the dynamics of media waves (Vasterman, 2005) and discourse analysis (Gilbert & Mulkay, 1984).

References

A revolutionary mission statement: improve the world. Times Higher Education, 23 January 2014.

Chalmers, I., Bracken, M. B., Djulbegovic, B., Garattini, S., Grant, J., Gülmezoglu, A. M., Oliver, S. (2014). How to increase value and reduce waste when research priorities are set. The Lancet, 383(9912), 156–165.

Gilbert, G. N., & Mulkay, M. J. (1984). Opening Pandora’s Box. A Sociological Analysis of Scientists’ Discourse. Cambridge: Cambridge University Press.

Research evaluation: Impact. (2013). Nature, 502(7471), 287.

Rettet die Wissenschaft!: “Die Folgekosten können hoch sein.” Die Zeit, 5 January 2014.

Trouble at the lab. The Economist, 19 October 2013.

Vasterman, P. L. M. (2005). Media-Hype. European Journal of Communication, 20(4), 508–530.
