CWTS part of H2020 COST Action to stimulate integrity and responsible research

Good news came our way recently! Thed van Leeuwen, Paul Wouters and I will be part of an EC-funded H2020 COST Action on Promoting Integrity as an Integral Dimension of Excellence in Research (PRINTEGER). Main applicants Hub Zwart and Willem Halffman (Radboud University Nijmegen) brought together highly skilled partners for this network from the Free University Brussels, the University of Tartu (Estonia), Oslo and Akershus University College, Leiden University, and the Universities of Bonn, Bristol, and Trento.

The primary goal of the COST Action is to encourage a research culture that treats integrity as an integral part of doing research, rather than as an externally driven steering mechanism. Our starting point: in order to stimulate integrity and responsible research, new forms of governance are needed that are firmly grounded in and informed by research practice.

Concretely, the work in the project will consist of A) a systematic review of integrity cultures and practices; B) an analysis and assessment of current challenges, pressures, and opportunities for research integrity in a demanding and rapidly changing research system; and C) the development and testing of tools and policy recommendations enabling key players to effectively address issues of integrity, directed specifically at science policy makers, research managers, and future researchers.

CWTS will contribute to the network with:

  • A bibliometric analysis of ‘traces of fraud’ (e.g. retracted articles, manipulative editorials, non-existent authors and papers, fake journals, bogus conferences, non-existent universities), against the background of general shifts in publication patterns, such as changing co-authoring practices, instruments as authors, or the rise of hyper-productive authors;
  • Two in-depth case studies of research misconduct, focusing not on the evident or spectacular cases but on the dilemmas and conflicts that occur in grey areas. Every partner will provide two cases; ours will most likely focus on the questionable integrity of journal editors (for example, cases of impact factor manipulation);
  • Acting as task leader on the formulation of advice for research support organisations, including advice on IT tools. This task will draw conclusions from the research on the operation of the research system, specifically publication infrastructures such as journals, libraries, and data repositories;
  • Like all other partners in the network, setting up small local advisory panels consisting of five to ten key stakeholders of the project: research policy makers, research leaders or managers, research support organisations, and early career scientists. These panels will meet for a scoping consultation at the start of the project, for a halfway consultation to discuss intermediate results and further choices to be made, and for a near-end consultation to test the pertinence of tools and advice at a point where we can still make changes to accommodate stakeholder input.

Ethics and misconduct – Review of a play organized by the Young Academy (KNAW)

This is a guest blog post by Joost Kosten. Joost is a PhD student at CWTS and a member of the EPIC working group. His research focuses on the use of research indicators from the perspective of public policy. Joost obtained an MSc in Public Administration (Leiden University) and was also trained in Political Science (Stockholm University) and Law (VU University Amsterdam).

Scientific (mis)conduct – The sins, the drama, the identification

On Tuesday, 18 November 2014, the Young Academy of the Royal Netherlands Academy of Arts and Sciences organized a performance of the play Gewetenschap by Tony Maples at Leiden University. Pandemonia Science Theater is currently touring the Netherlands to perform the piece at several universities. Gewetenschap was inspired by troubles with respect to ethics and integrity that recently occurred in Dutch science and scholarship. Although these troubles concerned grave violations of the scientific code of conduct (i.e., the cardinal sins of fraud, fabrication, and plagiarism), the play focuses on common dilemmas in a researcher’s everyday life. The title Gewetenschap is a neologism combining the Dutch words geweten (conscience) and wetenschap (science).

The playwright used confidential interviews with members of the Young Academy to gain insight into the ethical dilemmas researchers most frequently have to deal with. Professor Karin de Zwaan is a research group leader who has hardly any time to do research herself. She puts much effort into organizing grants, attracting new students and running her research group. Post-doc Jeroen Dreef is a very active researcher who does not have enough time to take his organizational responsibilities seriously. A tenure track is all he wants. Given their other important activities, Karin and Jeroen hardly have any time to supervise PhD student Lotte. One could question the type of support they do give her.

Judging by the reactions to several scenes, the audience clearly recognized the topics presented. Afterwards, prof. Bas Haring presented the dilemmas touched upon during the play, and the audience discussed the following topics:

  • Is there a conflict between the research topics a researcher personally prefers and those the research group expects her/him to work on?
  • In one of the scenes, the researchers are delighted because a publication has been accepted. Haring asks whether that exhibits “natural behaviour”: shouldn’t a researcher be happy with good results rather than with a publication being accepted? One of the participants replies that a publication functions as a reward.
  • What do you do with your data? Is endlessly applying a diversity of analysis methods until you find nice results a responsible approach?
  • What about impact factors (IF)? Bas Haring himself says his IF is 0: “Do you think I am an idiot?” What role do numbers such as the IF play in your opinion of colleagues? There seems to be quite a diversity of opinions. An early career researcher says everyone knows these numbers are nonsense. An experienced scientist points out that there is a correlation between scores and quality. Someone else expresses his optimism, since he expects that the focus on numbers will be over within ten years. This causes another to respond that in the past there was competition too, but in a different way.
  • When is someone a co-author? This question results in a lively debate. Apparently, there are considerable differences from field to field. In the medical fields, a co-authorship can be a way to express gratitude to people who have played a vital role in a research project, such as those who could organize experimental subjects. In this way, a co-authorship becomes a tradeable commodity. A medicine professor points out that in his field, co-authorships can be used to compare a curriculum vitae with the development of a researcher’s status, and can thus serve as a criterion in judging grant proposals: a good researcher should start with co-authorships in the first position, later hold co-authorships somewhere between the first and last author, and end his career with papers on which he is co-author in the last position. Thus, the further the career has developed, the closer the name should move to the end of the author list. Another participant states that one can deal with co-authorships in three different ways: 1. co-authors always take full responsibility for everything in the paper; 2. similar to the credits at the end of a movie, co-authors clarify what each co-author’s contribution was; 3. only those who really contributed to writing the paper can be co-authors. The participant admits that this last proposal works in his own field but might not work in other fields.
  • Can a researcher exaggerate his findings when presenting them to journalists? Should you keep control over a journalist’s work in order to avoid him presenting things differently? Is it acceptable to present untrue information to support your case, just because a proper scientific argumentation would be too complex for the man in the street?
  • Is it acceptable to present your work as having more societal relevance than you really expect it to have? One of the reactions is that researchers are forced to express the societal relevance of their work when they apply for a grant, while given the very nature of scientific research it is hardly possible to clearly indicate what society will gain from the results.
  • What does a good relationship between a PhD student and a supervisor look like? What is a good balance between serving the interests of PhD students, serving organizational interests (e.g. securing the future of the organization by attracting new students and grants), and the researcher’s own interests?

The discussion did not concentrate on the following dilemmas presented in Gewetenschap:

  • To what extent are the requirements for grant proposals contradictory? On the one hand, researchers are expected to think ‘out of the box’, while on the other hand they should meet a large number of requirements. Moreover, should one propose new ideas, with the risks they entail, or is it better to walk the beaten path in order to guarantee success?
  • Should colleagues who did not show respect be repaid in kind when you have a chance to review their work? Should you always judge scientific work on its merits? Are there any principles of ‘due process’ that should guide peer review?
  • Who owns the data if someone who contributed to them moves to another research group or institute?


The new Dutch research evaluation protocol

From 2015 onwards, the societal impact of research will be a more prominent measure of success in the evaluation of research in the Netherlands. Less emphasis will be put on the number of publications, while vigilance about research integrity will be increased. These are the main elements of the new Dutch Standard Evaluation Protocol, which was published a few weeks ago.

The new protocol aims to guarantee, improve, and make visible the quality and relevance of scientific research at Dutch universities and institutes. Three aspects are central: scientific quality, societal relevance, and the feasibility of the research strategy of the groups involved. As is already the case in the current protocol, research assessments are organized by institution, and the institutional board is responsible. Nationwide comparative evaluations by discipline are possible, but to realize this the institutions involved have to agree explicitly to organize their assessments in a coordinated way. In contrast to performance-based funding systems, the Dutch system does not tightly couple assessment outcomes to research funding.

This does not mean, however, that research assessments in the Netherlands have no consequences. On the contrary, these may be quite severe, but they will usually be implemented by the university management with considerable leeway for interpretation of the assessment results. The main channel through which Dutch research assessments have implications is the reputation gained or lost by the research leaders involved. The effectiveness of the assessments is often decided by the way the international committee that performs the evaluation goes about its work. If the committee sees it as its main mission to celebrate its nice Dutch colleagues (as has happened in the recent past), the results will be complimentary but not necessarily very informative. On the other hand, a committee may also punish groups by using criteria that are not actually valid for those specific groups, even though they may be standard for the discipline as a whole (this has also happened, for example when book-oriented groups work in a journal-oriented discipline).

The protocol does not include a uniform set of requirements or indicators; the specific mission of the research institutes or university departments under assessment is the guiding principle. As a result, research that is mainly aimed at practical impact may be evaluated with different criteria than work from a group that aims to operate at the international frontier of basic research. The protocol is unified not around substance but around procedure. Each group has to be evaluated every six years. Another new element in the protocol is that the assessment scale has been changed from a five-point to a four-point scale, ranging from “unsatisfactory” via “good” and “very good” to “excellent”. This scale will be applied to all three dimensions: scientific quality, societal relevance, and feasibility.

The considerable freedom that the peer committees have in evaluating Dutch research has been maintained in the new protocol. It therefore remains to be seen what the effects of the novel elements will be. In assessing the societal relevance of research, the Dutch are following their British peers: research groups will have to construct “narratives” that explain the impact their research has had on society, understood broadly. It is not yet clear how these narratives will be judged on the new scale. The criteria for feasibility are even less clear: according to the protocol, a group has “excellent” feasibility if it is “excellently equipped for the future”. Well, we’ll see how this works out.

With less emphasis on the number of publications in the new protocol, the Dutch universities, the funding agency NWO, and the academy of sciences KNAW (who are collectively responsible for the protocol) have also responded to the increased anxiety about “perverse effects” in the research system, triggered by the ‘Science in Transition’ group and by recent cases of scientific fraud. The Dutch minister of education, culture and science, Jet Bussemaker, welcomed this change. “Productivity and speed should not be leading considerations for researchers”, she said at the reception of the new protocol. I fully agree with this statement, yet this aspect of the protocol will also have to stand the test of practice. In many ways, the number of publications is still a basic building block of scientific or scholarly careers. For example, the h-index is very popular in the medical sciences (Tijdink, De Rijcke, Vinkers, Smulders, & Wouters, 2014). This index combines the number of publications of a researcher with the citation impact of those publications, in such a way that the h-index can never be higher than the total number of publications. This means that if researchers are compared according to the h-index, the most productive ones will prevail. We will have to wait and see whether the new evaluation protocol will be able to withstand this type of reward for high levels of article production.
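As an aside, the arithmetic behind this is easy to make concrete. The sketch below is purely illustrative (the citation counts are made up); it shows why the h-index is capped by the number of publications and therefore favours productive authors.

```python
def h_index(citation_counts):
    """Largest h such that at least h publications have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # this paper still supports a larger h
        else:
            break     # counts are sorted, so no later paper can qualify
    return h

# Five papers with 100 citations each still give h = 5: however highly
# cited, the index cannot exceed the number of publications.
print(h_index([100, 100, 100, 100, 100]))  # 5
print(h_index([10, 8, 5, 4, 3]))           # 4
```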

Reference: Tijdink, J. K., De Rijcke, S., Vinkers, C. H., Smulders, Y. M., & Wouters, P. (2014). Publicatiedrang en citatiestress. Nederlands Tijdschrift voor Geneeskunde, 158, A7147.

How does science go wrong?

We are happy to announce that our abstract has been accepted for the 2014 Conference of the European Consortium for Political Research (ECPR), which will be held in Glasgow from 3 to 6 September. Our paper has been selected for a panel on ‘The role of ideas and indicators in science policies and research management’, organised by Luis Sanz-Menéndez and Laura Cruz-Castro (both at CSIC-IPP).

Title of our paper: How does science go wrong?

“Science is in need of fundamental reform.” In 2013, five Dutch researchers took the lead in what they hope will become a strong movement for change in the governance of science and scholarship: Science in Transition. SiT appears to voice concerns heard beyond national borders about the need for change in the governance of science (cf. The Economist 19 October 2013; THE 23 Jan. 2014; Nature 16 Oct. 2013; Die Zeit 5 Jan. 2014). One of the most hotly debated concerns is quality control, which encompasses the implications of perceived increasing publication pressure, purported flaws in the peer review system, impact factor manipulation, irreproducibility of results, and the need for new forms of data quality management.

One could argue that SiT landed in fertile ground. In recent years, a number of severe fraud cases drew attention to possible ‘perverse effects’ in the management system of science and scholarship. Partly due to the juicy aspects of most cases of misconduct, these debates tend to focus on ‘bad apples’ and shy away from more fundamental problems in the governance of science and scholarship.

Our paper articulates how key actors construct the notion of ‘quality’ in these debates, and how they respond to each other’s position. By making these constructions explicit, we shift focus back to the self-reinforcing ‘performance loops’ that most researchers are caught up in at present. Our methodology is a combination of the mapping of the dynamics of media waves (Vasterman, 2005) and discourse analysis (Gilbert & Mulkay, 1984).

References

A revolutionary mission statement: improve the world. Times Higher Education, 23 January 2014.

Chalmers, I., Bracken, M. B., Djulbegovic, B., Garattini, S., Grant, J., Gülmezoglu, A. M., & Oliver, S. (2014). How to increase value and reduce waste when research priorities are set. The Lancet, 383(9912), 156–165.

Gilbert, G. N., & Mulkay, M. J. (1984). Opening Pandora’s Box. A Sociological Analysis of Scientists’ Discourse. Cambridge: Cambridge University Press.

Research evaluation: Impact. (2013). Nature, 502(7471), 287.

Rettet die Wissenschaft!: “Die Folgekosten können hoch sein.” Die Zeit, 5 January 2014.

Trouble at the lab. The Economist, 19 October 2013.

Vasterman, P. L. M. (2005). Media-Hype. European Journal of Communication, 20(4), 508–530.

May university rankings help uncover problematic or fraudulent research?

Can one person manipulate the position of a whole university in a university ranking such as the Leiden Ranking? The answer is, unfortunately, sometimes yes – provided the processes of quality control in journals do not function properly. A Turkish colleague recently alerted us to the position of Ege University in the most recent Leiden Ranking in the field of mathematics and computer science. This university, not previously known as one of the prestigious Turkish research universities, ranks second with an astonishing PP(top 10%) value of almost 21%. In other words, 21% of the mathematics and computer science publications of Ege University belong to the top 10% most frequently cited in their field, which means that Ege University is supposed to have produced twice as many highly cited papers as expected. Only Stanford University performed better.
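For readers unfamiliar with the indicator: PP(top 10%) is the proportion of a university’s publications that belong to the 10% most frequently cited publications in their field, so a value near 10% is what one would expect of an average university. A toy sketch with simulated citation counts, deliberately ignoring the field and publication-year normalization and the fractional counting of ties used in the real Leiden Ranking:

```python
import numpy as np

def pp_top10(field_citations, univ_citations):
    """Simplified PP(top 10%): share of a university's publications whose
    citation counts exceed the field's 90th percentile."""
    threshold = np.percentile(field_citations, 90)
    return float(np.mean(np.asarray(univ_citations) > threshold))

rng = np.random.default_rng(1)
field = rng.negative_binomial(1, 0.2, size=50_000)  # skewed, citation-like counts
university = rng.choice(field, size=210)            # a 210-publication portfolio
print(f"PP(top 10%) = {pp_top10(field, university):.1%}")  # close to 10%
```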

In mathematics and computer science, Ege University produced 210 publications (Stanford wrote almost ten times as many). Because this is a relatively small number of publications, the reliability of the ranking position is fairly low, which is indicated by a broad stability interval (an indication of the uncertainty in the measurement). Of the 210 Ege University publications, no fewer than 65 were created by one person, a certain Ahmet Yildirim. This is an extremely high productivity for only four years in this specialty. Moreover, the Yildirim publications are indeed responsible for the high ranking of Ege University: without them, Ege University would rank around position 300 in this field, which is probably a much better reflection of its performance. Yildirim’s publications have attracted 421 citations, excluding self-citations. Mathematics is not a very citation-dense field, so this level of citations can strongly influence both the PP(top 10%) and the MNCS indicators.
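As I understand it, such stability intervals are obtained by bootstrapping: resampling the university’s publication set with replacement and recomputing the indicator on each resample. A rough sketch of that idea, with invented numbers rather than CWTS data or code; it shows why a 210-paper portfolio dominated by one author yields a wide interval.

```python
import numpy as np

def stability_interval(citations, indicator, n_boot=1000, seed=0):
    """Bootstrap a stability interval: resample the publication set with
    replacement and recompute the indicator on each resample."""
    rng = np.random.default_rng(seed)
    pubs = np.asarray(citations)
    values = [indicator(rng.choice(pubs, size=pubs.size, replace=True))
              for _ in range(n_boot)]
    return np.percentile(values, [2.5, 97.5])

# Hypothetical portfolio: 210 papers, 65 of them (one author's) cited far
# above a made-up field top-10% threshold of 12 citations.
portfolio = np.concatenate([np.full(65, 30),
                            np.random.default_rng(2).poisson(2, 145)])
top10_share = lambda cits: float(np.mean(cits > 12))
print(stability_interval(portfolio, top10_share))  # wide interval around ~31%
```

Duplicating or dropping a handful of the highly cited papers in a resample moves the indicator substantially, hence the broad interval for small portfolios.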

An investigation into Yildirim’s publications has not yet started, as far as we know, but suspicions of fraud and plagiarism are rising, both in Turkey and abroad. One of his publications, in the journal Mathematical Physics, has recently been retracted by the journal because of evident plagiarism (pieces of an article by a Chinese author were copied and presented as original). Interestingly, the author has not agreed with this retraction. A fair number of Yildirim’s publications appeared in journals with a less than excellent track record in quality control. The Elsevier journal Computers & Mathematics with Applications (11 articles by Yildirim) recently retracted an article by a different author because it turned out to have “no scientific content”; it was, in fact, an almost empty publication. According to Retraction Watch, the journal’s editor Ervin Rodin was replaced at the end of last year. He was also relieved of his editorial position at the journal Applied Mathematics Letters – An International Journal of Rapid Publication, another Elsevier journal. Rodin was also editor of Mathematical and Computer Modelling, in which Yildirim published 5 articles. The latter journal currently does not accept any submissions “due to an editorial reconstruction”.

How did Yildirim’s publications attract so many citations? His 65 publications are cited by 285 publications, giving 421 citations in total, and this group of publications shows strong internal citation traffic. Together, the cited and citing publications have attracted almost 1,200 citations, of which a bit more than half were generated within the group itself. In other words, this set of publications seems to represent a closely knit group of authors, though not one completely isolated from other authors. If we look at the universities citing Ege University, none of them rank highly in the Leiden Ranking, with the exception of Penn State University (at position 112), which cited Yildirim once. If we zoom in on mathematics and computer science, virtually none of the citing universities rank highly either, with the exception of Penn State (1 publication) and Gazi University (also 1 publication). The rank position of the latter university, by the way, is not very reliable either, as indicated by a stability interval almost as wide as that of Ege University.
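Computing such an internal citation share is straightforward once the citation links are known. A minimal sketch, assuming a hypothetical data model in which links are (citing, cited) pairs of publication ids:

```python
def internal_citation_share(group, links):
    """Share of the citations received by `group` publications that come
    from publications in the same group. `links` is an iterable of
    (citing_id, cited_id) pairs."""
    total = internal = 0
    for citing, cited in links:
        if cited in group:
            total += 1
            if citing in group:
                internal += 1
    return internal / total if total else 0.0

# Toy example: p1 and p2 cite each other, outsider p3 cites p1.
links = [("p1", "p2"), ("p2", "p1"), ("p3", "p1")]
print(internal_citation_share({"p1", "p2"}, links))  # 2/3 of citations internal
```

A share above one half, as in the case described here, is what signals a closely knit citing community.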

The bibliometric evidence allows for two different conclusions. One is that Yildirim is a member of a community which works closely together on an important mathematical problem. The alternative interpretation is that this group is a distributed citation cartel which not only exchanges citations but also produces very similar publications in journals that function mainly as citation-generating devices. A cursory look at a sample of the publications and the way the problems are formulated seems to support the second interpretation more than the first.

But from this point, the experts in mathematics should take over. Bibliometrics is currently not able to properly distinguish sense from nonsense in scientific publications. Expertise in the field is required for this task. We have informed the rector of Ege University that the ranking of his university is doubtful and requested more information from him about the position of the author. We have not yet received a reply. If Ege University wishes to be taken seriously, it should start a thorough investigation of the publications by Yildirim and his co-authors.

If you see other strange rankings in our Leiden Ranking or in any other ranking, please do notify us. It may help us create better tools to uncover fraudulent behaviour in academic scholarship.

Fraud in Flemish science

Almost half of Flemish medical researchers have witnessed a form of scientific fraud in their direct environment. One in twelve have themselves engaged in data fraud or in “massaging” data to make the results fit the hypothesis. Many mention “publication pressure” as an important cause of this behaviour. This is the outcome of the first public survey among Flemish medical researchers about scientific fraud, conducted in November and December 2012 by the journal Eos. Joeri Tijdink, who had conducted a similar survey among medical professors in the Netherlands, supervised the Flemish survey.

It is not clear to what extent the survey results are representative of the conduct of all medical researchers in Flanders. The survey was distributed through the deans of the medical faculties in the form of an anonymous questionnaire. The response rate was fairly low: 19% of the 2,548 researchers responded, and 315 (12%) filled it in completely. Yet the results indicate that fraud may be a much more serious problem than is usually acknowledged in the Flemish scientific system. Since the installation of the Flemish university committees on scientific integrity, no more than four cases of scientific misconduct have been recognized (three involved plagiarism; one researcher committed fraud). This is clearly lower than expected. The survey, however, consistently reports a higher incidence of scientific misconduct than comparable international surveys do. For example, 14% of researchers report having witnessed misconduct according to a meta-study by Daniele Fanelli, against 47% in Flanders. Internationally, 2% of researchers admit to having been involved in data massage or fraud themselves, whereas in Flanders this is 8%. The discrepancy can be explained in two ways: either the university committees are not yet effective in getting out the truth, or the survey is biased towards researchers who have witnessed misconduct in some way. Given that both explanations seem plausible, the gap between the survey results and the formal record of misconduct in Flanders may best be explained by a combination of both mechanisms. After all, it is hard to understand why Flemish medical researchers would be more (or less) prone to misconduct than medical researchers in, say, the Netherlands, the UK, or France.

According to Eos, publication pressure is one of the causes of misconduct. This still remains to be proven. However, both in the earlier survey by Tijdink and Smulders, and in this survey, a large number of researchers mention “publication pressure” as a driving factor. As has been argued in the Dutch debate about the fraud by psychologist Diederik Stapel, the mentioning of “publication pressure” as a cause may be motivated by a desire for legitimation. After all, all researchers are pressured to publish on a regular basis, while a small minority is involved in misconduct (as far as we know now). So the response may be part of a justification discourse, rather than a causal analysis. My own intuition is that the problem is not publication pressure, but reputation pressure, a subtle but important difference. Nevertheless, if a large minority (47% of the Flemish respondents for example) of researchers point to “publication pressure” as a cause of misconduct, we may have a serious problem in the scientific system, whether or not these researchers are right. A problem that can no longer be ignored.

Literature:

Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS ONE, 4(5), e5738. doi:10.1371/journal.pone.0005738

Tijdink, J. K., Vergouwen, A. C. M., & Smulders, Y. M. (2012). Nederlands Tijdschrift voor Geneeskunde, 156, A5715.
