The evidence on the Journal Impact Factor

The San Francisco Declaration on Research Assessment (DORA), see our most recent blogpost, focuses on the Journal Impact Factor, published in the Web of Science by Thomson Reuters. It is a strong plea to base research assessments of individual researchers, research groups and submitted grant proposals not on journal metrics but on article-based metrics combined with peer review. DORA cites a few scientometric studies to bolster this argument. So what is the evidence we have about the JIF?

In the 1990s, the Norwegian researcher Per Seglen, based at our sister institute the Institute for Studies in Higher Education and Research (NIFU) in Oslo and a number of CWTS researchers (in particular Henk Moed and Thed van Leeuwen) developed a systematic critique of the JIF, its validity as well as the way it is calculated (Moed & Van Leeuwen, 1996; Moed & Leeuwen, 1995; Seglen, 1997). This line of research has since blossomed in a variety of disciplinary contexts, and has identified three main reasons not to use the JIF in research assessments of individuals and research groups.

First, although the JIF value of a particular journal depends on the aggregated citation rates of its individual articles, the JIF cannot be used as a stand-in for the latter in research assessments. This is because a small number of articles are cited very heavily, while a large number of articles are only cited once in a while, and some are not cited at all. This skewed distribution is a general phenomenon in citation patterns and it holds for all journals. Therefore, if a researcher has published an article in a high impact journal, this does not mean that her particular piece of research will also have a high impact.
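To make the effect of a skewed distribution concrete, here is a minimal sketch in Python. The citation counts are invented purely for illustration and do not come from any real journal; the point is only that one or two heavily cited articles can pull the journal average far above what a typical article in that journal receives.

```python
from statistics import mean, median

# Hypothetical citation counts for ten articles in one journal
# (illustrative numbers only, not real data).
citations = [0, 0, 1, 1, 2, 2, 3, 4, 12, 75]

journal_mean = mean(citations)      # what a journal-level average reflects
journal_median = median(citations)  # what a "typical" article receives
below_mean = sum(1 for c in citations if c < journal_mean)

print(f"mean citations per article:   {journal_mean}")
print(f"median citations per article: {journal_median}")
print(f"articles below the mean:      {below_mean} of {len(citations)}")
```

In this made-up example, eight of the ten articles are cited less often than the journal average suggests.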

Second, fields differ strongly in their usual JIF values. A field with a rapid turn-over of research publications and long reference lists (such as fields in biomedical research) will tend to have much higher JIF values for its journals than a field with short reference lists in which older publications remain relevant much longer (such as fields in mathematics). Moreover, smaller fields will usually have a smaller number of journals, resulting in fewer possibilities to publish in high-impact journals. As a result, it does not make sense to compare JIF values across fields. Although virtually everybody knows this, an implicit comparison is often still prevalent. This is for example the case when publications are compared on their JIF values in multi-disciplinary settings (such as in reviews of grant proposals).

Third, the way in which the JIF is calculated in the Web of Science has a number of technical characteristics due to which the JIF can be gamed relatively easily by journal editors. The JIF is the total number of citations received by a journal's publications from the previous two years, divided by the number of “citeable publications” it published in those two years. Some publications do not count as “citeable” although they do contribute to the total number of citations if cited. By increasing the relative share of these publications in the journal, the editor can try to artificially increase the journal's JIF value. This can also be accomplished by increasing the number of publications that are more frequently cited, such as review articles, long articles, or clinical trials. Last, the editor can try to convince or pressure submitting authors to cite more publications in the journal itself. All three forms of manipulation occur, although we do not really know how frequently. Sometimes, the manipulation is plainly visible. Editors have been writing editorials about their citation impact, citing all publications in the past two years in their own journal, admonishing authors to increase their JIF!
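The mismatch between numerator and denominator can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual Web of Science procedure: the function names and all numbers are hypothetical, and the "consistent" variant is shown only for comparison.

```python
def jif_style_ratio(cites_to_citable, cites_to_non_citable, n_citable):
    """All citations count in the numerator, but only 'citeable'
    items count in the denominator (simplified illustration)."""
    return (cites_to_citable + cites_to_non_citable) / n_citable


def consistent_ratio(cites_to_citable, cites_to_non_citable,
                     n_citable, n_non_citable):
    """Comparison: the same item types in numerator and denominator."""
    total_cites = cites_to_citable + cites_to_non_citable
    return total_cites / (n_citable + n_non_citable)


# Hypothetical journal: 100 'citeable' articles receiving 200 citations,
# plus 20 editorials and letters receiving 30 citations.
jif = jif_style_ratio(200, 30, 100)
fair = consistent_ratio(200, 30, 100, 20)

print(f"JIF-style ratio:  {jif:.2f}")
print(f"consistent ratio: {fair:.2f}")
```

In this invented case the cited editorials and letters lift the ratio without ever appearing in the denominator, which is the loophole described above.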

A more generic problem with using the JIF in research assessment is that not all fields have meaningful JIF values, since they are only based on those journals in the Web of Science that have their JIF calculated. Scholarly fields focusing on books or technical designs are disadvantaged in evaluations in which the JIF is important.

In response to these problems, five main journal impact indicators have been developed as an improvement upon, or alternative to, the JIF. First, the CWTS Journal to Field Impact Score (JFIS) indicator improves upon the JIF because it does away with the difference in the numerator and denominator regarding “citeable items” and because it takes field differences in citation density into account. Second, the SCImago Journal Rank (SJR) indicator follows the same logic as Google’s PageRank algorithm: citations from highly cited journals have more influence than citations from lowly cited ones. SCImago, based in Madrid, calculates the SJR not on the basis of the Web of Science but on the basis of the Scopus citation database (published by Elsevier). A similar logic is applied in two other journal impact indicators from the Eigenfactor.org research project, based at the biology department of the University of Washington (Seattle): the Eigenfactor and the Article Influence Score (AIS). These are often calculated on the basis of the Web of Science and use a ‘citation window’ of five years (citations to an article in the previous five years count), whereas this is two years in JIF and three years in SJR.

The fifth journal impact indicator is computed on the basis of Scopus by CWTS: the Source Normalized Impact per Paper indicator (SNIP) (invented by Henk Moed and further developed by Nees Jan van Eck, Thed van Leeuwen, Martijn Visser and Ludo Waltman (Waltman, Eck, Leeuwen, & Visser, 2012)). This indicator also weights citations but not on the basis of the number of citations to the citing journal, but on the basis of the number of references in the citing article. Basically, the citing paper is seen as giving out one vote which is distributed over all cited papers. As a result, a citation from a paper with 10 references adds 1/10th to the citation frequency, whereas a citation from a paper with 100 references adds only 1/100th. The effect is that the SNIP indicator cancels out differences across fields in citation density (though certainly not all relevant differences between disciplines, such as the amount of work that is needed to publish an article). The Eigenfactor also uses this principle in its implementation of the PageRank algorithm.
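The "one vote per citing paper" principle can be sketched as follows. This is not the SNIP formula itself, only an illustration of the weighting idea described above; the function name and the example numbers are assumptions made for the sake of the sketch.

```python
from fractions import Fraction


def weighted_citations(reference_list_lengths):
    """Each citing paper distributes one vote over its references, so a
    citation from a paper with n references counts as 1/n.
    `reference_list_lengths` holds one entry per citing paper."""
    return sum(Fraction(1, n) for n in reference_list_lengths)


# Hypothetical paper A, in a field with short reference lists:
# cited by 3 papers that each have 10 references.
paper_a = weighted_citations([10] * 3)

# Hypothetical paper B, in a field with long reference lists:
# cited by 30 papers that each have 100 references.
paper_b = weighted_citations([100] * 30)

print(paper_a, paper_b)
```

Paper B has ten times as many raw citations as paper A, yet under this weighting both end up with the same score, which is how differences in citation density between fields are cancelled out.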

The improved journal impact indicators do solve a number of problems that have emerged in the use of the JIF. Nevertheless, careless use of the journal impact indicators in research assessments is not justified. All journal impact indicators are in the end based on the number of citations to the individual articles in the journal. The correlation is however too weak to legitimize the application of some journal indicator instead of the assessment of the articles themselves if one wishes to evaluate those articles. Whenever the journal indicators take the differences between fields into account, the number of citations to sets of articles produced by research groups as a whole tends to show a somewhat stronger correlation with the journal indicators. Still, the statistical correlation remains very modest. Research groups tend to publish across a whole range of journals with both high and lower impact factors. It will therefore usually be much more accurate to analyze the influence of these bodies of work rather than fall back on the journal indicators.

To sum up, the bibliometric evidence confirms the main thrust of DORA: it is not sensible to use the JIF or any other journal impact indicator as a predictor of the citedness of a particular paper or set of papers. But does this mean, as DORA seems to suggest, that journal impact factors do not make any sense at all? Here I think DORA is wrong. At the level of the journal the improved impact factors do give interesting information about the role and position of the journal, especially if this is combined with qualitative information about the peer review process, an analysis of who is citing the journal and in which context, and its editorial policies. No editor would want to miss the opportunity to use the analysis of its role in the scientific communication process, and journal indicators can play an informative, supporting, role. Also, it makes perfect sense in the context of research evaluation to take into account whether a researcher has been able to publish in a high quality scholarly journal. But journal impact factors should not rule the world.

Literature:

Moed, H. F., & Van Leeuwen, T. N. (1996). Impact factors can mislead. Nature, 381(6579), 186.

Moed, H. F., & Van Leeuwen, T. N. (1995). Improving the accuracy of Institute for Scientific Information’s journal impact factors. JASIS, 46(6), 461–467. Retrieved from http://www.iem.ac.ru/~kalinich/rus-sci/ISI-CI-IF.pdf

Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ (Clinical research ed.), 314(7079), 498–502. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2126010&tool=pmcentrez&rendertype=abstract

Waltman, L., Van Eck, N. J., Van Leeuwen, T. N., & Visser, M. (2013). Some modifications to the SNIP journal impact indicator. Journal of Informetrics, 1–20. Retrieved from http://www.sciencedirect.com/science/article/pii/S1751157712001010

Acknowledgement:

I would like to thank Thed van Leeuwen and Ludo Waltman for their comments on an earlier draft of this post.

DORA – a stimulus for a new evaluation culture in science

We should urgently improve the ways in which the output of research is assessed by universities and funding agencies. Therefore, the dominance of the Journal Impact Factor in these evaluations should be terminated. This is the gist of a call published a week ago by a large group of prominent researchers and research institutes, the San Francisco Declaration on Research Assessment (DORA). This new initiative started at a conference in San Francisco last December, organized by the American Society for Cell Biology. This origin shows in the list of signatories and in the accompanying editorials. The declaration went live together with an editorial in Science and editorials in life science journals such as EMBO Journal, Molecular Biology of the Cell, eLife, and Traffic. At the moment of writing, more than a thousand individual researchers have signed the declaration as well as over a hundred scientific institutions. Among them are AAAS, Wellcome Trust, EMBO, HEFCE, PNAS, PLOS and the Open Knowledge Foundation.

DORA has mostly been welcomed by experts in scientometrics and bibliometrics, science policy and leaders of academic institutions, and rightly so. This is not because they are declared enemies of the Journal Impact Factor, but because of the narrow-mindedness of assessment systems centered around one indicator, which by definition can only capture a narrow slice of relevant dimensions in the assessment of scientific performance. DORA focuses on JIF, produced by Thomson Reuters in their Journal Citation Reports, but some of the arguments also hold for performance indicators in general. The strength of DORA is its plea for the recognition of the diversity of types of scientific output. This should be met by a diversity of measures, both qualitative and quantitative. Moreover, the increasingly web-based style of working in science and scholarship enables more advanced and refined forms of measures of production, impact, and influence than the often rather crude approximation in indicators such as JIF (but this depends on what one wants to measure!).

DORA cites the critique of JIF as it has been developed in the decades of bibliometric and science policy research since the early 1990s. The main problems mentioned are the strong variation of JIF values across fields, due to which it does not make sense to compare JIF values in different fields or even sub-fields; the skewed distribution of the number of citations over the articles within a journal, due to which one cannot see the average as correlated to the prospective citation scores of an article; and the relatively easy ways in which JIF can be gamed by journal editors. This body of research is fairly well summarized, albeit not cited in a comprehensive way.

The main weaknesses of DORA show in the specific recommendations and in some confusion with respect to specific problems of JIF and more generic problems of performance indicators. For example, DORA seems to want to do away entirely with journal based indicators while it recommends additional journal indicators at the same time. (More on this in a next post.)

Yet, the main thrust of DORA is in line with the need to correct for, or warn against, too much reliance on formalized indicators in a lot of universities and institutes. This may have developed at the expense of a well-balanced form of informed peer review, although we also should not underestimate the large amount of very well-designed evaluation work that is being conducted every day. Of course, peer review itself must also be kept honest by, among others, well-developed indicators of a variety of dimensions of the process of knowledge creation (such as network positions and gender relationships).

Last year, CWTS published its new research program. One of the main themes is precisely the urgent need to innovate the current systems of research assessment and the related need to support this with a new research agenda in scientometrics. (More on this in a next post). Also, at CWTS we are coordinating the European research project ACUMEN, which aims to support researchers in their evaluation moments by a portfolio of qualitative and quantitative evidence which is valid and reliable at the level of the individual researcher. This project is a large-scale collaboration with a host of scientometric, webometric and science policy experts and researchers. And we know that many of our colleagues are thinking along the same lines. So it should definitely be possible to build a strong coalition in favor of evaluation practices that are more conducive to the further development of science and creativity.

Next post: a summary of the evidence on JIF

Worldwide diversification of research continues

Last Wednesday, we published the new edition of the Leiden Ranking. The results are quite interesting. The range of countries with universities that score high on their number of highly cited publications is increasing. Thirteen countries are now listed in the top hundred of the world: the US (57 universities), UK (16), Switzerland and the Netherlands (each 6), China (4), Singapore, Canada and Germany (each 2), and Israel, Denmark, Ireland, South Korea and Australia (each with 1 university).

Clearly, the US is still dominating. The first 12 universities are all based in the US. Like last year, MIT is leading the ranking with no less than one quarter of its publications among the 10% most cited of their field (in this calculation, we also take into account the publication year). The largest research university in the world, Harvard, is number five with an impressive one-fifth of its papers published between 2008 and 2011 scoring in the 10% most cited papers of their field. Note that when the option “fractional counting” is checked, a paper is attributed in equal fractions to all universities mentioned in the author addresses. This prevents double counting, but does not reflect the total number of papers originating from a university. For example, Harvard has produced almost 57,000 papers, but many of them with other universities, which results in a “fractionalized” number of almost 30,000 papers, of which one-fifth scores in the 10% most cited segment.
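The difference between full and fractional counting can be sketched as follows. This is a minimal illustration that assumes each paper is split equally over the distinct universities in its address list; the actual procedure behind the Leiden Ranking may differ in detail, and the function name and the tiny set of example papers are invented.

```python
from collections import defaultdict
from fractions import Fraction


def count_papers(papers):
    """Return (full, fractional) paper counts per university.
    `papers` is a list of address lists, one list per paper."""
    full = defaultdict(int)
    fractional = defaultdict(Fraction)
    for universities in papers:
        unique = set(universities)
        for university in unique:
            full[university] += 1                              # whole paper for everyone
            fractional[university] += Fraction(1, len(unique))  # equal share
    return dict(full), dict(fractional)


# Four hypothetical papers (illustrative only).
papers = [
    ["Harvard"],
    ["Harvard", "MIT"],
    ["Harvard", "MIT", "Leiden"],
    ["MIT"],
]
full, fractional = count_papers(papers)

print("full counts:      ", full)
print("fractional counts:", fractional)
print("sum of fractional counts:", sum(fractional.values()))
```

With full counting the four papers add up to seven across the three universities; with fractional counting the shares add up to exactly four, so no paper is counted twice.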

China is steadily increasing the impact of its research. Whereas in the recent past, China rose quickly in terms of the production of scientific papers but not so much in terms of scientific influence, we now see that research from Chinese universities is gaining citations. Two Chinese universities, Nankai and Hunan, are even scoring higher on the highly cited indicator than the highest ranking Dutch universities (Leiden University and Utrecht University). Almost 14.5% of their publications belong to the top 10% most cited in their field. The diversification also shows outside of the top 100 universities. For example, China has 37 universities in the Leiden Ranking 2013 (of which 6 are newcomers), Iran has 5 (all newcomers), and Brazil has 10 (2 newcomers). This trend is the result of three effects. First, many universities are increasing their share of the scientific production. Second, at the same time, the number of scientific papers is rising as such, which results in a steady increase of the size of the Web of Science database, on which the Leiden Ranking is based. Third, we have become better at correctly identifying universities in the address field of the scientific publications. We suspect, for example, that this contributes to the rise of Iran in the Leiden Ranking.

Of course, the ranking also shows areas in which the citation impact is lower than expected. What struck me is that the Japanese universities (including the prestigious Tokyo University) all score lower than the world average. This is also true for all universities from some of the newcomers such as Iran. But also, somewhat more surprisingly, for Norway, Brazil, Poland, Italy, Greece, Portugal, Russia, Turkey, and Taiwan.

Fraud in Flemish science

Almost half of Flemish medical researchers have witnessed a form of scientific fraud in their direct environment. One in twelve have themselves engaged in data fraud or in “massaging data” in order to make the results fit the hypothesis. Many mention “publication pressure” as an important cause of this behaviour. This is the outcome of the first public survey among Flemish medical researchers about scientific fraud. The survey was conducted in November and December 2012 by the journal Eos. Joeri Tijdink, who had conducted a similar survey in the Netherlands among medical professors, supervised the Flemish survey.

It is not clear to what extent the survey results are representative of the conduct of all medical researchers in Flanders. The survey was distributed through the deans of medical faculties in the form of an anonymous questionnaire. The response rate was fairly low (19% of the 2,548 researchers responded and 315 (12%) filled it in completely). Yet, the results indicate that fraud may be a much more serious problem than is usually acknowledged in the Flemish scientific system. Since the installation of Flemish university committees on scientific integrity, no more than 4 cases of scientific misconduct have been recognized (3 involved plagiarism; 1 researcher committed fraud). This is clearly lower than expected. The survey, however, consistently reports higher incidence of scientific misconduct than comparable international surveys do. For example, having witnessed misconduct is reported by 14% of researchers according to a meta-study by Daniele Fanelli, but in Flanders this is 47%. Internationally, 2% of researchers admit to having been involved themselves in data massage or fraud, whereas in Flanders this is 8%. The discrepancy can be explained in two ways. One is that the university committees are not yet effective in getting out the truth. The other is that this survey is biased towards researchers who have witnessed misconduct in some way. Given that both explanations seem plausible, the gap between the survey results and the formal record of misconduct in Flanders may best be explained by a combination of both mechanisms. After all, it is hard to understand why Flemish medical researchers would be more (or less) prone to misconduct than medical researchers in, say, the Netherlands, the UK, or France.

According to Eos, publication pressure is one of the causes of misconduct. This still remains to be proven. However, both in the earlier survey by Tijdink and Smulders, and in this survey, a large number of researchers mention “publication pressure” as a driving factor. As has been argued in the Dutch debate about the fraud by psychologist Diederik Stapel, the mentioning of “publication pressure” as a cause may be motivated by a desire for legitimation. After all, all researchers are pressured to publish on a regular basis, while a small minority is involved in misconduct (as far as we know now). So the response may be part of a justification discourse, rather than a causal analysis. My own intuition is that the problem is not publication pressure, but reputation pressure, a subtle but important difference. Nevertheless, if a large minority (47% of the Flemish respondents for example) of researchers point to “publication pressure” as a cause of misconduct, we may have a serious problem in the scientific system, whether or not these researchers are right. A problem that can no longer be ignored.

Literature:

Fanelli D (2009) How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data. PLoS ONE 4(5): e5738. doi:10.1371/journal.pone.0005738

Joeri K. Tijdink, Anton C.M. Vergouwen, and Yvo M. Smulders, Ned Tijdschr Geneeskd. 2012;156:A5715

Vacancy post-doctoral researcher

The Centre for Science and Technology Studies of the Faculty of Social Sciences of Leiden University wishes to announce a vacancy for the following position:

POST-DOCTORAL RESEARCHER (38 hours per week)

Vacancy number: 13-062

The Centre for Science and Technology Studies (CWTS)

The Centre for Science and Technology Studies (CWTS) is an interdisciplinary institute at Leiden University. Our research staff originates from many fields, varying from psychology, political science, literature studies and information science, to computer science, economics, physics and chemistry. We study the dynamics of science and its connections to technology and innovation. In other words, we study scientific and scholarly research from a scientific point of view. CWTS uses large databases that enable us to quantitatively discern the growth in scientific publications, patterns of collaboration, the impacts of science, and many other aspects of science such as scholarly communication and evidence-based performance assessment.

Our research is also used to provide high-quality services, via a university-owned company CWTS BV, to research institutes for evaluation of the impact of their publications and their standing in the international scientific community. In addition, we analyse the development of scientific careers, and the impact of research assessment on knowledge production, by way of mixed-methods research (including surveys and ethnographic methods).

Since 2012, we have focused our activities and interests within the framework of a new research program (www.cwts.nl/pdf/cwts_research_programme_2012-2015.pdf). CWTS has three chairs for full professors (Scientometrics; Science & Innovation studies; Science policy studies) as well as five working groups on key research themes (Advanced bibliometric methodologies; Evaluation practices in context; Social sciences and humanities; Scientific careers; Societal impact of research). The centre hosts a dynamic group of senior researchers and talented juniors who welcome collaboration with colleagues internationally and nationally. We can accommodate internships and provide students with supervision for Master’s and PhD theses.

Job description

We are inviting applications for a post-doctoral position in our new research program. The post-doctoral candidate is expected to carry out research in the context of the Evaluation Practices in Context (EPIC) working group at CWTS. This new line of research focuses on the implications of research assessment, and the performance criteria applied, for scientific and scholarly communication and knowledge production. The post-doc project will be drawn up in close consultation with prof.dr. Paul Wouters (Scientometrics chair) and dr. Sarah de Rijcke (EPIC working group leader). The post-doc will be encouraged to carry out comparative research with other EPIC group members. Results of the research will be disseminated through preparation of publications for a range of audiences.

Evaluation Practices in Context (EPIC)

The working group Evaluation Practices in Context (EPIC) examines the politics and practices of research evaluation in connection with contemporary forms of governance of research and scholarship. EPIC combines and contributes to theoretical frameworks and detailed empirical studies from Science and Technology Studies (STS) broadly defined (including scientometrics, and history, sociology and anthropology of science), organizational studies and higher education studies. The working group pays particular attention to the implications of research assessment, and the performance criteria applied, for scientific and scholarly communication and knowledge production. Important STS perspectives that we draw on have demonstrated that ‘science’ and ‘politics’ or ‘knowledge’ and ‘power’ should not be seen as separate spheres of action, but are involved in a constant process of mutual embedding and stabilization. Accordingly, our work analyzes the co-constitution of knowledge in relation to specific epistemic cultures, evaluation systems, publication practices, and governance contexts.

Profile post-doctoral researcher

We are looking for a prospective candidate with a PhD in the social sciences or humanities, preferably in science, technology and innovation studies or related fields (e.g. sociology, law, anthropology, political science, history of science, organizational studies, cultural studies). The candidate must have strong skills in designing, organizing and executing qualitative research, especially interviews and ethnographic fieldwork. Experience with computer-supported analysis (e.g. AtlasTI) is desirable but not necessary. Preference will be given to candidates with an academic drive who can provide clear evidence of, or potential for, international excellence in published research. The candidate should be able to work independently as well as cooperate in an interdisciplinary team. S/he should have verbal fluency in English and good written and verbal communication skills. Fluency in Dutch is considered an asset, but not a condition.

Appointment

We offer a temporary position as a researcher for a period of two years. Depending upon qualifications and experience, the gross monthly salary will be between €3227 and €4418 (scale 11), based on full time employment.

Benefits include pension contribution, annual holiday premium of 8% and an end-of-year premium of 8.3%. Non-Dutch nationals may be eligible for a substantial tax break (30% ruling).

Applicants should have the right to work in the Netherlands for the duration of the contract.

Additional Information

Further information about this position can be obtained from dr. Sarah de Rijcke, tel. +31 71 5276853 (office) or e-mail s.de.rijcke@cwts.leidenuniv.nl.

Application

Letters of application should be accompanied by a full curriculum vitae and two or three references.

Applications should reach the university by March 28, 2013 and can be sent electronically to our Human Resource Department at vacature@fsw.leidenuniv.nl.

When your application reaches us we will send you confirmation by e-mail. If you have not received a confirmation within three days after sending the e-mail, please phone us at +31 71 527 3427.

We will schedule interviews on the 3rd and 10th of April 2013.

Changing publication practices in the “confetti factory”

When do important reorientations or shifts in research agendas come about in scientific fields? A brief brainstorm led us to formulate three possible causes. First of all, a scarcity of resources can bring about shifts in research agendas, for instance on an institutional level (because research management decides on cutting the budgets of ill-performing research units). A second, related cause is the alignment of agendas through strategic (interdisciplinary) alliances, for the purpose of obtaining funding. A third cause for reconsideration of research agendas is a situation of crisis, for instance one brought about by large-scale scientific misconduct or by debates on undesirable consequences of measuring productivity only in terms of number of articles.

Zooming in on the latter point: the anxiety over the consequences of a culture of ‘bean counting’ seems to be growing. Unfortunately, solid analyses that tease out these exact consequences for the knowledge produced are rare. A recent contribution to the European Journal of Social Psychology does, however, offer such an analysis. In the article, appropriating Piet Vroon’s metaphor of the ‘exploded confetti factory’, professor Naomi Ellemers voices her concern over the production of increasing numbers of ever shorter articles in social psychology (a field in crisis), the decreasing number of references to books, and the very small 5-year citation window that researchers tend to stick to (cf. Van Leeuwen 2013). Ellemers laments the drift toward publishing very small isolated effects (robust, but meaningless), which leaves less and less room for ‘connecting the dots’, i.e. cumulative knowledge production. According to Ellemers, the current way of assessing productivity and research standing has the opposite effect: it leads to a narrowing of focus. Concentrating on the number of (preferably first-authored) articles in high impact journals does not stimulate social psychologists to aim for connection, but instead leads them to focus on ‘novelty’ and difference. A second way to attain more insight, build a solid knowledge base and generate new lines of research is through intra- and interdisciplinary cooperation, she argues. If her field really wants to tackle important problems in their full complexity – including the wider implications of specific findings – methodological plurality is imperative. Ellemers recommends that the field extends its existing collaborations – mainly with the ‘harder’ sciences – to also include other social sciences. A third way to connect the dots, and at least as important for ‘real impact’, is to transfer social-psychological insights to the general public:

“There is a range of real-life concerns we routinely refer to when explaining the focal issues in our discipline for the general public or to motivate the investment of tax payers’ money in our research programs. These include the pervasiveness of discrimination, the development and resolution of intergroup conflict, or the tendency toward suboptimal decision making. A true understanding of these issues requires that we go beyond single study observations, to assess the context-dependence of established findings, explore potential moderations, and examine the combined effect of different variables in more complex research designs, even if this is a difficult and uncertain strategy.” (Ellemers 2013, p. 5)

This also means, Ellemers specifies, that social psychologists perform more conceptual replications, and always specify how their own research fits in with and complements existing theoretical frameworks. It means that they should not refrain from writing meta-analyses and periodic reviews, and from including references to sources older than 10 years. This, Ellemers concludes, would all contribute to the goal of cumulative knowledge building, and would hopefully put an end to collecting unconnected findings, ‘presented in a moving window of references’.

What makes Ellemers’ contribution stand out is that she not only links recent debates about the reliability of social-psychological findings and ensuing ‘methodological fetishism’ to the current evaluation culture, but also that she doesn’t leave it at that. Ellemers subsequently outlines a research agenda for social psychology, in which she also argues for more methodological leniency, room for creativity and more comprehensive theory-formation about psychological processes and their consequences. Though calls for science-informed research management are also voiced in other fields and are certainly much needed, truly content-based evaluation procedures are very difficult to arrive at without substantive discipline-specific contributions like the one Ellemers provides.

Diversity in publication cultures II

As said in the previous post on the topic of diversity in publication cultures, the recent DJA publication, “Kennis over publiceren. Publicatietradities in de wetenschap”, presents interesting and valuable personal experiences. At the same time, the booklet tends to cut corners and make rather crude statements about the role of evaluation and indicators. Often, the individual life stories are not properly contextualized. For example, physicist Tjerk Oosterkamp claims that citation analysis is “not at all” appropriate for experimental physics. According to him, the use of citation scores in evaluation would encourage researchers to stick to “simple things” and shy away from more daring and risky projects. But is this true? Many initially risky projects attracted quite a lot of citations later. As far as I know, we do not yet have a lot of evidence about the effect of evaluations and performance indicators on risk behavior in science. We do indeed have some indications that researchers tend to avoid risky projects, especially in writing applications for externally funded projects. Yet, we do not know whether this means that researchers are taking fewer risks across the board.

Another objection is that citation patterns may reflect current fashions rather than the most valuable research. I think this is an important point. For example, the recent hype about graphene research in physics may prove to be less valuable than expected. Citations represent impact on the short-term communication within the relevant research communities. This is different from long-term impact on the body of knowledge. There is a relationship between the two types of impact, but they are certainly not identical.

A second example of cutting corners is the statement by the editors in one of the essays of the DJA publication that “there is not much support among scientists for bibliometric analysis” (p. 25). Well, to be honest, this varies quite strongly. In many areas in the natural and biomedical sciences quantitative performance analysis is actually quite hot. Also, we see a tendency in the humanities and social sciences to try to find a cure for the lack of publication data in Google Scholar, which often, albeit not always, has a much better coverage of these areas. Researchers in these fields are sometimes even willing to turn a blind eye to the quite considerable problems with the accuracy and reliability of these data. So, the picture is much more complicated than the image of bibliometrics being performed top-down on the unhappy researcher.

Notwithstanding these shortcomings, the DJA booklet presents important dilemmas and problems. Perhaps the legal scholar Carla Sieburgh presents the problem most clearly: quality can in the end only be judged by experts. However, there is no time to have external reviewers read all the material. Hence the shift towards measurement. But this tends to lead us away from the content. In every discipline, some solution of this dilemma needs to be found, probably by striking a discipline-specific balance between objectified analysis from outside and internalized quality control by experts. This search for the optimal balance is especially important in those fields where quality control has been introduced relatively recently.
