Science mapping: do we know what we visualize?

A recent landmark in the field of science mapping is Katy Börner’s Atlas of Science: Visualizing What We Know (MIT Press, 2010). The atlas recently won the ASIS&T Best Information Science Book Award 2011. The kinds of maps covered by the atlas range from historical timelines, network diagrams and citation networks revealing rises in patent citations, to geographic maps, taxonomic hierarchies and maps of the relative sizes and connectedness of scientific fields.

The advent of science mapping depends to a large extent on digitized indices of scholarly activity such as the Science Citation Index, and on advances in network analysis and visualization techniques. Bibliometric maps of scholarly activity are mostly based on bibliographic coupling, co-citation analysis or keyword co-occurrence networks. The resulting visualizations are transformations of quantified data into visual form. The avalanche of bibliometric data incorporated in massive databases demands new visualization tools and – crucially – the skills to understand and engage with these new kinds of visualizations.
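To make the difference between co-citation and bibliographic coupling concrete, here is a minimal sketch in Python. The paper identifiers and reference lists are invented toy data, and the function names are our own, not those of any mapping package.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of works it cites.
references = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C", "D"},
}

def co_citation_counts(references):
    """Count how often two cited works appear together in one reference list."""
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

def bibliographic_coupling(references):
    """Count the references shared by each pair of citing papers."""
    strengths = {}
    for p, q in combinations(sorted(references), 2):
        shared = len(references[p] & references[q])
        if shared:
            strengths[(p, q)] = shared
    return strengths

cocit = co_citation_counts(references)
coupling = bibliographic_coupling(references)
```

Co-citation links the cited works (A and B are cited together twice), while bibliographic coupling links the citing papers (P1 and P2 share two references). Either set of weighted pairs can serve as the network behind a map.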

Most bibliometric mapping endeavors radiate an ambition on the part of the scientist(s) producing these maps to be synthetic, comprehensive and definitive. Börner’s Atlas of Science, for instance, is said to chart “the trajectory from scientific concept to published results,” revealing “the landscape of what we know.” However, maps are not a direct reflection of reality: all sorts of decisions are taken in processing the data before they can be presented. While this may seem a matter of course, it does have consequences for the interpretation and use of these maps.

For example, what often gets glossed over in these endeavors is that visualizations of scientific developments also prescribe how these developments should be known in the first place. Science maps are produced by calculations performed on large amounts of ‘raw’ data, using particular statistical algorithms where others might have been chosen, and for this reason they are not simply ‘statistical information presented visually’. The choice for a particular kind of visualization is often connected to the specificities and meaning of the underlying dataset and the software used to process the data. Several software packages have been specifically designed for this purpose (the VOSviewer supported by CWTS being one of them). These packages prescribe how the data should be handled. Different choices in the selection and processing of the data sometimes lead to strikingly different maps. Therefore, we will increasingly need systematic experiments and studies with different forms of visual presentation (Tufte, 2006).
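A small sketch may illustrate how one processing choice can change a map. It compares raw co-occurrence counts with a cosine-style normalization (the co-occurrence count divided by the square root of the product of the two occurrence counts). The terms and counts are invented, and the normalization is chosen purely for illustration; we do not claim that any particular package works this way.

```python
from math import sqrt

# Hypothetical counts: the number of documents in which each term occurs,
# and the number of documents in which two terms occur together.
occurrences = {"A": 100, "B": 100, "C": 5, "D": 5}
cooccurrences = {("A", "B"): 20, ("C", "D"): 4}

def cosine_strength(pair, count):
    """Normalize a co-occurrence count by the sizes of both terms."""
    a, b = pair
    return count / sqrt(occurrences[a] * occurrences[b])

# Ranking by raw counts favors the two frequent terms.
raw_ranking = sorted(cooccurrences, key=cooccurrences.get, reverse=True)

# Ranking by normalized strength favors the two rare but tightly linked terms.
normalized = {pair: cosine_strength(pair, n) for pair, n in cooccurrences.items()}
normalized_ranking = sorted(normalized, key=normalized.get, reverse=True)
```

With raw counts the A-B link looks strongest; after normalization the C-D link does. A layout algorithm fed with one or the other would place these terms quite differently, from exactly the same data.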

At the same time, a number of interfaces are built into the mapping process, where an encounter takes place with a user who approaches these visualizations as evidence. But how do these users actually behave? To our knowledge, hardly any systematic research has been done on how users (bibliometricians, computer scientists, institute directors, policy makers and their staff, etc.) engage with these visualizations, and on which skills and strategies they need to do so. Critical scrutiny is needed of the degree of ‘visual literacy’ (Pauwels, 2008) demanded of users who want to work with and examine these visualizations critically. The visualizations embody technical or formal choices that determine what can be visualized and what will remain hidden. Furthermore, they are also shaped by the broader cultural and historical context in which they are produced.

Unfortunately there is a tendency to downplay the visuality of science maps, in favor of the integrity of the underlying data and the sophistication of transformation algorithms. However, visualizations are “becoming increasingly dependent upon technology, while technology is increasingly becoming imaging and visualization technology” (Pauwels 2008, 83). We expect that this interconnection between data selection, data processing and data visualization will become much stronger in the near future. These connections should therefore be systematically analyzed, while the field develops and experiments with different forms of visual representation.

Science mapping projects do not simply measure and describe scientific developments – they also have a normative potential. Suppose the director of a research institute wants to map the institute’s research landscape in terms of research topics and possible applications, and wants to see how the landscape develops over the next five years. This kind of mapping project, like any other description of reality, is not only descriptive but also performative. In other words, the map that gets created in response to this director’s question also shapes the reality it attempts to represent. One possible consequence of this hypothetical mapping project could be that the director decides on the basis of this visual analysis to focus more on certain underdeveloped research strands, at the expense of or in addition to others. The map that was meant to chart the terrain now becomes embedded in management decision processes. As a result, it plays an active part in a shift in the institute’s research agenda, an agenda that will be mapped in five years’ time with the same analytical means that were originally merely intended to describe the landscape.

A comparable example can actually be found in Börner’s book: a map that shows all National Institutes of Health (NIH) grant awards from a single funding year, giving access to a database and web-based interface. The clusters on the map correspond to broader scientific topics covered in the grants, while the dots correspond to individual grants clustered together by a shared topical focus.

Here, too, it would be informative to analyze the potential role these maps play as policy instruments (for instance, in accountability studies). This type of analysis will be all the more urgent when bibliometric maps are increasingly used for the purposes of research evaluation. The maps created on the basis of bibliometric data do not simply ‘visualize what we know’. They actively shape bibliometric knowledge production, use and dissemination in ways that require careful scrutiny.

Teaching in Madrid

Started my visiting professorship at the Faculty of Library and Information Science, Complutense University in Madrid today with a nice class discussion about research evaluation. Here is the presentation I gave about the role of information science in research evaluation.

International networks start to drive research

Networks of collaborating scientists spanning the globe are increasingly shaping the research landscape. The share of papers co-authored by researchers from different countries is steadily growing. More than one third of all papers are now based on an international collaboration, up from one quarter fifteen years ago. On top of this, these internationally co-authored papers have a higher citation impact. Each foreign partner in a paper increases its potential to be cited, up to a tipping point of approximately 10 countries. The dynamics of these international networks, together with sustained investments in scientific research by an increasing number of countries, produce a much more multipolar world. Not surprisingly, China is rising fast. Ranking countries on the number of scientific papers produced, China is now number 2 with a share of 10 % of the international scientific production. It is expected to become number 1 within a few decades. Brazil and India are also emerging as powerful players on the international scene. But the rise of new scientific centres is not restricted to the BRICS countries. In the Middle East, both Turkey and Iran are investing strongly, resulting in an enormous growth in authors and papers. While Iran published a bit more than 700 papers in 1993, in 2008 this was already more than 13 thousand. Turkey published four times as many papers in 2008 as in 1996 and its number of researchers has grown by 43 %. Still, the current heavyweights dominate the rankings based on citation numbers. With a decreasing share in total publications (down from 26 to 21 %), the United States still attracts the most citations: more than 30 % of all publications cite work originating in the United States. Chinese papers have significantly less impact: with a 10 % share of the papers, the Chinese collect only 3 % of the citations.

These are some of the highlights of the recent report of the Royal Society (UK), “Knowledge, Networks and Nations: Global scientific collaboration in the 21st century”. This report is based on an analysis of all papers in the Scopus database (Elsevier) published between 2004 and 2008, compared with the production between 1993 and 2003. The report combines these findings with five case studies of prominent international research initiatives in health research, physics, and climate research. I think this report is a goldmine of interesting facts and sometimes surprising developments, and a must-read for all science policy actors.

For European Science Policy makers, the report should moreover give pause for reflection. The fast rise of international networks is particularly relevant for Europe because of the rise of anti-immigration parties that currently have a big impact on policy in general, and thereby also on Science Policy. The share of internationally co-authored papers in the European countries is rising, which means that researchers in Europe need to be supported in creating more international collaborations. This simply cannot be combined with an anti-immigration policy focused on blocking the international exchange of scientific personnel. In Europe, very differently from Asia, the general political climate therefore seems to be out of step with developments in the world of science and scholarship. A creative Science Policy requires an open attitude, eager for international exchange of ideas and people, not least with colleagues in Turkey and Iran. And Turkey should become a member of the European Union as soon as possible.

The report also shows nicely that internationalization is not a simple process. Overall, the number of internationally co-authored papers is on the rise. And in the current scientific centres, this goes together with an increase in the share of international papers in the total national scientific production. But in China and Brazil, the share of international papers is decreasing, while the absolute number of internationally co-authored papers is rising. Turkey and Iran show comparable trends, albeit less clearly. The explanation is that in these countries the national research capacity is building up faster than international collaboration is growing.
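The arithmetic behind this seeming paradox is simple. The sketch below uses invented numbers, not figures from the report, to show how the share of international papers can fall while their absolute number rises.

```python
def international_share(international, total):
    """Fraction of a country's papers that are internationally co-authored."""
    return international / total

# Hypothetical paper counts for one country in two periods (not from the report).
earlier = {"international": 2_000, "total": 10_000}
later = {"international": 5_000, "total": 40_000}

share_earlier = international_share(**earlier)
share_later = international_share(**later)

# Growth factors of the numerator and the denominator.
growth_international = later["international"] / earlier["international"]
growth_total = later["total"] / earlier["total"]
```

In this example the international papers grow by a factor of 2.5, but total output grows by a factor of 4, so the share drops from 20 % to 12.5 %. The share falls whenever total national output grows faster than the internationally co-authored part.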

Do it yourself science mapping

Mapping science has become popular lately. These maps show the relationships between scientific articles in various fields of research on the basis of their literature references and citations, or on the basis of the common use of scientific terms in titles or abstracts. Atlases of this type have become possible due to new visualization software tools and the availability of large amounts of data on scientific publications. The main suppliers of scientific information, such as Thomson Reuters and Elsevier, are creating new suites of products to tap into this development. At the same time, we also witness a surge in free software packages and scripts on the internet. These tools make it possible for individual researchers and managers to have a quick look at their fields and see how they relate to other fields. They are, by the way, not suitable for serious formal research evaluation.

At the beginning of December 2010, Science-Metrix, an evaluation company based in Montreal, published two new instruments on the web that can visualize the structure of scientific disciplines. The first of these, the interactive ‘Scientific Journals Ontology Explorer’, visualizes the relationships among 175 scientific disciplines in 18 different languages in a global atlas of science. The second is a journal classification system which shows the relationships among 15 thousand journals. The disciplinary science map is based on a combination of data from the Web of Science (Thomson Reuters) and Scopus (Elsevier). This last product is somewhat similar to SciVal, a product of Elsevier, which also visualizes scientific disciplines, though only on the basis of the citation relationships among journals in the Scopus data set.

An alternative procedure has been followed by two Dutch research institutes: the Rathenau Institute and our CWTS. The maps of Science-Metrix and Elsevier are finished products and therefore static. The user can view them, but they cannot be changed. The Dutch institutes have each created tools with which researchers can make their own maps, on the basis of data sets they select themselves. This involves more work, but the end result is more transparent. The Rathenau Institute (Science Assessment Department) has created SAINT, a set of scripts to download Web of Science data and make them suitable for processing by visualization software. The scripts consist of a parser of bibliographic data, a splitter which extracts words from sentences, and a script that converts tables into matrices for visualization. This enables science mapping on the basis of citation relationships among publications, the co-occurrence of words in titles or abstracts, or both.
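To give an impression of what such a three-step pipeline involves, here is a rough sketch in Python. It is not the SAINT code: the simplified record format (a `TI` title line followed by an `ER` end-of-record line), the stopword list and all function names are assumptions made for this illustration.

```python
import re
from itertools import combinations

# Hypothetical records in a simplified tagged format (assumed for this sketch).
RAW = """\
TI Mapping science with citation networks
ER
TI Citation networks and science policy
ER
TI Visualizing science policy
ER
"""

STOPWORDS = {"with", "and", "of", "the", "in"}

def parse_records(raw):
    """Parser: collect the title of each record that is closed by an ER line."""
    titles, current = [], None
    for line in raw.splitlines():
        if line.startswith("TI "):
            current = line[3:].strip()
        elif line.strip() == "ER" and current is not None:
            titles.append(current)
            current = None
    return titles

def split_words(title):
    """Splitter: the set of lowercase words in a title, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS}

def cooccurrence_matrix(titles):
    """Convert the words-per-title table into a symmetric co-occurrence matrix."""
    word_sets = [split_words(t) for t in titles]
    vocab = sorted(set().union(*word_sets))
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for words in word_sets:
        for a, b in combinations(sorted(words), 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix

titles = parse_records(RAW)
vocab, matrix = cooccurrence_matrix(titles)
```

The resulting matrix counts, for every pair of words, the number of titles in which both occur. A matrix of this kind is the sort of network data a visualization tool can take as input.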

The CWTS tool starts where the Rathenau tool ends. In January, CWTS published a new version of the VOSviewer. This is a visualization tool in which the interactive use of maps is central. The package also makes it easier to present dynamic developments, such as the emergence of new disciplines. The VOSviewer has so far been used mostly to map the relationships among fields of research, but in fact it can be used to visualize any set of network data. The SAINT matrix, for example, can be used as an input file.

The mapping of science is not a new activity. The Belgian scholar Paul Otlet was one of the first to make maps of science, in the early twentieth century. Our colleague Katy Börner at Indiana University has played an important role in giving the field a new boost with her project “Atlas of Science” and the related exhibitions. The project has not only produced a beautiful coffee table book, but also convincingly shows the lay person the variety of maps that are now possible.

This is the English version of a Dutch article published in Onderzoek Nederland.
