Science mapping: do we know what we visualize?

A recent landmark in the field of science mapping is Katy Börner’s Atlas of Science: Visualizing What We Know (MIT Press, 2010). The atlas recently won the ASIS&T Best Information Science Book Award 2011. The kinds of maps covered by the atlas range from historical timelines, network diagrams and citation networks revealing rises in patent citations, to geographic maps, taxonomic hierarchies and maps of the relative sizes and connectedness of scientific fields.

The advent of science mapping depends to a large extent on digitized indices of scholarly activity such as the Science Citation Index, and on advances in network analysis and visualization techniques. Bibliometric maps of scholarly activity are mostly based on bibliographic coupling, co-citation analysis or keyword co-occurrence networks. The resulting visualizations are transformations of quantified data into visual form. The avalanche of bibliometric data incorporated in massive databases demands new visualization tools and, crucially, the skills to understand and engage with these new kinds of visualizations.
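To make the two citation-based measures concrete, here is a minimal sketch in Python. The paper and reference labels are hypothetical toy data, and the function names are illustrative rather than taken from any mapping package. Co-citation counts how often two cited documents appear together in one reference list; bibliographic coupling counts how many references two citing papers share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of references it cites.
citations = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C", "D"},
}


def co_citation_counts(citations):
    """Count how often each pair of cited documents shares a reference list."""
    counts = Counter()
    for refs in citations.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts


def bibliographic_coupling(citations):
    """Count the references shared by each pair of citing papers."""
    strengths = {}
    for p, q in combinations(sorted(citations), 2):
        strengths[(p, q)] = len(citations[p] & citations[q])
    return strengths


cocit = co_citation_counts(citations)
coupling = bibliographic_coupling(citations)
```

In this toy data, A and B are co-cited twice (by P1 and P2), while P2 and P3 are coupled only through their single shared reference B. Either table of pair strengths could serve as the edge weights of a map, which is already a first choice made before anything is drawn.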

Most bibliometric mapping endeavors radiate an ambition on the part of the scientist(s) producing these maps to be synthetic, comprehensive and definitive. Börner’s Atlas of Science, for instance, is said to chart “the trajectory from scientific concept to published results,” revealing “the landscape of what we know.” However, maps are not a direct reflection of reality: all sorts of decisions are taken to process the data before they can be presented. While this may seem a matter of course, it does have consequences for the interpretation and use of these maps.

For example, what often gets glossed over in these endeavors is that visualizations of scientific developments also prescribe how these developments should be known in the first place. Science maps are produced by particular statistical algorithms, which might have been chosen otherwise, performing calculations on large amounts of ‘raw’ data, and for this reason they are not simply ‘statistical information presented visually’. The choice of a particular kind of visualization is often connected to the specificities and meaning of the underlying dataset and the software used to process the data. Several software packages have been specifically designed for this purpose (VOSviewer, supported by CWTS, being one of them). These packages prescribe how the data should be handled. Different choices in the selection and processing of the data will lead to sometimes strikingly different maps. Therefore, we will increasingly need systematic experiments and studies with different forms of visual presentation (Tufte, 2006).
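A small sketch can illustrate how a single processing choice changes what a map would emphasize. The keyword counts below are hypothetical, and cosine normalization is used only as one common option among several; this is not a description of how any particular package handles the data.

```python
import math

# Hypothetical counts: in how many documents each keyword occurs.
occurrences = {"citation": 200, "network": 50, "altmetrics": 5}

# Hypothetical counts: in how many documents each keyword pair occurs together.
cooccurrences = {
    ("citation", "network"): 20,
    ("altmetrics", "network"): 4,
}


def cosine_normalized(cooccurrences, occurrences):
    """Divide each co-occurrence count by the geometric mean of the two keyword frequencies."""
    return {
        (a, b): count / math.sqrt(occurrences[a] * occurrences[b])
        for (a, b), count in cooccurrences.items()
    }


def strongest_link(weights):
    """Return the keyword pair with the largest weight."""
    return max(weights, key=weights.get)


normalized = cosine_normalized(cooccurrences, occurrences)
raw_top = strongest_link(cooccurrences)
normalized_top = strongest_link(normalized)
```

With raw counts the strongest link is citation and network (20 co-occurrences). After normalization the rare keyword altmetrics has the stronger tie to network (about 0.25 against 0.2), so a map drawn from the normalized weights would pull a different pair together. Neither version is the ‘correct’ one; each encodes a decision about what closeness means.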

At the same time, a number of interfaces are built into the mapping process, where an encounter takes place with a user who approaches these visualizations as evidence. But how do these users actually behave? To our knowledge, hardly any systematic research has been done on how users (bibliometricians, computer scientists, institute directors, policy makers and their staff, etc.) engage with these visualizations, and on which skills and strategies they need to do so. Critical scrutiny is needed of the degree of ‘visual literacy’ (Pauwels, 2008) demanded of users who want to work with and critically examine these visualizations. The visualizations embody technical or formal choices that determine what can be visualized and what will remain hidden. Furthermore, they are also shaped by the broader cultural and historical context in which they are produced.

Unfortunately, there is a tendency to downplay the visuality of science maps in favor of the integrity of the underlying data and the sophistication of transformation algorithms. However, visualizations are “becoming increasingly dependent upon technology, while technology is increasingly becoming imaging and visualization technology” (Pauwels 2008, 83). We expect that this interconnection between data selection, data processing and data visualization will become much stronger in the near future. These connections should therefore be systematically analyzed while the field develops and experiments with different forms of visual representation.

Science mapping projects do not simply measure and describe scientific developments; they also have a normative potential. Suppose the director of a research institute wants to map the institute’s research landscape in terms of research topics and possible applications, and wants to see how the landscape develops over the next five years. This kind of mapping project, like any other description of reality, is not only descriptive but also performative. In other words, the map that gets created in response to this director’s question also shapes the reality it attempts to represent. One possible consequence of this hypothetical mapping project could be that the director decides, on the basis of this visual analysis, to focus more on certain underdeveloped research strands, at the expense of or in addition to others. The map that was meant to chart the terrain now becomes embedded in management decision processes. As a result, it plays an active part in a shift in the institute’s research agenda, an agenda that will be mapped in five years’ time with the same analytical means that were originally merely intended to describe the landscape.

A comparable example can actually be found in Börner’s book: a map that shows all National Institutes of Health (NIH) grant awards from a single funding year, giving access to a database and web-based interface. The clusters on the map correspond to broader scientific topics covered in the grants, while the dots correspond to individual grants clustered together by a shared topical focus.

Here, too, it would be informative to analyze the potential role these maps play as policy instruments (for instance, in accountability studies). This type of analysis will be all the more urgent when bibliometric maps are increasingly used for the purposes of research evaluation. The maps created on the basis of bibliometric data do not simply ‘visualize what we know’. They actively shape bibliometric knowledge production, use and dissemination in ways that require careful scrutiny.


One Response to “Science mapping: do we know what we visualize?”

  1. Andrea Scharnhorst Says:

    As recent experiments with the delineation of communities around topics show, the h-index community in LIS journals also behaves a bit differently from other communities, forming a clique and ‘seeing’ other LIS communities as one unstructured ‘outside’ rather than as other separable communities. This is best discussed further with the authors (Frank Havemann, Michael Heinz, Alexander Struck, Jochen Gläser) of a paper on Identification of Overlapping Communities by Locally Calculating Community-Changing Resolution Levels. See also the project page of the interesting project “Measuring diversity of research”.
