
Measuring the interdisciplinarity of research topics


STI 2018 Conference Proceedings

Proceedings of the 23rd International Conference on Science and Technology Indicators

All papers published in this conference proceedings have been peer reviewed through a peer review process administered by the proceedings Editors. Reviews were conducted by expert referees to the professional and scientific standards expected of a conference proceedings.

Chair of the Conference: Paul Wouters

Scientific Editors: Rodrigo Costas, Thomas Franssen, Alfredo Yegros-Yegros

Layout: Andrea Reyes Elizondo, Suze van der Luijt-Jansen

The articles of this collection can be accessed at https://hdl.handle.net/1887/64521

ISBN: 978-90-9031204-0

© of the text: the authors

© 2018 Centre for Science and Technology Studies (CWTS), Leiden University, The Netherlands

This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License


Qi Wang* and Per Ahlgren**

*qiwang@kth.se

KTH Library, KTH Royal Institute of Technology, Osquars backe 25, Stockholm, 100 44 (Sweden)

** perahl@kth.se

KTH Library, KTH Royal Institute of Technology, Osquars backe 25, Stockholm, 100 44 (Sweden)

Introduction

Measuring interdisciplinarity is an essential but challenging task. Previous explorations of ways to assess the degree of interdisciplinarity have been conducted at different levels of analysis, for instance, at the level of single papers (e.g., Uzzi et al., 2013; Wang et al., 2015; Yegros-Yegros et al., 2015), individual researchers (e.g., Gowanlock & Gazan, 2013), journals (e.g., Zhang et al., 2015; Rodríguez, 2017), and countries (e.g., Digital Science, 2016). Apart from these studies, there have also been attempts to measure interdisciplinarity at the level of research topics; for instance, Rafols and Meyer (2010) analyze molecular motors using diversity and network coherence indicators, and Morooka et al. (2014) investigate the structure of interdisciplinarity in Japanese rice research and technology. A more recent study by Mugabushaka et al. (2016) replicates a case study conducted by Rafols and Meyer (2010) using the Leinster-Cobbold diversity index.

However, to our knowledge, most analyses regarding the measurement of the interdisciplinarity of research topics² have been limited to one or a few predefined topics. Studies of interdisciplinarity with regard to the entire science system at the level of topics are rarely performed. The present study aims to address this limitation by identifying interdisciplinary topics from the entire science system. As suggested by Wagner et al. (2011), explorations of the clusters constructed by publications using citations or term relations can capture “emerging developments that do not fit into existing categories” (p. 23). Therefore, explorations at the level of topics, taken as algorithmically generated clusters, can offer insight into emerging interdisciplinary topics.

Before moving on to the next section, interdisciplinary research should be defined. In this study, we use the definition put forward by the Committee on Facilitating Interdisciplinary Research and Committee on Science (CFIRCS) (2005):

[i]nterdisciplinary research (IDR) is a mode of research by teams or individuals that integrates information, data, techniques, tools, perspectives, concepts, and/or theories from two or more disciplines or bodies of specialized knowledge to advance fundamental understanding or to solve problems whose solutions are beyond the scope of a single discipline or area of research practice. (p. 2)

1 This work is based on a manuscript appearing in the doctoral dissertation of Qi Wang. The authors would like to thank Ludo Waltman and Ismael Rafols for valuable comments on that manuscript.

2 In the remainder of this paper, we use the term “topic” instead of “research topic”.

STI Conference 2018 · Leiden

In this work, and with respect to topics, we try to capture the indicated concept of interdisciplinarity by using Web of Science (WoS) subject categories (SCs) as disciplines, as well as citation relations between these categories. For instance, if a topic contains publications in the two distant WoS SCs “Radiology Nuclear Medicine & Medical Imaging” and “Environmental Sciences”, and the topic publications assigned to the former category are closely related in terms of citations to the topic publications assigned to the latter, then the topic can be considered as interdisciplinary.

Data and methods

With the aim of obtaining clusters corresponding to topics, we used an algorithmically generated (hierarchical) publication-level classification constructed at CWTS (Leiden University) by application of a methodology proposed by Waltman and Van Eck (2012; 2013).

Publications are clustered on the basis of direct citation links between them, and the clustering technique used is similar to modularity-based clustering (Newman 2004a, 2004b). Content labels for a cluster are obtained from titles and abstracts of the publications belonging to the cluster. The classification contains about nine million publications and 9,565 clusters. The publications are recorded in WoS, of the document types Article or Review, published within the 10-year period 2002-2013, appear in 15,483 different journals, and are assigned to 250 different WoS SCs. A similar classification has been used in earlier research in order to identify topics within nanocellulose research (Milanez et al., 2016).

We consider two WoS SCs to be related if their citing profiles are similar. In view of this, we constructed a symmetric matrix S with category-category similarities ($s_{ij}$ values) based on citation relations. In the construction process, we first obtained, for each pair of a citing category i and a cited category j, the number of citations from publications assigned to i to publications assigned to j, $c_{ij}$. This gave rise to a square, non-symmetric matrix C, populated by the $c_{ij}$ values. A row i in C gives the citing profile of category i. A fractional approach was used to obtain the $c_{ij}$ values, since a journal, and thereby its publications, might be assigned to multiple categories. For instance, if a publication assigned to two categories cites a publication assigned to three categories, each of the six combinations of citing and cited category was assigned the weight 1/(2 × 3) = 1/6.
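The fractional counting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the input format (a list of citing/cited publication pairs plus a publication-to-categories mapping), and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product


def fractional_citation_counts(citations, categories):
    """Accumulate fractional category-to-category citation counts c_ij.

    citations:  iterable of (citing_pub, cited_pub) pairs (hypothetical format)
    categories: dict mapping a publication id to the list of its categories
    """
    c = defaultdict(float)
    for citing, cited in citations:
        citing_cats = categories[citing]
        cited_cats = categories[cited]
        # Each citation contributes a total weight of 1, split evenly over
        # all combinations of citing and cited category.
        weight = 1.0 / (len(citing_cats) * len(cited_cats))
        for i, j in product(citing_cats, cited_cats):
            c[(i, j)] += weight
    return dict(c)


# Toy data: p1 is assigned to two categories and cites p2, assigned to three.
categories = {"p1": ["A", "B"], "p2": ["C", "D", "E"]}
counts = fractional_citation_counts([("p1", "p2")], categories)
```

In this toy case each of the six category pairs receives the weight 1/6, so the single citation still counts as one citation in total.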

With C at hand, S was constructed by applying the cosine measure (Salton & McGill, 1983) to the pairs of rows (i.e., vectors) in C. Figure 1 illustrates how S was constructed. The upper left part corresponds to C, whereas the lower right part corresponds to S. Finally, we generated a dissimilarity matrix D from S by subtracting the $s_{ij}$ values of S from 1, since category-category dissimilarities were needed for the proposed interdisciplinarity indicator (cf. next subsection). We let $d_{ij}$ denote the dissimilarity value for the categories i and j.
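As a rough illustration of the step from C to S and D, the following Python sketch applies the cosine measure to the rows of a small matrix and subtracts the result from 1. The 3 × 3 matrix is invented toy data, the function names are hypothetical, and the handling of an all-zero row is an assumption that the paper does not discuss.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an empty citing profile is similar to nothing
    return dot / (norm_u * norm_v)


def similarity_and_dissimilarity(C):
    """Return (S, D), where s_ij is the cosine of rows i and j and d_ij = 1 - s_ij."""
    m = len(C)
    S = [[cosine(C[i], C[j]) for j in range(m)] for i in range(m)]
    D = [[1.0 - S[i][j] for j in range(m)] for i in range(m)]
    return S, D


# Toy citing profiles for three categories (rows: citing, columns: cited).
C = [
    [10.0, 2.0, 0.0],
    [3.0, 8.0, 1.0],
    [0.0, 1.0, 5.0],
]
S, D = similarity_and_dissimilarity(C)
```

Because citation counts are non-negative, the cosine values lie between 0 and 1, and so do the dissimilarities, up to floating-point rounding.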


Figure 1: Illustration of the construction of S.

Measuring the interdisciplinarity of a topic

Our definition of the interdisciplinarity of topics, which in this study are regarded as clusters of the used CWTS classification, takes into account the degree of dissimilarity regarding the pairs of WoS SCs involved in the publications of a topic, and the strength of the within-topic citation relations for these pairs. Note that our approach can be applied in any situation in which two classifications of the same set of objects are given. We define the interdisciplinarity of a topic T, I(T), as follows:

$$I(T) = \frac{1}{c^T} \sum_{i=1}^{m} \sum_{j=1}^{m} c^T_{ij} \, d_{ij}, \tag{1}$$

where m = 250 is the number of WoS SCs with respect to the used CWTS classification, $c^T_{ij}$ is the number of citations from publications in T assigned to category i to publications in T assigned to category j, and $c^T$ is the sum of the $c^T_{ij}$ values. The indicator, which theoretically takes values on the interval [0, 1] and which resembles the Rao-Stirling diversity index (e.g., Porter & Rafols, 2009), can be considered as the weighted average of the dissimilarity values $d_{ij}$, where the weights are the strengths of the corresponding within-topic citation flows. Note that the contribution to the indicator value of a pair (i, i) is zero, since $d_{ii}$ is zero, and that $d_{ij} = d_{ji}$.

Topic filtering

The number of topics (clusters) of the used classification is, as mentioned above, 9,565. On average, a topic contains 956 publications. The largest topic contains 10,744 publications, whereas the smallest topic contains one (1) publication. Clusters with only a few publications are not meaningful for this analysis, and we therefore excluded all topics with fewer than 100 publications.


The topics included in the analysis need to be coherent. In order to select coherent topics, we divided the number of citations to the publications in a topic by the number of publications in the topic. If the resulting average was less than 2, we considered the topic publications to be loosely connected, and excluded the topic from the analysis. The rationale behind this approach relates to the fact that topics are created on the basis of direct citations between publications.

As reported in Table 1, the number of topics was reduced from 9,565 to 7,864 by the two-step filtering process described above.
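The two-step filter can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the input format (a publication count and a citation count per topic), and the toy numbers are hypothetical, and the paper does not specify exactly how the citations per topic were counted.

```python
def filter_topics(topics, min_pubs=100, min_avg_citations=2.0):
    """Keep topics that are large enough and sufficiently connected.

    topics: dict mapping a topic id to (n_pubs, n_citations); hypothetical format
    """
    kept = {}
    for topic_id, (n_pubs, n_citations) in topics.items():
        # Step 1: drop topics with fewer than `min_pubs` publications.
        if n_pubs < min_pubs:
            continue
        # Step 2: drop topics whose average number of citations per
        # publication is below `min_avg_citations`.
        if n_citations / n_pubs < min_avg_citations:
            continue
        kept[topic_id] = (n_pubs, n_citations)
    return kept


# Invented toy topics, not data from the study.
topics = {
    "t1": (565, 2400),  # large and well connected: kept
    "t2": (40, 300),    # fewer than 100 publications: dropped
    "t3": (150, 200),   # average below 2 citations per publication: dropped
}
kept = filter_topics(topics)
```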

Table 1. Descriptive statistics for publications and topics, before and after filtering.

                                    Before filtering   After filtering
No. of publications                 9,146,302          8,930,360
No. of topics                       9,565              7,864
Avg. no. of publications per topic  956                1,135
Max. no. of publications per topic  10,744             10,744
Min. no. of publications per topic  1                  100

Robustness of the matrix S

Figure 2 provides the distribution of pairs of WoS SCs over the similarity values appearing in the matrix S. Clearly, the distribution is highly positively skewed, with a large proportion of the pairs having small values.

Figure 2: Distribution of pairs of WoS SCs over similarity values.

We tested the robustness of the similarity values of S by comparing them to similarity values obtained from applying the cosine measure to the columns of the matrix C. That is, we used the cited profiles of the categories instead of the citing ones. It turned out that the correlation between the two sets of similarity values was high (r = 0.74), and we therefore consider S to be robust.
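A small Python sketch of such a robustness check is given below. It assumes that r denotes the Pearson correlation and that each unordered pair of distinct categories enters the comparison once; neither detail is stated in the text. The 4 × 4 matrix is invented toy data and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_similarities(vectors):
    """Cosine similarity for every unordered pair of distinct vectors."""
    m = len(vectors)
    return [cosine(vectors[i], vectors[j])
            for i in range(m) for j in range(i + 1, m)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy citation matrix C (rows: citing category, columns: cited category).
C = [
    [10.0, 2.0, 0.0, 1.0],
    [3.0, 8.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 3.0, 9.0],
]
columns = [list(col) for col in zip(*C)]

citing_sims = pairwise_similarities(C)       # based on citing profiles (rows)
cited_sims = pairwise_similarities(columns)  # based on cited profiles (columns)
r = pearson(citing_sims, cited_sims)
```

A value of r close to 1 would indicate that citing and cited profiles lead to nearly the same similarity structure.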


Results

In this section, we first briefly describe the distribution of topics over interdisciplinarity values, and then put forward the results of a case study, which was performed for validation purposes.

The mean interdisciplinarity value, i.e., the mean I value, across the 7,864 topics is 0.42 with a standard deviation of 0.11. In Figure 3, the corresponding distribution is shown. As can be seen in the figure, the majority of topics have I values between 0.35 and 0.55.

Figure 3: Distribution of topics over interdisciplinarity values.

We note that there is no clear cut-off between interdisciplinary and non-interdisciplinary topics.

For the case study, we regarded topics with I values above the 99th percentile (0.61) as interdisciplinary topics.
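To make the indicator of Eq. (1) and the cut-off more concrete, the following Python sketch computes I for one toy topic and then selects the highest-ranked 1% of a set of toy I values. All numbers are invented, the function names are hypothetical, and the rank-based selection is only one possible reading of "above the 99th percentile"; the paper does not state which percentile definition was used.

```python
def interdisciplinarity(c_T, D):
    """I(T): average of the d_ij values weighted by within-topic citations c^T_ij."""
    m = len(D)
    total = sum(c_T[i][j] for i in range(m) for j in range(m))
    if total == 0:
        return 0.0  # assumption: a topic without within-topic citations gets I = 0
    weighted = sum(c_T[i][j] * D[i][j] for i in range(m) for j in range(m))
    return weighted / total


def top_share(i_values, share=0.01):
    """Return the ids of the top `share` fraction of topics, ranked by I value."""
    ranked = sorted(i_values, key=i_values.get, reverse=True)
    k = max(1, int(len(ranked) * share))
    return ranked[:k]


# Toy dissimilarity matrix for three categories (symmetric, zero diagonal).
D = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
]
# Toy within-topic citation counts (rows: citing category, columns: cited).
c_T = [
    [6, 2, 0],
    [1, 0, 1],
    [0, 0, 0],
]
i_value = interdisciplinarity(c_T, D)  # (2*0.2 + 1*0.2 + 1*0.8) / 10

# Toy I values for 200 hypothetical topics; the top 1% are the two highest.
toy_values = {f"t{n}": n / 200 for n in range(200)}
flagged = top_share(toy_values)
```

In the toy topic, the six citations within the first category contribute nothing because $d_{ii}$ is zero, which is why the value stays low even though a distant category pair is involved.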

The case study concerns the WoS SC “Information Science & Library Science”. The reason that we selected this category is that we have expert knowledge in information science. We first identified topics, among the ones considered as interdisciplinary, with more than 10% of their publications assigned to this category, and then selected the topic with the highest I value. This topic, with ID 4982, is ranked 72 when the topics are ranked according to I values.

Table 2 provides descriptive information on topic 4982. The topic contains 565 publications. The table reports first author, title and full citation count for the two most frequently cited publications in the topic, and the three most frequently occurring WoS SCs for the topic. The most frequently occurring category is “Computer Science, Information Systems”, with “Business” ranked second.

Table 2. Descriptive information on topic 4982.

ID                        4982
Total no. of pubs         565
Rank                      72
WoS SC and no. of pubs    Computer Science, Information Systems (141);
                          Business (108);
                          Information Science & Library Science (107)
First author, title       Malhotra N.K. et al. (2004). Internet users' information privacy concerns (IUIPC): The construct, the scale, and a causal model (169);
and full citation count   Nissenbaum H. (2004). Privacy as contextual integrity (110)

The network map of Figure 4 was obtained in the following way.³ First, a baseline map adopted from Zahedi and Van Eck (2014) was used. In this map, the nodes correspond to WoS SCs. Nodes with a high association strength (the node-node similarity measure implemented in VOSviewer), based on citation relations, are pulled towards each other, whereas nodes with a low strength are pushed away from each other (Waltman et al., 2010). Further, the nodes are clustered, based on the association strength values, by a technique similar to modularity-based clustering. Nodes belonging to the same cluster have the same color. Next, the baseline map was overlaid with our within-topic data. This yielded nodes with sizes proportional to the number of publications (fractional counts) assigned to the corresponding category with regard to the topic, and links, restricted to node pairs corresponding to two categories present in the topic, with weights equal to, for categories i and j, the sum of the $c^T_{ij}$ and $c^T_{ji}$ values (cf. the section on data and methods).

The importance of the categories “Computer Science, Information Systems” and “Business” (Table 2) for topic 4982 is reflected in the map of Figure 4. The map shows that the topic covers disciplines like ethics, ergonomics, law, and psychology and that these disciplines are related, in terms of citations, to computer science and business. Moreover, the computer science fields, clustered together, appear on the map quite far from, for instance, law and psychology, also clustered together. This means that the latter are fairly dissimilar to the former according to association strength. The titles of the two most frequently cited publications within the topic (Table 2) suggest that information privacy is treated by publications in the topic. This was confirmed by browsing the titles of the publications. Our general understanding is that research on information privacy combines ideas, notions, and theories from information science, computer technology, and social sciences (like law and psychology). The map of Figure 4 is in line with this understanding.

To further validate the interdisciplinarity of topic 4982, we searched for courses on information privacy in MIT OpenCourseWare, using “information privacy” as the search term. Among the retrieved course titles were “The Economics of Information, Communications and Information Policy” and “Biomedical Computing, Information and Entropy”. This further demonstrates, to some extent, that information privacy is interdisciplinary in character.

3 We used the program VOSviewer (Van Eck & Waltman, 2010) for network visualization.


Figure 4: Map of cluster 4982.

Conclusions

This study measures the interdisciplinarity of research at the level of topics by proposing an indicator designed on the basis of the definition of interdisciplinary research suggested by the CFIRCS. The major contribution of this work is that application of its methodology might provide insight into current interdisciplinary topics. Interdisciplinary practices are unlikely to be confined to a particular journal or a predefined category. Generally, it is not very meaningful to talk about research contributions being made to the advancement of knowledge at the level of disciplines (Chubin, 1973). Hence, it is important to explore interdisciplinarity at lower levels of science than the discipline level.

We need to mention four limitations of our study. Firstly, due to space restrictions, only one validation case study was performed. More such studies are clearly needed. Secondly, with respect to our validation case study and potential future studies of this kind, the interdisciplinarity of a topic should ideally be characterized in a way that is more independent of our definition of topic interdisciplinarity. For instance, the associations between the WoS SCs in a base map could be based on textual relations instead of citation relations. Thirdly, the dissimilarity component of the interdisciplinarity indicator I (Eq. 1) is based on fractional counts of citations between WoS SCs, whereas the within-topic similarity component is based on full counts. This inconsistency is not optimal. Finally, the accuracy of the WoS journal classification scheme can be questioned (Wang & Waltman, 2016). Therefore, studying interdisciplinarity of topics without the use of any classification scheme is an appealing idea.

For future research on the interdisciplinarity of topics, we would like to apply a more sophisticated approach to topic identification, recently suggested by Sjögårde and Ahlgren (2018). We also intend to address the mentioned limitations of the study.

References

Chubin, D. E. (1973). On the use of the “Science Citation Index” in sociology. The American Sociologist, 8(4), 187-191.


Committee on Facilitating Interdisciplinary Research, National Academies of Sciences, National Academies of Engineering, Institute of Medicine. (2005). Facilitating Interdisciplinary Research. Washington: National Academies Press.

Digital Science (2016). Interdisciplinary research: methodologies for identification and assessment. Available online: https://www.mrc.ac.uk/documents/pdf/assessment-of-interdisciplinary-research/

Gowanlock, M., & Gazan, R. (2013). Assessing researcher interdisciplinarity: A case study of the University of Hawaii NASA Astrobiology Institute. Scientometrics, 94(1), 133-161.

Milanez, D. H., Noyons, E., & de Faria, L. I. L. (2016). A delineating procedure to retrieve relevant publication data in research areas: the case of nanocellulose. Scientometrics, 107(2), 627-643.

Morooka, K., Ramos, M. M., & Nathaniel, F. N. (2014). A bibliometric approach to interdisciplinarity in Japanese rice research and technology development. Scientometrics, 98(1), 73-98.

Mugabushaka, A. M., Kyriakou, A., & Papazoglou, T. (2016). Bibliometric indicators of interdisciplinarity: the potential of the Leinster–Cobbold diversity indices to study disciplinary diversity. Scientometrics, 107(2), 593-607.

Newman, M.E.J. (2004a). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133.

Newman, M.E.J. (2004b). Analysis of weighted networks. Physical Review E, 70(5), 056131.

Porter, A., & Rafols, I. (2009). Is science becoming more interdisciplinary? Measuring and mapping six research fields over time. Scientometrics, 81(3), 719-745.

Rafols, I., & Meyer, M. (2010). Diversity and network coherence as indicators of interdisciplinarity: case studies in bionanoscience. Scientometrics, 82(2), 263-287.

Rodríguez, J. M. (2017). Disciplinarity and interdisciplinarity in citation and reference dimensions: knowledge importation and exportation taxonomy of journals. Scientometrics, 110(2), 617-642.

Salton, G., & McGill, M.J. (1983). Introduction to Modern Information Retrieval. New York: McGraw-Hill, Inc.

Sjögårde, P., & Ahlgren, P. (2018). Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics. Journal of Informetrics, 12(1), 133-152.

Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013). Atypical combinations and scientific impact. Science, 342(6157), 468-472.

Van Eck, N.J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523-538.

Wagner, C.S., Roessner, J.D., Bobb, K., Klein, J.T., Boyack, K.W., Keyton, J., Rafols, I., & Börner, K. (2011). Approaches to understanding and measuring interdisciplinary scientific research (IDR): A review of the literature. Journal of Informetrics, 5(1), 14-26.

Waltman, L., & Van Eck, N.J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the American Society for Information Science and Technology, 63(12), 2378-2392.

Waltman, L., & Van Eck, N.J. (2013). A smart local moving algorithm for large-scale modularity-based community detection. European Physical Journal B, 86, 471.

Waltman, L., Van Eck, N. J., & Noyons, E. C. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4), 629-635.

Wang, J., Thijs, B., & Glänzel, W. (2015). Interdisciplinarity and impact: Distinct effects of variety, balance, and disparity. PLOS ONE, 10(5), e0127298.

Wang, Q. & Waltman, L. (2016). Large-scale comparison between the journal classification systems of Web of Science and Scopus. Journal of Informetrics, 10(2), 347-364.

Yegros-Yegros, A., Rafols, I., & D’Este, P. (2015). Does interdisciplinary research lead to higher citation impact? The different effect of proximal and distal interdisciplinarity. PLOS ONE, 10(8), e0135095.

Zahedi, Z., & Van Eck, N. J. (2014). Visualizing readership activity of Mendeley users using VOSviewer. In altmetrics14: Expanding impacts and metrics, Workshop at Web Science Conference.

Zhang, L., Rousseau, R., & Glänzel, W. (2015). Diversity of references as an indicator of the interdisciplinarity of journals: Taking similarity between subject fields into account. Journal of the Association for Information Science and Technology, 67(5), 1257-1265.
