Complexity / 2020 / Article
Research Article | Open Access

Volume 2020 | Article ID 3154619 | https://doi.org/10.1155/2020/3154619

Jian Zhou, Lin Feng, Ning Cai, Jie Yang, "Modeling and Simulation Analysis of Journal Impact Factor Dynamics Based on Submission and Citation Rules", Complexity, vol. 2020, Article ID 3154619, 17 pages, 2020. https://doi.org/10.1155/2020/3154619

Modeling and Simulation Analysis of Journal Impact Factor Dynamics Based on Submission and Citation Rules

Academic Editor: Michele Scarpiniti
Received 13 Nov 2019
Revised 12 Apr 2020
Accepted 20 Jun 2020
Published 26 Aug 2020

Abstract

The variation of the journal impact factor is affected by many statistical and sociological factors, such as the size of the citation window and differences between subjects. In this work, we develop an impact factor dynamics model based on a parallel system, which can be used to analyze the correlation between the impact factor and certain elements. The parallel model aims to simulate, in a distributed manner, the submission and citation behaviors of the papers in journals belonging to a similar subject. We perform Monte Carlo simulations to show how the model parameters influence the impact factor dynamics. Through extensive simulations, we reveal the important role that certain statistical elements and behaviors play in shaping impact factors. The experimental results and analysis on actual data demonstrate that the value of the JIF is comprehensively influenced by the average review time, the average number of references, and the aging distribution of citations.

1. Introduction

Academic impact assessment and scientific journal ranking have always been a hot topic and play a key role in the dissemination and development of academic research [1–3]. As one of the most important evaluation indicators in SCI, the journal impact factor (JIF) is calculated by the scientific division of Clarivate Analytics® and is commonly used to rank and evaluate the grades of various scientific journals in the Journal Citation Reports® (JCR) database [4]. Since the introduction of the JIF, a growing stream of studies has discussed its mechanism, characteristics, and applications, as well as its limitations and misuses, particularly in recent years. The JIF was originally aimed at evaluating scientific journals, but it is now increasingly used to assess research and to guide the publishing strategies of researchers and institutions. In this respect, the JIF has gradually become an important indicator of the quality or reputation of a journal [5, 6]. Some publishers, for example, treat impact factor values as an indirect marketing tool for selling their journals. In [6], Larivière and Gingras argued that the JIF not only reflects the “quality” of a paper but also represents the reputation of the journal in which it is published, because the probability of a paper being cited is also significantly affected by the impact factor of the journal. In addition, the JIF is often an important reference for authors when they are preparing a submission [7]. A major weakness of the JIF, however, is that its two-year citation window is considered too short to reflect the real academic impact of publications in some “slow” disciplines [8, 9]. The concept of the JIF was first introduced by Garfield in 1955 [10], and its calculation can be formulated as follows:

$$\mathrm{JIF}_k = \frac{C_{k-1} + C_{k-2}}{N_{k-1} + N_{k-2}}, \quad (1)$$

where $\mathrm{JIF}_k$ denotes the JIF of the $k$th year; $N_i$ denotes the number of papers published in the $i$th year; and $C_i$ denotes the number of citations received during the $k$th year by the papers published in the $i$th year.
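For concreteness, equation (1) translates directly into code. The following Python sketch is illustrative only (the authors' implementation is in MATLAB, linked in Section 2); it computes the JIF of a year from yearly publication and citation counts.

```python
# Equation (1) as code: JIF of year k from publication counts N_i and
# citation counts C_i (citations received during year k by papers of year i).
def journal_impact_factor(papers_by_year, citations_by_year, k):
    published = papers_by_year[k - 1] + papers_by_year[k - 2]    # N_{k-1} + N_{k-2}
    cited = citations_by_year[k - 1] + citations_by_year[k - 2]  # C_{k-1} + C_{k-2}
    return cited / published if published else 0.0

# Example: 40 and 38 papers in the two preceding years, cited 70 and 95
# times, respectively, during year k = 2020.
papers = {2018: 40, 2019: 38}
citations = {2018: 70, 2019: 95}
print(journal_impact_factor(papers, citations, 2020))  # 165 / 78 = approx. 2.115
```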

According to the definition of the JIF, the value of the JIF in the $k$th year is computed from $N_i$ and $C_i$ in the $(k-1)$th and $(k-2)$th years. In reality, however, the variation of the JIF is affected by many statistical and sociological factors. For instance, the value of the JIF is strongly influenced by the research field of the journal and by the type of the journal (full papers, letters, or reviews) [11, 12]. The JIF is calculated via the number of references; thus, it can also be raised by increasing the number of references [12–14]. In [15], Zhang and Van Poucke suggested that journals with short publication delays tend to receive higher impact factors. Yu et al. developed a transfer function model to simulate the distributed citation process [16]. However, such a transfer function model does not consider the differences of citation behavior across disciplines. Although some researchers pay great attention to the significant difference in the levels of impact factors over different disciplines, they often neglect how the results are affected by the process of submission and citation [9, 12]. Clearly, the scientific publication process and the JIF need to be studied as a unified system, particularly as a complex system [17–19]. The scientific community, for example, in its most typical form, can be modeled as a complex system in which researchers interact with each other in the roles of authors, journal editors, and reviewers. Computer simulations are able to reproduce certain simple behaviors, and therefore they can be used to model and reveal correlations that are very difficult or impossible to study in real life.

Random theory based on probability has long been the dominating methodology in the domain of scientometrics [20–22]. As a typical method of social system simulation, the Monte Carlo method is very suitable for modeling such issues. In contrast to other approaches, the key idea of the method here is primarily based on distributed social computing. In this paper, we employ parallel social systems [23–26] to simulate and analyze the correlation between the variation of the impact factor and the behavior of submission and citation, which can be used to interpret the JIF dynamics. In essence, the main objective of a parallel system is prediction for analysis and control. Simulation models are used to generate large amounts of empirical data by setting various conditions and tuning related parameters, while statistical analysis performs mathematical statistics and analysis on existing data and information, which can also be applied for prediction. The fundamental difference between statistical analysis and simulation analysis is thus data volume rather than prediction. Bibliometric indicators are quantitative measures of science based on publication and citation data. They are characterized by a quantitative approach and evaluation scales, which can be macro, meso, or micro, and reveal the scientific performance of a particular field over time. This paper aims to address these problems by using a dynamic system-modeling approach. The demands on a model are completely different when the model is designed for prediction rather than for theoretical analysis. For prediction systems, various possibilities should be taken into account as much as possible, and the hyperparameters of the model should be tuned on a dataset observed from the real world. By contrast, for theoretical analysis, the system should be as simple as possible, capturing merely the prominent features. Thus, our model only considers some basic social factors, with the idiosyncrasy of individuals embodied by random noise. To the best of our knowledge, most parallel social systems are composed of individuals and communities, which are agent-based and networked [27–32], with some of the analytical properties potentially obtainable from relevant theories in system science [26, 33]. It is worth noting that the primary goal of virtual simulation systems is not to mimic their real-world counterparts quantitatively; instead, they should be very helpful for verifying, interpreting, and elucidating, qualitatively, the underlying factors and the possibility of some of the phenomena and mechanisms. Such a framework has already been widely and effectively applied for analyzing various complex social systems, e.g., [34–38]. Moreover, the virtual citation networks generated by the model here are well compatible with the most typical scale-free networks [34, 39] (see Section 2.3 for a detailed explanation). We hope our research can provide meaningful theoretical hints, enrich the relevant literature for better comprehending the mechanism of JIF dynamics, and further aid in facilitating improvements in the management of academic journals.
The key contributions of this work can be summarized as follows:
(i) We employ parallel social system theory to interpret the mechanism of JIF dynamics.
(ii) We develop an empirically driven parallel experiment framework that analyzes interactions between the JIF and submission and citation rules (see Section 2).
(iii) The correlation between the JIF and some usually ignored elements and behaviors is revealed via simulation experiments.

The rest of this paper is organized as follows. Section 2 introduces the main framework of the model in detail. The relations between JIF and certain elements and behaviors are revealed and analyzed in Section 3. Finally, Section 4 presents the concluding remarks.

2. Model Construction

In this study, we consider the paper submission process, the citation process, and the JIF to construct a comprehensive system. We develop a virtual citation community in which authors submit and cite papers, and journals review and publish papers. Simultaneously, the model records the already published and cited papers and automatically computes the impact factors of the corresponding journals based on the citation distribution. The simulation model is discrete-time, with a time unit of one month, and each iteration represents a round of submission, publication, and citation. The model is implemented in MATLAB (MATLAB and Statistics Toolbox Release 2018b, The MathWorks, Inc., Natick, Massachusetts, United States). The program code can be found at https://github.com/pjzj/JIF-Modeling.

2.1. Model Initialization

The first stage is setting the parameters of the simulation model, where general knowledge of the system is considered, namely:
(1) The number of journals ($m$).
(2) The number of issues of a journal published per year ($q$), with each month corresponding to one issue.
(3) The number of papers published per issue ($p$). According to $q$ and $p$, the number of papers published per journal within one year can be computed as $q \times p$.
(4) The average review time of journals ($T$). This parameter denotes the length of the interval between the time a paper is submitted and the time it is accepted. For instance, if the submitted date of a paper (no. 0826) is December 2018 and its accepted date is October 2019, then $T_{0826} = 10$ (months), and $T$ represents the average of these values over the papers published in the journal (see Appendix Figure 1). It should be explained that the average review time ($T$) is a macrostatistical indicator, which reflects an overall level of processing speed and differs from journal to journal. Furthermore, for simplicity, we assume that publication occurs instantly once a paper is accepted.
(5) The average number of references per paper in a journal ($N$).
(6) Here, we also define and assign relevant parameters such as $a$, $b$, and $c$, because each of the subsystems can be parameterized independently (see Sections 2.2 and 2.3 for more details).
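A minimal configuration object mirroring this parameter list might look as follows. This Python sketch is an illustration, not the authors' MATLAB code; the baselines of $T$, $N$, $a$, $b$, and $c$ follow Table 1 in Section 3, while the remaining values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    # Section 2.1 parameters; T, N, a, b, c baselines follow Table 1,
    # the remaining values are illustrative assumptions.
    m: int = 12        # number of journals
    q: int = 12        # issues per journal per year (one per month)
    p: int = 10        # papers published per issue
    T: float = 10.0    # average review time (months)
    N: int = 30        # average number of references per paper
    a: float = 40.0    # horizontal translation of the age curve, equation (6)
    b: float = 10.0    # overall shape of the age curve, equation (6)
    c: float = 7.0     # overall shape of the citation-count curve, equation (7)

cfg = ModelConfig()
print(cfg.q * cfg.p)   # papers per journal per year: q x p = 120
```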

Note that, in the current model, similar disciplines are represented by identical settings, while different disciplines are differentiated by tuning the parameters above. In particular, the impact factors of journals and their sibling series (for example, Nature and Nature Cell Biology) are calculated independently of each other. Moreover, because the definition of the impact factor relates the number of papers a journal published in the previous two years to the total number of citations they received, we uniformly set the impact factor in the first two years to 1, which avoids certain degenerate situations in the citation process before a two-year publication history exists.

2.2. Modeling of Submission Process

The second stage is the modeling of the submission process, which mainly consists of the following three steps.

(1) Characterization of papers:

The variation in paper quality is an objective fact, somewhat similar to the situation of examination scores in education. However, paper quality is difficult to measure in practice. For such situations, one generally assesses through a scoring-questionnaire approach. An existing typical instance is Publons scores (https://publons.com/), which are generated by selecting a score from 1 to 10 in two fields, jointly indicating a paper's methodology, rigor, and novelty, as well as its relevance to its field. Following this routine, in our model, the intrinsic quality of a paper is scored by a number $Q \in [0, 10]$, with 0 being the worst and 10 being the best, and the overall quality of all papers follows a skewed distribution with a certain expectation and variance. According to the expectation and variance of the distribution, the model automatically creates the corresponding numbers of papers with different scores (shown in Appendix Figure 2). In reality, because the quality of the majority of papers is mediocre, papers with extremely high or low quality are relatively rare. Therefore, we assume that the quality of each paper is drawn from a positively skewed distribution over the interval $[0, 10]$, implemented via a gamma distribution. The basis for doing this is that, as the number of papers ($N_p$) keeps increasing, papers of ultrahigh quality remain rare, and their number holds relatively stable, with only a minor increase beyond a limit. To ensure that the number of high-quality papers remains relatively constant regardless of changes in $N_p$, a numerical integration of the gamma density over the high-quality tail is conducted in the model, computed as

$$N_h = N_p \int_{\theta}^{1} \frac{\beta^{\alpha} x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}\, dx, \quad (2)$$

where $\theta$ (set at 0.92) is the threshold on the rescaled quality axis, $N_h$ is the number of papers with high quality, and $\Gamma(\cdot)$ is the gamma function.
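The following Python sketch illustrates one way to realize the skewed quality law and the tail count of equation (2); the gamma shape and scale values are assumptions, since the paper does not report them.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta = 2.0, 0.5                      # assumed shape/rate of the quality law
dist = stats.gamma(a=alpha, scale=1.0 / beta)
Z = dist.cdf(10.0)                          # normalizer after truncating to [0, 10]

def sample_quality(n_papers):
    """Draw intrinsic qualities Q in [0, 10] from the truncated gamma law."""
    u = rng.uniform(0.0, Z, size=n_papers)
    return dist.ppf(u)                      # inverse-CDF sampling

def high_quality_count(n_papers, theta=0.92):
    """Equation (2): expected number of high-quality papers, i.e., the mass of
    the truncated gamma density above theta on the rescaled [0, 1] quality axis."""
    tail = (dist.cdf(10.0) - dist.cdf(10.0 * theta)) / Z
    return n_papers * tail

Q = sample_quality(1000)
print(Q.mean(), high_quality_count(1000))   # mediocre bulk, rare high-quality tail
```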

(2) Journal targeting process:

In general, authors give priority to journals with a higher reputation in the field when selecting journals for submission [5, 6, 40]. Of course, although high-quality papers covering platelet function in vitro have a great impact in that scientific subdomain, they will never be accepted or published in the highest-ranked general journals; this can be attributed to the fact that the interest of the readers of top journals such as Nature and Science is not comparable with a technical breakthrough at the microlevel for specific journals dealing with platelets. Therefore, for certain specific topics, papers of higher quality are usually submitted to the highest-ranked journals covering that specific topic. In this work, the correlation between reputation and JIF is assumed to be linear and positive [5, 6]. That is, the reputation of a journal is rescaled by the impact factor of the journal: the greater the impact factor, the higher the reputation. Furthermore, each paper has an initial estimated quality affected by the author's scientific level (rescaled by $L$). This score determines how an author chooses a target journal. In principle, professional scholars assess their work more accurately, and vice versa. In most cases, a paper of high quality also reflects the scientific level and innovation of a “competent” author behind it. Suppose that the author's estimated quality ($E$) of the paper mainly depends on the academic level ($L$) of the author. The relation between the intrinsic quality of a paper and the author's estimate of it can be depicted in the following form:

$$E = Q \cdot \varepsilon, \quad (3)$$

where $Q$ is the intrinsic quality of the paper, $E$ is the author's estimated quality of the paper, and $\varepsilon$ is the author's estimation noise, a random value around 1. A paper with relatively higher $L$ tends to have $\varepsilon$ closer to 1. Thus, the variance of $\varepsilon$ indicates the magnitude of the noise and should be negatively correlated with the paper quality. In the parameter settings of the model, $\varepsilon$ follows a truncated Gaussian distribution with expectation 1 (truncated so that only positive numbers can be sampled), whose positive shape parameters jointly control the aggregation or dispersion of the distribution curve.

Next, we set a condition corresponding to the psychological expectations of authors, comprehensively weighing adventurism against conservatism: a paper should be submitted to a journal for which the absolute difference between $E$ and the average quality of the papers already published in the journal last year ($\bar{Q}_j$) is less than a coefficient $\varphi$, namely, $|E - \bar{Q}_j| < \varphi$. Taken together, the authors give priority to the journals that meet this condition and have higher impact factors when submitting a paper (see Part I of Figure 3).
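A sketch of the targeting rule of equation (3) and the condition $|E - \bar{Q}_j| < \varphi$ follows; tying the noise spread to the academic level $L$ is an assumed form, and the journal values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def author_estimate(Q, level, sigma0=1.0):
    """Equation (3): E = Q * eps, eps ~ Gaussian around 1 truncated to positive
    values; linking the spread to the academic level L is an assumption."""
    sigma = sigma0 / max(level, 1e-6)       # higher level -> smaller noise
    eps = -1.0
    while eps <= 0.0:                       # truncation: resample until positive
        eps = rng.normal(1.0, sigma)
    return Q * eps

def choose_journal(E, journals, phi=1.5):
    """Among journals with |E - avg quality last year| < phi, pick the highest JIF."""
    admissible = [j for j in journals if abs(E - j["avg_quality"]) < phi]
    return max(admissible, key=lambda j: j["jif"]) if admissible else None

journals = [{"name": "J1", "avg_quality": 8.1, "jif": 6.2},
            {"name": "J2", "avg_quality": 7.8, "jif": 4.9},
            {"name": "J3", "avg_quality": 5.0, "jif": 1.7}]
print(choose_journal(author_estimate(Q=7.9, level=5.0), journals))
```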

(3) Peer review process and publication:

Here, all journals are characterized by two state variables: a reputation value (rescaled by impact factors) and related rejection or acceptance thresholds. Each journal first evaluates every submitted paper, which can be regarded as a simplified form of peer review [36]. In the current model, it is assumed that all reviewers in a discipline are peers, and therefore the selection of reviewers is a random process. The relation between the real quality of a paper ($Q$) and a journal's estimate of the paper ($\hat{Q}$) can be represented as follows:

$$\hat{Q} = Q \cdot \delta, \quad (4)$$

where $Q$ denotes the real quality of a paper, $\hat{Q}$ denotes the journal's assessment of the paper, and $\delta$ follows a lognormal distribution.

Subsequently, according to the evaluation scores, each journal ranks the submitted papers in descending order and decides to accept or reject them based on their rankings. The rejection or acceptance thresholds are determined by the number of papers published per issue. In the model, each journal in each issue accepts (publishes) only the top 10 ranked papers. In other words, a paper is accepted only if it ranks in the top 10 in a given submission round; rejected papers are resubmitted to other journals in the next round or are ultimately abandoned and remain unpublished (shown in Part II of Figure 3). Figure 3 illustrates the modeling of the submission process, which mainly consists of two parts.
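The review-and-acceptance step built on equation (4) can be sketched as follows; the lognormal parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def review_scores(qualities, sigma=0.2):
    """Equation (4): the journal's estimate is the real quality times a
    lognormal noise factor (the lognormal parameters are assumptions)."""
    delta = rng.lognormal(mean=0.0, sigma=sigma, size=len(qualities))
    return np.asarray(qualities) * delta

def review_round(submissions, per_issue=10):
    """Rank submissions by estimated score; accept the top 10, reject the rest."""
    order = np.argsort(review_scores([s["Q"] for s in submissions]))[::-1]
    accepted = [submissions[i] for i in order[:per_issue]]
    rejected = [submissions[i] for i in order[per_issue:]]  # resubmitted next round
    return accepted, rejected

submissions = [{"id": i, "Q": float(q)} for i, q in enumerate(rng.uniform(0, 10, 25))]
accepted, rejected = review_round(submissions)
print(len(accepted), len(rejected))   # 10 accepted, 15 carried to the next round
```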

2.3. Modeling of Citation Process

Here, a virtual simulation model is developed to construct citation networks, in which papers are submitted, published, and cited in sequence. The papers are generated one by one in our program. Each time a new paper is created, the set of papers is correspondingly extended by one item, which can be viewed as a new node in the virtual citation network. The number of citations a paper has already obtained is the degree of the corresponding node. The number of new edges that grow out of a newly added node is the number of references of this new paper. Whether or not a new node links to any given existing node is assigned randomly, through a probability $P$. In this way, a virtual citation network is gradually constructed. Therefore, the key to the calculation of the impact factor is how to assign the value of $P$ properly. In this work, enlightened by certain abstract simulation models in bibliometrics [12, 14, 36], we assume that the probability $P$ of a paper being cited in a specific discipline is comprehensively affected by three factors.

(1) The quality of a paper ($Q$):

As described in Section 2.2, according to the submission rules, papers with different $Q$ are submitted to and published in the corresponding journals. In this model, the correlation between $P$ and $Q$ is supposed to be linear and positive, i.e., the corresponding effect function can be defined as follows:

$$f_1(Q) = \frac{Q}{10}, \quad (5)$$

with $Q \in [0, 10]$ and, correspondingly, $f_1(Q) \in [0, 1]$.

(2) The paper age ($A$):

For decades, the time dependence of the preferential attachment mechanism (PAM) [39] has been a hot topic in citation networks. Younger papers draw increasing attention via citations, while older papers are often overlooked by scholars. This aging effect is a universal phenomenon in growing networks. Furthermore, the timeliness of research content affects different disciplines differently. For instance, scholars in some experimental disciplines prefer to cite younger scientific achievements (for example, biology and AI), whereas in certain theoretical disciplines (for example, mathematics), existing literature that has been fully validated is more likely to be cited [12, 15]. To this end, we employ a function to measure the relation between the paper age $A$ and the effect on the probability of a paper being cited by another paper in the same discipline, expressed in the following form:

$$f_2(A) = \frac{1}{2}\left[1 - \tanh\left(\frac{A - a}{b}\right)\right], \quad (6)$$

where $\tanh$ is the hyperbolic tangent function; $A$ indicates the paper age, with its unit being months; $a$ and $b$ are the parameters shaping the curve (see Appendix A); and $f_2(A)$ is a coefficient on the probability of the paper being cited (see Appendix Figure 4).

(3) The number of citations a paper receives ($C$):

Humans are social animals, and therefore our opinions and choices are strongly influenced by our peers. This is particularly true in citation networks [34, 39]. Suppose that the weight factor on the probability of a paper being cited is higher if the current number of citations (the degree of its node in the citation network) is greater. The effect of the number of times a paper has already been cited on the probability of the paper being cited again is also formulated by a function, which can be written as follows:

$$f_3(C) = \tanh\left(\frac{C}{c}\right), \quad (7)$$

with $\tanh$ being the hyperbolic tangent function, $C$ being the number of citations already obtained, $f_3(C)$ being the weight factor on the probability of the paper being cited, and $c$ being the parameter changing the overall shape of the curve (shown in Appendix Figure 5).

Finally, according to the descriptions of the citation behaviors, the probability of a paper being cited in the model can be comprehensively defined as a combination of the coefficients above, represented in the following form:

$$P = f_1(Q) \cdot f_2(A) \cdot f_3(C) + \eta. \quad (8)$$

In equation (8), $Q$ denotes the real quality of a paper, $A$ denotes the paper age, $C$ denotes the number of citations the paper has, and $\eta$ is an i.i.d. zero-mean Gaussian noise. Besides, it should be explained that the probability to cite is the kernel of the program, but note that what really affects the citation result is merely the relative value of $P$ compared with other papers, rather than its absolute value. This plays a very important role in the model because it is easy to compute the JIF once $P$ in equation (8) is given. Overall, the virtual scientific system is mainly composed of three sections: initial setup, submission, and citation (see Figure 6).
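Putting equations (5)–(8) together, a citation-propensity sketch might look as follows; the product combination follows the reconstruction above, and the baselines of $a$, $b$, and $c$ are taken from Table 1.

```python
import numpy as np

def f1(Q):                        # equation (5): linear, positive quality effect
    return Q / 10.0

def f2(A, a=40.0, b=10.0):        # equation (6): aging effect, A in months
    return 0.5 * (1.0 - np.tanh((A - a) / b))

def f3(C, c=7.0):                 # equation (7): effect of citations already received
    return np.tanh(C / c)         # (a small floor on C would avoid a cold start)

def cite_propensity(Q, A, C, rng, noise_sd=0.01):
    """Equation (8): combined propensity plus zero-mean Gaussian noise; only its
    value relative to the other candidate papers matters."""
    return max(f1(Q) * f2(A) * f3(C) + rng.normal(0.0, noise_sd), 0.0)

rng = np.random.default_rng(3)
print(cite_propensity(8.0, A=12.0, C=15.0, rng=rng))  # young, already-cited paper
print(cite_propensity(8.0, A=90.0, C=15.0, rng=rng))  # same paper at 7.5 years old
```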

To close this section, one should note that the key feature of analysis based on parallel systems is that a parallel model stands independently: each of the subsystems can be parameterized independently, and the model itself can be viewed as a feasible alternative to the real system. Such a parallel system is particularly suitable for situations with an ultracomplex mechanism or with unavailable data. Based on our knowledge from system analysis, either qualitative or quantitative, we expect to discover certain general laws behind various phenomena.

3. Computational Experiments and Data Analysis

3.1. Model Implementation

In this section, we perform simulations of the JIF dynamics model proposed in the last section. We first simulate and analyze the relation between the average quality of the journals ($\bar{Q}$) and their impact factors over 13 years. As stated in the Introduction, the main objective of the simulation system is not to mimic real-world scenarios comprehensively and quantitatively. Thus, the model neglects less important factors such as the very specific situations of journals disappearing or being renamed. The comparison results of average quality and impact factors of journals are shown in Figure 7.

It can be seen from Figure 7 that the trends of color change are similar. In other words, the average quality of journals is consistent with the variations of the JIF. The journals with higher average quality (for example, journals 9–12) always have greater impact factors. In contrast, the impact factors of the journals with lower average quality (for example, journals 1–4) are unsatisfactory. This simulation result is also confirmed by the Gaussian fitting (bottom right in Figure 7). The explanation of this correlation is that a sample of papers with a higher average quality will tend to have a higher average number of citations in the model. Thus, the strength of this correlation between average quality and average citations depends on the variation of the quality and citation distributions.

Next, we conduct experiments to verify whether the JIF shows the expected variation after it is artificially manipulated in the model. To see this, we artificially manipulate the impact factors of two different journals in a random year. The average performance over 100 Monte Carlo simulation runs of a normal JIF and a manipulated JIF is shown in Figure 8. Figures 8(a) and 8(e) are the impact factors without manipulation. Figures 8(b) and 8(f) are the impact factors with one manipulation in the 8th year. Figures 8(c), 8(d), 8(g), and 8(h) are the impact factors with two manipulations, in the 5th and 10th years.

We can see from Figure 8 that the trends of the impact factors without manipulation do not change much within 13 years (Figures 8(a) and 8(e)), while the trends of the impact factors with manipulation fluctuate strongly (Figures 8(b)–8(h)). In addition, we note that when the impact factors are artificially decreased or increased to a certain level, they maintain the corresponding trends in the next few years until they are manipulated again. This is mainly due to the fact that submissions in the model depend only on the JIF of the previous year: whatever happens in one year immediately determines the outcome of the next year, conditionally independently of the situation two or more years before. Indeed, it was already concluded in an influential study [9] that “…by manipulating JIF in different ways, their JIF will increase quickly, …, as an academic evaluation indicator, JIF is able to distinguish the differences of certain academic performances such as citation and publication process.”

3.2. Analysis of Features Correlating with Impact Factors

The task of this section is to analyze how the JIF changes under different parameters. The baseline values, variation intervals, and variation steps of the model parameters are reported in Table 1.


| No. | Notation | Description | Baseline and variation interval | Step of variation | Average gap | Average JIF gap |
|-----|----------|-------------|---------------------------------|-------------------|-------------|-----------------|
| 1 | $a$ | Horizontal translation of the curve in equation (6) | 40 [10, 100] | 10 | 0.4 | 8 |
| 2 | $b$ | Overall shape of the curve in equation (6) | 10 [5, 20] | 5 | 0.1 | 1 |
| 3 | $c$ | Overall shape of the curve in equation (7) | 7 [2, 10] | 1 | 0.2 | 1 |
| 4 | $T$ | Average review time (months) | 10 [2, 20] | 2 | 0.3 | 6 |
| 5 | $N$ | Average number of references | 30 [5, 60] | 5 | 0.3 | 6 |

Bold values indicate best performance.

From an analysis of Table 1, it can be found that, all else being equal, parameter $a$, the average review time ($T$), and the average number of references ($N$) increased the impact factor gap the most in all tests. These results are confirmed by a probabilistic sensitivity analysis, which evaluated the sensitivity of the average JIF to simultaneous changes in multiple parameter values away from their baseline values (shown in Figure 9).

Additionally, we conducted 50 realizations at each of the 5 tested parameter values across a given range, and the univariate sensitivity analysis showed that the JIF in the current model is also most sensitive to parameters $T$ and $N$. Comparing the curves of Figure 10 under three different settings of the average review time (S1 (Figure 10(a)): $T = 2$; S2 (Figure 10(b)): $T = 10$; and S3 (Figure 10(c)): $T = 20$, in each case with journals of $N = 5$, $30$, and $60$), it can be seen that the overall trends of the impact factors are similar. In greater detail, the impact factor of the journal with $N = 60$ references is much higher than the impact factors of the other two types of journals with $N = 30$ and $N = 5$. This result suggests that the greater the average number of references ($N$) of a journal, the higher the impact factor of the journal tends to be. Moreover, we also note that the overall levels of the impact factors in the three figures decrease correspondingly as the average review time ($T$) becomes longer.

Now, we analyze what happens as the different parameters take values from small to large; that is, we verify how parameters such as $a$, $b$, and $c$ influence the dynamics of the model. To see this, we performed 100 simulation runs, each lasting 13 years (156 months) of simulated time. For each run, the five parameter values ($a$, $b$, $c$, $T$, and $N$) were sampled from the maximum, baseline, and minimum given in Table 1. The simulation results are summarized as follows: Figure 11 shows the variations of the JIF influenced by the minimum, baseline, and maximum of parameters $a$, $b$, and $c$ when $T = 2$ and $N = 60$ in the 100 Monte Carlo simulations (Figures 11(a)–11(c), respectively); Figure 12 shows the corresponding variations when $T = 10$ and $N = 30$ (Figures 12(a)–12(c), respectively); and Figure 13 shows the corresponding variations when $T = 20$ and $N = 5$ (Figures 13(a)–13(c), respectively). Table 2 gives the relation between the average JIF and the different parameters in the 100 Monte Carlo simulations.


(Columns $a$–$N$ give the minimum, baseline, or maximum of each parameter.)

| No. | Average JIF | $a$ | $b$ | $c$ | $T$ | $N$ | Illustration |
|-----|-------------|-----|-----|-----|-----|-----|--------------|
| 1 | 14.468 (max) | 10 (min) | 5 (min) | 2 (min) | 2 (min) | 60 (max) | Figure 11(a) |
| 2 | 6.175 | 40 (baseline) | 10 (baseline) | 7 (baseline) | 2 (min) | 60 (max) | Figure 11(b) |
| 3 | 2.333 | 100 (max) | 20 (max) | 10 (max) | 2 (min) | 60 (max) | Figure 11(c) |
| 4 | 7.655 | 10 (min) | 5 (min) | 2 (min) | 10 (baseline) | 30 (baseline) | Figure 12(a) |
| 5 | 4.224 (baseline) | 40 (baseline) | 10 (baseline) | 7 (baseline) | 10 (baseline) | 30 (baseline) | Figure 12(b) |
| 6 | 2.649 | 100 (max) | 20 (max) | 10 (max) | 10 (baseline) | 30 (baseline) | Figure 12(c) |
| 7 | 1.948 | 10 (min) | 5 (min) | 2 (min) | 20 (max) | 5 (min) | Figure 13(a) |
| 8 | 0.881 | 40 (baseline) | 10 (baseline) | 7 (baseline) | 20 (max) | 5 (min) | Figure 13(b) |
| 9 | 0.382 (min) | 100 (max) | 20 (max) | 10 (max) | 20 (max) | 5 (min) | Figure 13(c) |

Bold values indicate best performance.

Through extensive simulations under three different conditions (Figure 11: $T = 2$ and $N = 60$; Figure 12: $T = 10$ and $N = 30$; and Figure 13: $T = 20$ and $N = 5$), an important observation from the results in Figures 11–13 is that, no matter what the conditions on parameters $T$ and $N$ are, the average JIF grows as parameters $a$, $b$, and $c$ decrease. It is worth remarking that the values of $a$ and $b$ jointly reflect the citation lifetime of papers: a steeper aging curve, namely, smaller $a$ and $b$, indicates that the journals in a particular discipline tend to cite younger papers, whereas greater values signify a longer citation life cycle and a wider aging distribution of references in that discipline. This can be understood from the hyperbolic tangent function in Appendix Figure 4: when parameters $a$ and $b$ are very small, the probability of papers of older age (large $A$) being cited by others is almost zero.

From Table 2 and Figures 9–13, we see that the value of the JIF is comprehensively influenced by different factors such as the average review time ($T$), the average number of references ($N$), and the citation aging distribution ($a$ and $b$). The maximum average JIF is obtained under the following conditions: $a = 10$; $b = 5$; $c = 2$; $T = 2$; and $N = 60$. The minimum average JIF is achieved by setting $a = 100$; $b = 20$; $c = 10$; $T = 20$; and $N = 5$. It is clear that the value of the JIF is positively correlated with the average number of references ($N$), whereas it is negatively related to the other four parameters (i.e., $a$, $b$, $c$, and $T$). The experimental results demonstrate that the younger the references (namely, the smaller the parameters $a$ and $b$), the higher the impact factor of the journal tends to be. Similarly, a journal achieves a higher impact factor if it has a greater average number of references ($N$). Furthermore, a journal with a relatively shorter review time ($T$) tends to hold a higher impact factor, and vice versa.
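The experimental harness behind Table 2 can be summarized in a few lines; in the sketch below, run_simulation is a hypothetical stand-in for one full 13-year run of the model, stubbed here for illustration only.

```python
import numpy as np

def run_simulation(a, b, c, T, N, seed):
    """Hypothetical stand-in for one 13-year (156-month) run of the full model;
    the real routine would return the final-year JIF of a tracked journal."""
    rng = np.random.default_rng(seed)
    return rng.normal(4.2, 0.5)            # stub output for illustration only

abc_levels = {"min": (10, 5, 2), "baseline": (40, 10, 7), "max": (100, 20, 10)}
TN_levels = {"Fig. 11": (2, 60), "Fig. 12": (10, 30), "Fig. 13": (20, 5)}  # (T, N)

for fig, (T, N) in TN_levels.items():
    for label, (a, b, c) in abc_levels.items():
        avg_jif = np.mean([run_simulation(a, b, c, T, N, s) for s in range(100)])
        print(f"{fig}, (a, b, c) at {label}: average JIF over 100 runs = {avg_jif:.3f}")
```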

In order to further verify the experimental results, we selected 120 journals in three different fields (biology, artificial intelligence, and mathematics) from the JCR database as a reference, which can be found in Appendix Figure 14 and Tables 3–5. These three disciplines were chosen because they represent the top, middle, and bottom of the overall impact factor level, respectively, and therefore they are well suited to showing general results. We recruited 20 graduate students with an informatics background to record the data. Concretely speaking, we first randomly selected 200 paper samples for each journal. Subsequently, according to the number of references, date of submission, and publication time displayed in each paper, the average number of references and the average review time of the corresponding journal can easily be computed. By comparing and analyzing the data in Tables 3–5, it can be found that there exist drastic differences in the levels of average JIF, average review time ($T$), and average number of references ($N$) over the three disciplines. To be specific, we see from Appendix Figure 14 that the average JIF of biology is 2–5 times higher than that of AI and mathematics, particularly in Q1 and Q2 (biology: 9.59; AI: 3.89; and mathematics: 2.17). Interestingly enough, the average number of references ($N$) in biology is also much higher than in the other two disciplines (biology: 46.5; AI: 43.25; and mathematics: 20.5). In addition, we note that the journals in biology have a shorter average review time ($T$) in comparison with AI and mathematics journals (biology: 6.3; AI: 11.6; and mathematics: 16.5). These observations are basically consistent with our experimental results.


Table 3: Sample journals in biology (top 10 in each quantile).

| No. | Journal title | Quantile | Average JIF (2009–2019) | Average review time (months) | Average number of references |
|-----|---------------|----------|--------------------------|------------------------------|------------------------------|
| 1 | Nature Reviews Genetics | Q1 | 41.47 | 4.3 | 55 |
| 2 | Nature Reviews Molecular Cell Biology | Q1 | 35.61 | 4.6 | 52 |
| 3 | Nature Reviews Microbiology | Q1 | 31.85 | 4.4 | 54 |
| 4 | Cell | Q1 | 31.40 | 3.9 | 62 |
| 5 | Nature Genetics | Q1 | 27.13 | 4.4 | 56 |
| 6 | Nature Methods | Q1 | 26.92 | 5.1 | 49 |
| 7 | Cell Stem Cell | Q1 | 23.29 | 4.6 | 52 |
| 8 | Cell Metabolism | Q1 | 20.57 | 5.8 | 55 |
| 9 | Nature Cell Biology | Q1 | 19.06 | 4.1 | 58 |
| 10 | Trends in Cell Biology | Q1 | 18.56 | 4.0 | 61 |
| 11 | Current Opinion in Chemical Biology | Q2 | 7.57 | 5.7 | 52 |
| 12 | Current Opinion in Plant Biology | Q2 | 7.35 | 4.9 | 52 |
| 13 | Current Opinion in Structural Biology | Q2 | 7.18 | 6.2 | 53 |
| 14 | Redox Biology | Q2 | 7.13 | 5.8 | 44 |
| 15 | Molecular Ecology Resources | Q2 | 7.06 | 5.6 | 50 |
| 16 | Annual Review of Animal Biosciences | Q2 | 6.78 | 7.1 | 46 |
| 17 | Cellular and Molecular Life Sciences | Q2 | 6.72 | 7.0 | 42 |
| 18 | Current Opinion in Microbiology | Q2 | 6.71 | 5.9 | 44 |
| 19 | Protein & Cell | Q2 | 6.23 | 7.2 | 46 |
| 20 | Critical Reviews in Plant Sciences | Q2 | 6.61 | 6.6 | 44 |
| 21 | Current Opinion in Insect Science | Q3 | 4.17 | 5.9 | 43 |
| 22 | Harmful Algae | Q3 | 4.14 | 6.2 | 51 |
| 23 | Fungal Biology Reviews | Q3 | 3.97 | 7.1 | 39 |
| 24 | Journal of Biological Rhythms | Q3 | 3.91 | 6.9 | 48 |
| 25 | Phytochemistry Reviews | Q3 | 3.88 | 7.4 | 40 |
| 26 | Freshwater Biology | Q3 | 3.77 | 7.4 | 45 |
| 27 | Cell Calcium | Q3 | 3.72 | 6.8 | 42 |
| 28 | Glycobiology | Q3 | 3.66 | 6.2 | 41 |
| 29 | Journal of Systematics and Evolution | Q3 | 3.66 | 7.5 | 40 |
| 30 | Frontiers in Zoology | Q3 | 3.63 | 7.4 | 41 |
| 31 | Connective Tissue Research | Q4 | 2.07 | 7.3 | 42 |
| 32 | Botanical Review | Q4 | 2.50 | 7.8 | 39 |
| 33 | EXCLI Journal | Q4 | 2.42 | 7.0 | 41 |
| 34 | Animal Health Research Reviews | Q4 | 2.41 | 7.3 | 45 |
| 35 | Probiotics and Antimicrobial Proteins | Q4 | 2.35 | 7.7 | 40 |
| 36 | Biological Research | Q4 | 2.36 | 6.8 | 41 |
| 37 | Journal of Microbiology | Q4 | 2.32 | 7.4 | 38 |
| 38 | Endangered Species Research | Q4 | 2.31 | 8.1 | 39 |
| 39 | International Review of Hydrobiology | Q4 | 2.28 | 7.8 | 41 |
| 40 | Bryologist | Q4 | 2.26 | 7.3 | 38 |


Table 4: Sample journals in artificial intelligence (top 10 in each quantile).

| No. | Journal title | Quantile | Average JIF (2009–2019) | Average review time (months) | Average number of references |
|-----|---------------|----------|--------------------------|------------------------------|------------------------------|
| 1 | International Journal of Computer Vision | Q1 | 11.54 | 9.8 | 45 |
| 2 | IEEE Transactions on Cybernetics | Q1 | 10.39 | 10.4 | 48 |
| 3 | IEEE Transactions on Pattern Analysis and Machine Intelligence | Q1 | 9.45 | 10.6 | 49 |
| 4 | IEEE Transactions on Fuzzy Systems | Q1 | 8.42 | 10.5 | 51 |
| 5 | IEEE Transactions on Evolutionary Computation | Q1 | 8.12 | 11.1 | 49 |
| 6 | IEEE Transactions on Neural Networks and Learning Systems | Q1 | 7.98 | 10.3 | 47 |
| 7 | Information Fusion | Q1 | 6.64 | 9.4 | 39 |
| 8 | IEEE Computational Intelligence Magazine | Q1 | 6.61 | 8.9 | 39 |
| 9 | Neural Networks | Q1 | 5.22 | 9.2 | 41 |
| 10 | International Journal of Neural Systems | Q1 | 4.58 | 9.4 | 41 |
| 11 | IEEE Transactions on Image Processing | Q2 | 5.07 | 9.2 | 44 |
| 12 | IEEE Transactions on Affective Computing | Q2 | 4.59 | 9.4 | 41 |
| 13 | Knowledge-Based Systems | Q2 | 4.40 | 9.2 | 42 |
| 14 | Neural Computing and Applications | Q2 | 4.22 | 11.4 | 39 |
| 15 | Pattern Recognition | Q2 | 3.96 | 13.1 | 46 |
| 16 | Swarm and Evolutionary Computation | Q2 | 3.82 | 9.6 | 37 |
| 17 | Neurocomputing | Q2 | 3.24 | 8.9 | 42 |
| 18 | Artificial Intelligence | Q2 | 3.03 | 13.2 | 55 |
| 19 | Artificial Intelligence Review | Q2 | 3.81 | 11.8 | 61 |
| 20 | Expert Systems with Applications | Q2 | 3.77 | 10.8 | 46 |
| 21 | Advanced Engineering Informatics | Q3 | 3.36 | 10.2 | 41 |
| 22 | Artificial Intelligence in Medicine | Q3 | 2.88 | 13.2 | 44 |
| 23 | International Journal of Machine Learning and Cybernetics | Q3 | 2.69 | 12.8 | 46 |
| 24 | Frontiers in Neurorobotics | Q3 | 2.61 | 9.8 | 39 |
| 25 | IEEE Transactions on Human-Machine Systems | Q3 | 2.56 | 11.8 | 48 |
| 26 | Computer Vision and Image Understanding | Q3 | 2.39 | 13.2 | 42 |
| 27 | Soft Computing | Q3 | 2.37 | 14.2 | 39 |
| 28 | SIAM Journal on Imaging Sciences | Q3 | 2.36 | 11.8 | 40 |
| 29 | International Journal of Bio-Inspired Computation | Q3 | 2.27 | 14.1 | 46 |
| 30 | Knowledge and Information Systems | Q3 | 2.25 | 12.0 | 41 |
| 31 | International Journal of Computational Intelligence Systems | Q4 | 2.0 | 12.2 | 43 |
| 32 | IET Biometrics | Q4 | 1.84 | 13.5 | 38 |
| 33 | Genetic Programming and Evolvable Machines | Q4 | 1.46 | 12.5 | 38 |
| 34 | Expert Systems | Q4 | 1.43 | 13.3 | 46 |
| 35 | IET Image Processing | Q4 | 1.40 | 14.9 | 39 |
| 36 | Machine Vision and Applications | Q4 | 1.31 | 12.8 | 42 |
| 37 | Pattern Analysis and Applications | Q4 | 1.28 | 14.2 | 38 |
| 38 | Journal of Intelligent Information Systems | Q4 | 1.11 | 13.5 | 36 |
| 39 | IET Computer Vision | Q4 | 1.09 | 14.4 | 40 |
| 40 | AI Magazine | Q4 | 1.03 | 13.3 | 42 |


Table 5: Sample journals in mathematics (top 10 in each quantile).

| No. | Journal title | Quantile | Average JIF (2009–2019) | Average review time (months) | Average number of references |
|-----|---------------|----------|--------------------------|------------------------------|------------------------------|
| 1 | SIAM Review | Q1 | 4.89 | 14.5 | 18 |
| 2 | Annals of Mathematics | Q1 | 4.77 | 15.8 | 22 |
| 3 | Advances in Nonlinear Analysis | Q1 | 4.67 | 16.5 | 20 |
| 4 | Journal of the American Mathematical Society | Q1 | 4.63 | 16.9 | 22 |
| 5 | Multivariate Behavioral Research | Q1 | 3.69 | 14.8 | 24 |
| 6 | Communications on Pure and Applied Mathematics | Q1 | 3.39 | 14.8 | 19 |
| 7 | Mathematical Models and Methods in Applied Sciences | Q1 | 3.32 | 18.2 | 26 |
| 8 | Annual Review of Statistics and Its Application | Q1 | 3.29 | 16.5 | 21 |
| 9 | Foundations of Computational Mathematics | Q1 | 3.06 | 14.8 | 18 |
| 10 | Risk Analysis | Q1 | 2.9 | 16.8 | 19 |
| 11 | American Statistician | Q2 | 2.35 | 14.4 | 23 |
| 12 | Bayesian Analysis | Q2 | 2.34 | 16.0 | 22 |
| 13 | Applied Mathematics and Computation | Q2 | 2.30 | 15.4 | 18 |
| 14 | Journal of the American Statistical Association | Q2 | 2.30 | 11.9 | 26 |
| 15 | Stata Journal | Q2 | 2.16 | 13.3 | 21 |
| 16 | Communications in Applied Mathematics and Computational Science | Q2 | 2.13 | 18.2 | 23 |
| 17 | Annals of Applied Probability | Q2 | 2.12 | 18.8 | 23 |
| 18 | Journal of Nonlinear Science | Q2 | 2.11 | 15.6 | 27 |
| 19 | Psychometrika | Q2 | 2.09 | 13.9 | 22 |
| 20 | International Statistical Review | Q2 | 2.05 | 16.2 | 22 |
| 21 | Memoirs of the American Mathematical Society | Q3 | 1.74 | 14.4 | 17 |
| 22 | Journal of Fourier Analysis and Applications | Q3 | 1.53 | 14.6 | 21 |
| 23 | Inverse Problems and Imaging | Q3 | 1.47 | 13.3 | 11 |
| 24 | Statistical Modelling | Q3 | 1.43 | 15.9 | 19 |
| 25 | Combinatorica | Q3 | 1.41 | 20.0 | 21 |
| 26 | Advances in Differential Equations | Q3 | 1.40 | 17.2 | 23 |
| 27 | Analysis and Mathematical Physics | Q3 | 1.38 | 14.9 | 20 |
| 28 | American Journal of Mathematics | Q3 | 1.38 | 15.5 | 26 |
| 29 | IMA Journal of Applied Mathematics | Q3 | 1.37 | 18.8 | 19 |
| 30 | Russian Mathematical Surveys | Q3 | 1.36 | 16.0 | 19 |
| 31 | Mathematics and Financial Economics | Q4 | 1.08 | 17.5 | 21 |
| 32 | Advances in Difference Equations | Q4 | 1.07 | 16.8 | 14 |
| 33 | Annals of Pure and Applied Logic | Q4 | 1.00 | 19.4 | 18 |
| 34 | Journal of Fixed Point Theory and Applications | Q4 | 0.97 | 18.2 | 18 |
| 35 | Journal of Numerical Mathematics | Q4 | 0.95 | 15.0 | 23 |
| 36 | Metrika | Q4 | 0.95 | 19.6 | 16 |
| 37 | Statistical Methods and Applications | Q4 | 0.93 | 17.7 | 21 |
| 38 | Izvestiya: Mathematics | Q4 | 0.92 | 21.0 | 17 |
| 39 | Positivity | Q4 | 0.92 | 18.5 | 19 |
| 40 | Journal of Theoretical Probability | Q4 | 0.88 | 19.4 | 22 |

4. Conclusion

As a quantitative evaluation indicator of journal quality, the JIF plays a very important role in the dissemination and development of academic research. However, the JIF is not only widely used but also misused, producing skewed and misleading results; for example, it is misused to assess individual papers, authors, publishers, and institutions. Clearly, it is of great theoretical and practical significance to analyze and optimize the mechanism and effectiveness of existing evaluation indicators. This paper developed a simulation model based on the mechanisms of submission, review, and citation of papers, which can be used to reproduce the differentiation process of the impact factors of different journals within a similar discipline. The study is dedicated to providing a novel experimental approach based on parallel social systems, which can construct virtual citation networks for a specific discipline. Related series of studies can provide enlightening and helpful hints for facilitating and managing academic journals in the future.

It is worth noting that intuition and speculation taken for granted are often unreliable in science until they are validated by scientific evidence and statistical inference. Objectively speaking, without the method and the assistance of the simulation experiments here, it would be very difficult to present experimental evidence for the statistical correlations between the JIF dynamics and certain behaviors and elements in publication. The simulation results demonstrate that the behaviors of submission and citation are influenced and driven by the JIF, and it is the interplay between submission and citation that further clarifies the mechanism of JIF dynamics. From an analysis of the experimental and statistical results, it can be found that the impact factor of a journal is affected by many variables and latent factors, including the discipline of the journal, the average number of references per paper, and the peer review time of the journal. These factors can be mainly summarized in three aspects:
(1) Discipline difference: experimental results in some subject fields require relatively more time to mature due to delays in verification and recognition. The difference is so significant that the bottom journals in one discipline may have impact factors higher than the top journals in another field; for example, the average JIF of biology in Q3 is 2-3 times higher than that of AI and mathematics in Q2.
(2) The number of references: in principle, the JIF is computed via the count of references; therefore, it can be increased by adding references in a reasonable and scientific manner. Moreover, the number and the aging distribution of references in a discipline not only reflect the timeliness and character of research achievements but also have a cumulative effect on the overall level of impact factors.
(3) Peer review time: the experimental results demonstrated that journals with short peer review times tend to obtain higher impact factors. Thus, the impact factor of a journal can be increased by scientific peer review and effective journal management.

In the future, a series of meaningful studies can be conducted. An interesting study of how the impact factor differentiates within a similar discipline could be carried out by extending the current model. Beyond simplistic indicators such as the JIF, the advantages of various methods and the factors that can be weighted should be taken into account. Refining the configuration of the review process should also be a meaningful future direction: for instance, what is the relationship between the reviewers and the subject of the paper; could decision trees be used to make selections or decisions; and what is the correlation between the impact factor of a journal and its publication cycle? Furthermore, certain phenomena observed in the experiments could possibly be explained in analytical or statistical manners, combining theories and approaches from scientometrics and machine learning. One may consider utilizing PCA (principal component analysis) and the Laplacian score for better analyzing the distribution and stability of the JIF across disciplines.

Appendix

A. Description of Functions $f_2$ and $f_3$

It can be observed in Figure 4(b) that $a$ is the parameter shaping the horizontal translation of the curve, and $b$ is the parameter changing its overall shape. Mathematically, function $f_2$ should satisfy certain qualitative principles, including monotonicity over the interval of interest.

Function $f_3$ follows several qualitative principles: (1) the function is increasing on the interval $[0, +\infty)$; (2) the slope of the function is always decreasing on that interval; and (3) the function approaches a constant upper bound.

The greater the parameter $c$ and the less steep the curve in Figure 5(b), the more likely a paper is to be cited by other authors. Moreover, it is clear that the citation network is among the most typical scale-free networks. In comparison with the basic structure of the BA scale-free network, the only significant difference is that the function here is nonlinear, while the attachment function described in [39] is simpler and linear. It can be seen from Figure 5(b) that the curves are approximately linear in certain intervals, in particular when the node degree is very low. Therefore, the citation network constructed here is more general.

B. Journal Samples and Data Sources

In China, the journal rankings of the Chinese Academy of Sciences (https://www.las.ac.cn/) are also based on JIF data from the JCR. The impact factors of all journals in a discipline are divided into quantiles (Q1–Q4), which are presented in a pyramid shape: the first quantile is composed of the top 5%, the second quantile of 6%–20%, the third quantile of 21%–50%, and the fourth quantile contains the remainder. The sample contains 3 classes of 40 instances each, where each class refers to a discipline of journals (biology, AI, and mathematics, respectively), and the 40 instances consist of the top 10 journals in each quantile of the discipline. The reason for doing this is that, usually, the top 10 journals in different quantiles best represent the citation characteristics and research foci of the corresponding discipline. The sources of the sample data used in this work are the most popular multidisciplinary databases: Web of Science (WoS), Scopus, and Google Scholar. In addition, MedSci (https://www.medsciediting.com/), LetPub (https://www.letpub.com.cn/), and the IEEE Xplore Digital Library (https://ieeexplore.ieee.org/) provided assistance in collecting the bibliometric indicators of some journals. For better comparison, we use the most recent data available, i.e., the data we obtained in the first week of August 2019. Figure 14 and Tables 3–5 show the comparison of the average JIF, average review time ($T$), and average number of references ($N$) in Q1–Q4 for the 120 journals belonging to biology, AI, and mathematics.
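The quantile rule above can be stated compactly; a small Python sketch:

```python
def cas_quantile(rank, total):
    """Pyramid quantiles of the Chinese Academy of Sciences ranking:
    top 5% = Q1, 6-20% = Q2, 21-50% = Q3, remainder = Q4 (rank 1 = highest JIF)."""
    pct = 100.0 * rank / total
    if pct <= 5:
        return "Q1"
    if pct <= 20:
        return "Q2"
    if pct <= 50:
        return "Q3"
    return "Q4"

print([cas_quantile(r, 200) for r in (5, 30, 90, 150)])  # ['Q1', 'Q2', 'Q3', 'Q4']
```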

Data Availability

The program code will be made available on https://github.com/pjzj/JIF-Modeling.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by Fundamental Research Funds for the Central Universities (Grants 2019RC29 and DUT19RC(3)012), the National Natural Science Foundation of China (NNSF) (Grants 61672130 and 61972064), the Gansu Provincial First-Class Discipline Program of Northwest Minzu University (Grant 11080305), and LiaoNing Revitalization Talents Program (Grant XLYC1806006).

References

1. L. Cai, J. Tian, J. Liu et al., “Scholarly impact assessment: a survey of citation weighting solutions,” Scientometrics, vol. 118, no. 2, pp. 453–478, 2019.
2. L. Bornmann and W. Marx, “The journal impact factor and alternative metrics,” EMBO Reports, vol. 17, no. 8, pp. 1094–1097, 2016.
3. L. Feng, J. Zhou, S.-L. Liu, N. Cai, and J. Yang, “Analysis of journal evaluation indicators: an experimental study based on unsupervised Laplacian score,” Scientometrics, vol. 124, no. 4, pp. 233–254, 2020.
4. E. Garfield, “The history and meaning of the journal impact factor,” JAMA, vol. 295, no. 1, pp. 90–93, 2006.
5. E. Garfield, “Fortnightly review: how can impact factors be improved?” BMJ, vol. 313, no. 7054, pp. 411–413, 1996.
6. V. Larivière and Y. Gingras, “The impact factor's Matthew effect: a natural experiment in bibliometrics,” Journal of the American Society for Information Science and Technology, vol. 61, no. 2, pp. 424–427, 2010.
7. E. Garfield, “Citation indexes for science: a new dimension in documentation through association of ideas,” International Journal of Epidemiology, vol. 35, no. 5, pp. 1123–1127, 2006.
8. M. Bordons, M. T. Fernández, and I. Gómez, “Advantages and limitations in the use of impact factor measures for the assessment of research performance,” Scientometrics, vol. 53, no. 2, pp. 195–206, 2002.
9. U. Finardi, “Correlation between journal impact factor and citation performance: an experimental study,” Journal of Informetrics, vol. 7, no. 2, pp. 357–370, 2013.
10. E. Garfield, “Citation indexes for science: a new dimension in documentation through association of ideas,” Science, vol. 122, no. 3159, pp. 108–111, 1955.
11. L. Bornmann, W. Marx, A. Y. Gasparyan, and G. D. Kitas, “Diversity, value and limitations of the journal impact factor and alternative metrics,” Rheumatology International, vol. 32, no. 7, pp. 1861–1867, 2012.
12. J. Zhou, N. Cai, Z.-Y. Tan, and M. J. Khan, “Analysis of effects to journal impact factors based on citation networks generated via social computing,” IEEE Access, vol. 7, pp. 19775–19781, 2019.
13. M. J. Lovaglia, “Predicting citations to journal articles: the ideal number of references,” The American Sociologist, vol. 22, no. 1, pp. 49–64, 1991.
14. Z.-Y. Tan, N. Cai, J. Zhou, and S.-G. Zhang, “On performance of peer review for academic journals: analysis based on distributed parallel system,” IEEE Access, vol. 7, pp. 19024–19032, 2019.
15. Z. Zhang and S. V. Poucke, “Citations for randomized controlled trials in sepsis literature: the halo effect caused by journal impact factor,” PLoS One, vol. 12, no. 1, Article ID e0169398, 2017.
16. G. Yu, X.-H. Wang, and D.-R. Yu, “The influence of publication delays on impact factors,” Scientometrics, vol. 64, no. 2, pp. 235–246, 2005.
17. P. P. Maglio and P. L. Mabry, “Agent-based models and systems science approaches to public health,” American Journal of Preventive Medicine, vol. 40, no. 3, pp. 392–394, 2011.
18. J. M. Epstein, Generative Social Science: Studies in Agent-Based Computational Modeling, Princeton University Press, Princeton, NJ, USA, 2006.
19. J. D. Farmer and D. Foley, “The economy needs agent-based modelling,” Nature, vol. 460, no. 7256, pp. 685-686, 2009.
20. W. Glänzel and U. Schoepflin, “A stochastic model for the ageing of scientific literature,” Scientometrics, vol. 30, no. 1, pp. 49–64, 1994.
21. W. Glänzel, Bibliometrics as a Research Field: A Course on Theory and Application of Bibliometric Indicators, Course Handouts, 2003.
22. C. Cioffi-Revilla, Introduction to Computational Social Science: Principles and Applications, Springer, Cham, Switzerland, 2014.
23. F.-Y. Wang, “Study on cyber-enabled social movement organizations based on social computing and parallel systems,” Journal of University of Shanghai for Science and Technology, vol. 33, no. 1, pp. 8–17, 2011.
24. F.-Y. Wang, “Parallel control and management for intelligent transportation systems: concepts, architectures, and applications,” IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 3, pp. 630–638, 2010.
25. X. Wang, X. Zheng, X. Zhang, K. Zeng, and F.-Y. Wang, “Analysis of cyber interactive behaviors using artificial community and computational experiments,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 47, no. 6, pp. 995–1006, 2017.
26. F.-Y. Wang, “Toward a paradigm shift in social computing: the ACP approach,” IEEE Intelligent Systems, vol. 22, no. 5, pp. 65–67, 2007.
27. L.-X. Wang, “Dynamical models of stock prices based on technical trading rules—part I: the models,” IEEE Transactions on Fuzzy Systems, vol. 23, no. 4, pp. 787–801, 2015.
28. J. Xi, M. He, H. Liu, and J. Zheng, “Admissible output consensualization control for singular multi-agent systems with time delays,” Journal of the Franklin Institute, vol. 353, no. 16, pp. 4074–4090, 2016.
29. Z. Ji, H. Lin, S. Cao, Q. Qi, and H. Ma, “The complexity in complete graphic characterizations of multiagent controllability,” IEEE Transactions on Cybernetics, pp. 1–13, 2020.
30. S. Liu, Z. Ji, and H. Ma, “Jordan form-based algebraic conditions for controllability of multiagent systems under directed graphs,” Complexity, vol. 2020, Article ID 7685460, 18 pages, 2020.
31. L. Mo and S. Guo, “Consensus of linear multi-agent systems with persistent disturbances via distributed output feedback,” Journal of Systems Science and Complexity, vol. 32, no. 3, pp. 835–845, 2019.
32. H.-Y. Ma, X. Jia, N. Cai, and J.-X. Xi, “Adaptive guaranteed-performance consensus control for multiagent systems with an adjustable convergence speed,” Discrete Dynamics in Nature and Society, vol. 2019, Article ID 5190301, 9 pages, 2019.
33. F.-Y. Wang, “Back to the future: surrogates, mirror worlds, and parallel universes,” IEEE Intelligent Systems, vol. 26, no. 1, pp. 2–4, 2011.
34. M. Rosvall and C. T. Bergstrom, “Maps of random walks on complex networks reveal community structure,” Proceedings of the National Academy of Sciences, vol. 105, no. 4, pp. 1118–1123, 2008.
35. N. Cai, C. Diao, and M. J. Khan, “A novel clustering method based on quasi-consensus motions of dynamical multiagent systems,” Complexity, vol. 2017, Article ID 4978613, 8 pages, 2017.
36. M. Kovanis, R. Porcher, P. Ravaud, and L. Trinquart, “Complex systems approach to scientific publication and peer-review system: development of an agent-based model calibrated with empirical journal data,” Scientometrics, vol. 106, no. 2, pp. 695–715, 2016.
37. H.-H. Hu, J. Lin, and W. Cui, “Cultural differences and collective action: a social network perspective,” Complexity, vol. 20, no. 4, pp. 68–77, 2015.
38. G. Ionescu and B. Chopard, “An agent-based model for the bibliometric h-index,” The European Physical Journal B, vol. 86, no. 10, p. 426, 2013.
39. A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
40. L. Bornmann and R. Williams, “Can the journal impact factor be used as a criterion for the selection of junior researchers? A large-scale empirical study based on ResearcherID data,” Journal of Informetrics, vol. 11, no. 3, pp. 788–799, 2017.

Copyright © 2020 Jian Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

