Abstract

Nation-state cyberattacks, and particularly Advanced Persistent Threats (APTs), have surged in recent years. Their use may be aligned with nation-state geopolitical and economic (GPE) interests, which are key to the underlying international relations (IRs). However, the interdependency between APTs and GPE (and thus IRs) has not yet been characterized, and doing so could be a stepping stone towards enhanced cyberthreat intelligence (CTI). To address this limitation, a set of analytic models is proposed in this work. They are built considering 234M geopolitical events and 306 malicious software tools linked to 13 groups of 7 countries between 2000 and 2019. The models show substantial support for launched and received cyberattacks considering GPE factors in most countries. Moreover, strategic issues are the key motivator when launching APTs. Therefore, from the CTI perspective, our results show that there is a likely cause-effect relationship between IRs (particularly GPE relevant indicators) and APTs.

1. Introduction

Cyberthreats have been on the rise in recent years, with cyberthreat intelligence (CTI) being a key discipline for mitigating damage in cyberspace. According to EUROPOL’s latest Internet Organized Crime Threat Assessment, cybercriminals have evolved their modus operandi to improve their success rate [1]. Accordingly, the World Economic Forum has identified cyberattacks as the greatest nonenvironmental threat to humanity [2].

Beyond traditional malware (e.g., ransomware, trojans, etc.), a particular set of advanced threats is also increasing: Advanced Persistent Threats (APTs). APTs are typically carried out by powerful actors with substantial resources to build long-lasting malware [3]. Although attribution is typically cumbersome, it is generally accepted that most APTs are state-sponsored. For example, the CozyDuke APT is allegedly linked to the Russian-based APT29 group [4]. As opposed to regular malware, APTs are usually focused on stealing information or compromising devices. They have already been applied against other countries or opponents, such as the case of Chinese APTs against Tibetan organizations [5].

The relationship between targeted cyberattacks and international relations (IRs) has already been pointed out. From a CTI perspective, it is quite useful for a better understanding of a particular incident. In particular, the influence of geopolitical and economic issues (hereinafter, GPE) has been identified in concrete events [6, 7]. These cyberattacks may be human- or computer-focused. As an example of the first case, the recent COVID-19 pandemic has led to a substantial number of disinformation campaigns [8]. However, computer-focused attacks have been at stake for a longer period and thus they are at the core of this paper. For example, a large-scale distributed denial of service attack was launched by Russia against Estonia because of the latter moving a Soviet-era statue [9]. Overall, cyberattacks tied to cyberwars, or geopolitical conflicts, increased from 19% in 2018 to 27% in 2019 [10]. This has also led to some political agreements on the use of cyberspace. For example, China and Russia signed in 2015 an agreement on “cooperation in ensuring international information security” [11]. Despite the agreement, Russian-related APTs have been launched against China after that date.

The implications of the use of cyberspace to impact other countries have already been highlighted, even by the main actors. In this regard, China and Russia asked for an “international code of conduct for information security” back in 2011 [12]. Along the same lines, China stated in 2017 that “no country should pursue cyberhegemony, interfere in other countries’ internal affairs, or engage in, condone, or support cyberactivities that undermine other countries’ national security.” Despite these political statements, both China and Russia have been linked to a vast number of APTs against other countries. This trend has been followed by several other nations around the world. According to FireEye, countries such as Iran, Vietnam, or North Korea are among the most prominent ones [13]. Indeed, public attribution of cyberattacks has also been studied considering its political implications [14]. This points to a potential mutual influence between IRs (particularly GPE issues) and nation-state cyberattacks (APTs), which has long been discussed. From a broader perspective, geopolitics has already been pointed out as an influence on cyberattacks [15, 16]. With a closer focus, socioeconomic, psychosocial, and geopolitical factors of cybercrime are analysed in [17] for the particular case of Nigeria. However, to the best of the authors’ knowledge, this influence has not been empirically measured. Indeed, this problem cannot be addressed from the computer science or the IRs perspectives alone; an interdisciplinary approach is needed.

To overcome this limitation, in this paper, we aim to build a set of analytical models to determine the strength of the relationship between APTs and GPE matters, thus shedding light on a CTI process. For the sake of relevance, the models will be applied considering 13 of the most active APT groups according to the Thales-Verint index [18] and FireEye [13]. This results in 7 attacker countries and 6 victim ones.

This paper tackles two research questions, leading to the following contributions.

RQ1. Are there (possibly causal) relationships between GPE issues and APTs worldwide? Do such relationships hold for a given region or country?
(i) We provide a mathematical characterization of the relevance of this relationship.
(ii) We analyse this matter for attacks carried out and received by the United States, Russia, China, Iran, India, Vietnam, and North Korea, as they are linked to the most relevant APT groups worldwide.

RQ2. Which are the underlying motivations for each attacking country?
(i) We analyse the individual relevance of three GPE factors, namely, economic, strategic, and warfare motivators, on launching APT-based cyberattacks. This allows characterizing the alignment of APTs with the national strategy of the attacking country, which has been pointed out as an open research issue [19].

This paper is structured as follows. Section 2 analyses related works. Afterwards, Section 3 introduces the background and describes the applied methodology. Section 4 presents results. Lastly, Section 5 concludes the paper and points out future research directions.

2. Related Work

In the last 10 years, in the CTI context, many efforts have been made to analyse APTs. From a technical perspective, MITRE corporation has developed MITRE ATT&CK, a repository of attacks and techniques [20]. In this project, groups of attacks are linked to APTs and their purported origins, leading to MITRE Groups catalogue. At academic level, [21, 22] studied multiple APTs in terms of their deployment and evolution, from the initial system compromise to its control. By contrast, [23] analysed some common attack methods and tools used by APTs, while [24] studied behaviours of multiple APTs and their protection measures. Reference [25] presented a deeper analysis, identifying APTs in which actors, type, and content can be deduced. Moreover, [26] developed a survey on APTs, presenting a systematic review of their methods and techniques, as well as methods for their detection.

From a sociopolitical perspective, several years ago, in 1998, [27] searched for a cause-and-effect model of attacks on information systems, called cyberattacks nowadays. Later, [28] presented a theoretical study of a subset of cyberattacks, from 1995 to 2009, with political, sociocultural, and economic motivation. Although they are not related to APTs, it is pointed out that cyberattacks are strongly correlated to political and cultural conflicts. Similarly, but without a clear link to cyberattacks, [29] presented a theoretical discussion towards political, technological, and scientific factors in terms of cybersecurity politics. Moreover, [30] considered cyberattacks as social events associated with social, political, economic, and cultural (SPEC) factors to understand the motivations behind them. In particular, the correlation of variables and network analysis is used to assess the relevance of factors such as corruption and the income difference. Just in the social dimension, [31] analysed cyberattacks to build a threat model based on past and current social events through a Formal Concept Analysis (FCA) approach and a Fact Proposition Space (FPS) inference technique. Knowledge is acquired from news articles and the evaluation is carried out over 14 news articles linked to some cyberattacks from 1995 to 2010.

On the other hand, without mentioning APTs, but using the term state-sponsored cyberattacks, [32] analysed incidents of such attacks regarding intra- and interindustry trade. The evaluation of the proposal involves variables such as cyberespionage campaigns, information about trade data, GDP per capita, or conflict data. In a more recent approach, [33] presented a GPE analysis to determine which countries’ strategic motivations are in line with the observed attacker activity from an APT attribution perspective. Who benefits from the attacks is discussed, pointing out political and economic interests, but in a general way and without a focus on APTs. Last but not least, [34] used event data and a proprietary cyberincident dataset to investigate what happens between countries when cyberconflict is used in foreign policy interactions. It is found that only distributed denial of service attacks affect relationships between states, as well as the change of political behaviour and policies.

Table 1 presents an analysis of existing CTI approaches related to the presented proposal. It points out if they deal with APTs; if they handle, discuss, or analyse GPE factors; if they address any of our proposed research questions; and, finally, the applied methodology and dataset. Some existing studies focus on APTs and others on social or sociopolitical matters related to cyberattacks, but no proposal has modelled and analysed relationships between APTs and GPE concerns. Moreover, in terms of methodology, [32] is the only proposal that applies regression models as in our proposal (introduced later in Section 3.2). However, their models are different as they are used for different purposes. Finally, considering datasets, most of them focus on cyberattacks in general, not on APTs. Only [32, 34] used a dataset involving some APTs, but their number is quite limited. As a matter of fact, most of their cyberattacks are already included in our study (see Section 3.2.1 for details on our dataset). Moreover, they do not include information on victims or attacked sectors, which is essential to address our research questions.

3. Materials and Methods

3.1. Background

In this section, three basic notions for this proposal are introduced. In particular, the notion of APT is introduced in Section 3.1.1. Afterwards, the Goldstein scale is presented in Section 3.1.2 to rate sociopolitical events. Lastly, linear models required to build the analytical model are described in Section 3.1.3.

3.1.1. APT Concepts

An APT is a sophisticated long-term attack launched against a specific targeted entity [35]. Although attribution is not straightforward, researchers agree that these types of attacks are usually coordinated by highly specialized and skilled teams, usually funded by (or linked to) governments or nation states (hereafter referred to as APT groups) [36]. Each APT group materialises its cyberattacks in the form of campaigns, and each campaign has a set of technical indicators associated with it, such as start and end dates, Software Tools (STs), and victims. In this paper, the number of cyberattacks (sent or received) has been measured by the number of STs in use per year. For example, the Chinese APT group called APT10 developed the “menuPass” campaign with 3 used STs in 2016, namely, ChChes, PlugX, and Poison Ivy [37]. We adopt this indicator as it is clearly stated in all considered reports. Although the number of victims could also be taken into account, some victims may remain unknown, which would have a negative impact on the robustness of the data at stake.
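As an illustration of this indicator, the following minimal sketch counts ST uses per attacker country and year. The record layout and field names are assumptions made for this example (they are not the schema of the released dataset); only the menuPass entry is taken from the text above.

```python
from collections import Counter

# Hypothetical record layout; only the menuPass values come from the text.
campaigns = [
    {"group": "APT10", "country": "CHN", "campaign": "menuPass",
     "year": 2016, "tools": ["ChChes", "PlugX", "Poison Ivy"]},
]

def st_uses_per_year(records):
    """Count distinct software tools (STs) used per (country, year)."""
    counts = Counter()
    for rec in records:
        # Each distinct tool in a campaign counts as one use in that year.
        counts[(rec["country"], rec["year"])] += len(set(rec["tools"]))
    return counts

uses = st_uses_per_year(campaigns)
print(uses[("CHN", 2016)])
```

Counting per campaign means that a tool reused in several campaigns of the same year contributes several uses, which is one possible reading of "STs in use per year".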

3.1.2. Rating Geopolitical Events: The Goldstein Scale

Conflict and Mediation Event Observations (CAMEO) is a taxonomy for coding event data [38]. It was developed to correct some of the problems in the WEIS (World Event Interaction Survey) and the COPDAB (Conflict and Peace Data Bank) coding systems [38]. For each event, an indicator of its intensity is given following the Goldstein scale. It assigns a numerical score between −10 (the most conflictual event) and +10 (the most cooperative one), capturing the theoretical potential impact that type of event will have on the stability of a country.

3.1.3. Linear Models

To analyse the relationship between GPE issues and APTs, multiple linear regression models [39] are used. In a nutshell, in these models, the predicted scalar magnitude Y is assumed to depend on several explanatory variables xi (see Equation (1)). This dependence is assumed to be linear, and the weight of each explanatory variable is estimated from the data. This procedure allows us to understand how the variation in the predicted variable is related to the variation in the explanatory variables. As this does not usually lead to a perfect fit, an error term is typically needed. As usual, the explanatory power is characterized by the adjusted R2 coefficient, which estimates the proportion of variation explained, penalized by the number of explanatory variables; it is at most 1 and can be negative for a model that fits poorly.
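For reference, the textbook form of a multiple linear regression with p explanatory variables and n observations, together with the usual definition of the adjusted R2, can be written as follows. The symbols \beta_i (weights), \varepsilon (error term), and \hat{y}_j (fitted values) are standard notation assumed here, not taken from the original equations.

```latex
Y = \beta_0 + \sum_{i=1}^{p} \beta_i x_i + \varepsilon

R^2 = 1 - \frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}{\sum_{j=1}^{n} (y_j - \bar{y})^2},
\qquad
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```

The factor (n - 1)/(n - p - 1) is what penalizes additional explanatory variables: adding a variable raises the adjusted value only if it reduces the residual sum of squares enough to offset the lost degree of freedom.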

3.2. Methodology

The proposed research questions are answered based on a methodology composed of the steps highlighted in grey in Figure 1. Data is collected first (Section 3.2.1), identifying cyberattacks (Section 3.2.1(1)) and GPE factors (Section 3.2.1(2)), and models are generated afterwards (Section 3.2.2). Moreover, for consistency purposes and to ensure the validity of the models, the alignment between attacked sectors and cyberattack motivations is also analysed (Section 4.2).

3.2.1. Source Data Collection

Data is collected for all studied countries, distinguishing, when required, between attackers and victims. The following sections describe the nature of the data used to build the models. To foster further research in this area, our dataset has been publicly released on GitHub (https://github.com/crramosi/APTs-Dataset).

(1) Cyberattacks. This research is based on 13 of the most relevant APT groups attributed to 7 different countries according to the Cyberthreat Handbook by Thales-Verint [18] and FireEye [13] (see Table 2). Our selection ensures that significant APT groups are considered and that regional diversity is preserved. In total, 439 different reports, publications, and blog entries have been studied, which describe 306 STs. All sources are public and freely accessible, including cybersecurity firms and vendors such as Kaspersky [53], the United States (US) Cybersecurity and Infrastructure Security Agency (CISA) [54], collaborative platforms such as Malpedia [55], and cybersecurity blogs such as Security Affairs [56].

The process of collecting cyberattacks was carried out in line with [57] to generate a reliable and quality dataset; cyberattacks were collected from the relevant set of sources cited beforehand. At the beginning, any cyberattack that could be considered an APT attack was collected, whether it met the exact definition or not. Once all cyberattacks were collected, it was decided whether they met the APT definition by a text search for keywords such as group name and aliases. Moreover, a test-retest method has been applied in this process; all data were initially encoded according to a coding manual (available in GitHub repository), and this process was repeated some months later to ensure the reliability and quality of the data at stake.

Most of the studied STs are from North Korea (136), followed by Russia (48), China (37), Iran (36), Vietnam (31), India (11), and USA (7). It must be noted that one APT group called Dark Basin has not created any ST according to existing reports due to its novelty. However, the considered reports describe recent cyberattacks against different victims and sectors. Thus, even if there is no mention of the associated STs, this group is kept for the sake of completeness.

Gathering APT groups based on the presumed origin country, we studied cyberattacks either as attacker or as victim in China (CHN), India (IND), Iran (IRN), North Korea (PRK), Russia (RUS), United States (USA), and Vietnam (VNM). Considering the selected reports, technical data on their campaigns have been obtained for each group, including (when possible) start and end dates, used STs and victim sectors, and countries (available in GitHub repository). For illustrative purposes, Table 3 presents a summary of the number of uses of STs that each country has made (as attacker) or suffered (as victim). It must be noted that each ST may be used several times and that a given country may use STs from another one. Thus, the amount of STs created (Table 2) and that of ST uses (Table 3) do not necessarily match.

The collected data shows that RUS and PRK are the most active countries and that USA is by far the most targeted country, with more than 180 cases. No data is known for PRK as victim, as it has not been publicly disclosed.

(2) GPE Factors. We differentiate three main factors within GPE issues, namely, strategic/diplomatic, economic, and warfare. Considering the influence of geopolitical and economic issues in cyberattacks (recall Section 1), although the potential motivation for a cyberattack may be diverse, it has been pointed out that GPE factors are the usual ones [58]. Indeed, as pointed out in Section 2, several works deal with them. Concerning the first type, conflicts and agreements between countries are retrieved using the GDELT database. GDELT is a free, global, open-source project that monitors radio, press, and web news from around the world in real time and converts them into a common format for open research, thus breaking down language and access barriers and becoming a valuable data source [59]. In particular, the GDELT Event Database collects daily the physical activities (or events) described in the news. In addition, it uses the CAMEO event taxonomy in its latest version (recall Section 3.1.2), capturing two actors and the action (event) performed by Actor1 upon Actor2. It offers a wide range of features including the Goldstein scale and number of mentions, that is, the total number of citations of each event across all source documents. Relying upon GDELT is beneficial as it gathers the information surrounding political conflicts in a continuous manner, so we do not only consider discrete situations which could be scarce. In total, 234,080,914 events were studied related to the period between 2000 and 2019 (see Table 4).

To measure the relevance of each event, the Goldstein score (recall Section 3.1.2) is used as an approximation of the impact of that event. With this scale, it is possible to define whether relations between countries are bad (negative values) or good (positive values). To get a precise measurement, it must be noted that each event in GDELT can have one or more appearances (subEvents). Each subEvent has also a number of mentions, which reflect their relevance in terms of media coverage. Thus, two strategic or diplomatic variables have been created, PositiveValue and NegativeValue, calculated per year as the sum of all events as follows:

For each studied year, these formulas classify conflicts (NegativeValue) as events with Goldstein scores in [−10, 0) and agreements (PositiveValue) as events with Goldstein scores in [0, +10]. In addition, they multiply events by their average number of mentions (MeanMentions) as a way of assessing the importance of the event. Thus, the combination of the number of appearances, their media relevance, and the event nature measures the significance of each event for the relationship between a pair of countries.
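One possible reading of this aggregation is sketched below. It is an assumption, not the original formulas: each event is taken to contribute its Goldstein score multiplied by the mean number of mentions of its subEvents, and the field names (`year`, `goldstein`, `sub_event_mentions`) are hypothetical rather than GDELT column names. The events are toy values made up for illustration.

```python
from collections import defaultdict

def yearly_values(events):
    """Aggregate events into per-year PositiveValue and NegativeValue.

    Assumption: an event contributes goldstein * mean_mentions, where
    mean_mentions averages the mentions over the event's subEvents.
    """
    totals = defaultdict(lambda: {"PositiveValue": 0.0, "NegativeValue": 0.0})
    for ev in events:
        mentions = ev["sub_event_mentions"]
        mean_mentions = sum(mentions) / len(mentions)
        # Scores in [0, +10] are agreements; scores in [-10, 0) are conflicts.
        key = "PositiveValue" if ev["goldstein"] >= 0 else "NegativeValue"
        totals[ev["year"]][key] += ev["goldstein"] * mean_mentions
    return dict(totals)

# Toy events, made up for illustration (not GDELT data).
events = [
    {"year": 2015, "goldstein": 4.0, "sub_event_mentions": [10, 20]},
    {"year": 2015, "goldstein": -5.0, "sub_event_mentions": [8]},
    {"year": 2016, "goldstein": 0.0, "sub_event_mentions": [3, 5]},
]

values = yearly_values(events)
print(values[2015])
```

Under this reading, NegativeValue is a nonpositive number whose magnitude grows with both the severity and the media coverage of conflicts.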

With respect to economic motivations, we consider data provided by the World Development Indicators database [60], the United Nations Statistics Division [61], and the International Monetary Fund [62]. In particular, four indicators are considered, namely, the Human Development Index (HDI), the Gross Domestic Product Per Capita (GDP_PC), the amount of exports and imports (ExportsImports), and the foreign direct investment (ForeignDirectInvestmentNetInflows). They collectively provide a simplified vision of the status of a country from a macroeconomic perspective. Foreign direct investment refers to the sum of equity capital, reinvestment of earnings, other long-term capital, and short-term capital, and it measures the interest of third parties in a given country. It must be noted that not all indicators are provided on a yearly basis. Thus, GDP_PC, ExportsImports, and ForeignDirectInvestmentNetInflows range from 2000 to 2010 in five-year jumps and from 2010 to 2019 in annual jumps. To manage this issue, the five-year gaps are filled with progressive values (e.g., if GDP_PC is 1,000 in the year 2000 and 2,000 in 2005, 2001 is assumed to be 1,200, 2002 would be 1,400, and so on) and the annual gaps are filled with the average of the adjacent values.
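The gap-filling rule can be sketched as a linear interpolation between known years. The function name `fill_linear` is hypothetical; the sketch assumes that "progressive values" means equal yearly steps, in which case a single missing year between two known ones receives exactly the average of its neighbours.

```python
def fill_linear(known):
    """Fill missing years between known (year -> value) points linearly."""
    years = sorted(known)
    filled = dict(known)
    for start, end in zip(years, years[1:]):
        # Equal yearly step between two consecutive known points.
        step = (known[end] - known[start]) / (end - start)
        for year in range(start + 1, end):
            filled[year] = known[start] + step * (year - start)
    return filled

# Worked example from the text: 1,000 in 2000 and 2,000 in 2005.
gdp_pc = fill_linear({2000: 1000.0, 2005: 2000.0})
print([gdp_pc[y] for y in range(2000, 2006)])

# A one-year gap is filled with the average of the adjacent values.
single_gap = fill_linear({2010: 10.0, 2012: 20.0})
print(single_gap[2011])
```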

Last but not least, indicators of warfare motivations are those related to military expenses (MilitaryExpenditure), retrieved from the World Development Indicators database [60] and in line with related works (recall Section 2). It includes current and capital expenditures of the armed forces, defense ministries and other government agencies, paramilitary forces, and military space activities. In this case, data is again not provided on a yearly basis and the same approach as for economic features’ annual gaps has been applied.

3.2.2. Linear Models

The final step is the identification of relationships between GPE issues and APTs, which is achieved by computing linear models based on data from each attacker/victim country. Models are developed based on Equation (4), where G, P, and E are GPE factors, and the predicted variable is the number of STs. In this way, CTI can benefit from this analysis by understanding the relationship between cyberattacks and GPE factors, thus answering RQ1.

Besides, the motivations of cyberattacks and affected sectors are identified to answer RQ2. This is also useful to assess the consistency of the previous model, as GPE factors and sectors at stake should be aligned. For example, if economic issues are the most prominent GPE factor, it should be more reasonable to attack the financial sector rather than nursery schools. Similarly, defense-related institutions can be regarded as a means to conduct cyberwars. A taxonomy of sectors and their related motivations has been applied (available in GitHub repository). Considering these factors, the analysis of motivations is carried out except for North Korea, as it does not disclose any economic or warfare indicator.

4. Results and Discussion

Leveraging the collected data, models to study the relationship between GPE issues and used STs are introduced in this section. Depending on the target relationship, the whole set of countries or a subset of them comes into play. For each model, the variables that best explain cyberattacks are selected, that is, those maximizing the adjusted R2.
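A minimal sketch of such a selection step is given below. It assumes an exhaustive search over variable subsets with ordinary least squares solved through the normal equations; the search strategy actually used for the models is not stated in the text, and the variable names and data are toy values made up for illustration.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def adjusted_r2(columns, y):
    """Fit OLS with an intercept and return the adjusted R2."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    ss_res = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, t in zip(rows, y))
    mean_y = sum(y) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)

def best_subset(features, y):
    """Return (score, names) of the subset with the highest adjusted R2."""
    best = (float("-inf"), ())
    for size in range(1, len(features) + 1):
        if len(y) - size - 1 <= 0:  # not enough observations for this size
            break
        for names in combinations(features, size):
            score = adjusted_r2([features[n] for n in names], y)
            if score > best[0]:
                best = (score, names)
    return best

# Toy data (made up): y depends on factor_a plus small noise.
factor_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
factor_b = [5.0, 1.0, 4.0, 2.0, 8.0, 3.0, 7.0, 6.0]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, -0.15, 0.05]
y = [2 + 3 * a + e for a, e in zip(factor_a, noise)]

score, chosen = best_subset({"factor_a": factor_a, "factor_b": factor_b}, y)
print(chosen, round(score, 3))
```

Exhaustive search is only practical for a handful of candidate variables; with more candidates, a stepwise procedure would be the usual substitute.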

The relationship between GPE issues and cyberattacks, related to RQ1, is addressed in Section 4.1. Afterwards, the underlying motivations related to RQ2 are introduced in Section 4.2. Lastly, a summary of the results and the limitations of the work are discussed in Section 4.3.

4.1. Relationship between GPE Issues and Cyberattacks

Tables 5 and 6 present a summary of the developed models for each country as attacker or victim, respectively.

In general terms, the models show substantial support for launched cyberattacks considering GPE factors in most countries. Cyberattacks from RUS, IRN, and USA have the highest support. The case of RUS is noteworthy, since the number of used STs is quite extensive, with more than 200 cases.

The situation is even better in terms of received cyberattacks. Our results show that the considered factors provide strong support. Interestingly, USA has received more than 180 cyberattacks and the model supports them with a factor of 0.82. On the other hand, the lowest support is for the attacks received by IRN. However, it is an exception, since the remaining countries exceed 0.7.

4.2. Analysis on Motivations

The following sections study motivations of cyberattacks per country, including a consistency analysis, as well as devising motivations per attacker on each victim.

4.2.1. Motivations per Country

In order to understand the relevance of each motivation per country, a linear model is built by only considering the variables related to each GPE factor (recall Section 3.2.1(2)). Table 7 summarizes results considering all countries. In general terms, most countries show strong prevalence of strategic and economic issues when launching cyberattacks. Indeed, China and Russia achieve similar support rates in both matters. The case of Russia is in line with prior expectations [58]. Similarly, Iranian STs have also been aligned with strategic issues as their main focus is on domestic regime stability [63]. On the contrary, Chinese STs have been regarded as more economic-driven in support of the country’s five-year plan [64].

Last but not least, warfare issues are not relevant for most countries, except for Russia and USA as attackers and Vietnam as victim. The most notable result is Russia as attacker, which is probably because one of its most noteworthy APT groups is linked to a military intelligence service [65]. Similarly, the warfare interest of USA might be explained by considering that its APT group (called Equation) is allegedly linked to the US National Security Agency.

4.2.2. Consistency Analysis on Motivations

To further confirm the strength of these motivations, victim sectors are also considered. It is expected that the choice of target sectors is also aligned with the pinpointed GPE sectors.

Based on studied reports, Table 8 presents the percentage of sectors in which each country has been attacker or victim. Most target sectors are strategic or diplomatic, followed by economic ones. Regarding the warfare sectors, results show their lower relevance. However, all countries have attacked or have been victims in cyberwar-related sectors at some point.

The consistency analysis is carried out based on the alignment between the number of targeted sectors and the models previously developed (recall Table 7). If the corresponding percentage of attacks for a particular GPE factor is the highest one and the model also reveals the highest R2 for that GPE factor, the two are aligned. The study reveals that economic and strategic variables are closely related and that alignment is achieved in many cases. For instance, IRN has a 0.58 in the model as an attacker (Table 7) for strategic/diplomatic variables, and the results by sector (Table 8) show that IRN attacks more sectors within that category (57.01%). This is in line with prior works [66, 67], which point out IRN’s prevalent strategic interest, and with VNM’s focus on strategy but with substantial economic interests [68]. From the attacker perspective, CHN, USA, and RUS are the exceptions, because our model suggests an economic motivation in the first place, while sectors point to a stronger strategic one. Concerning CHN, it is interested in raising its technological level through industrial espionage and thus improving its economic position [69]. Moreover, the economy is a priority in USA, though strategic issues are also an important matter [70]. Lastly, the case of RUS is surprising for the low prevalence of economic sectors. However, Russian cyberattacks are launched against other states with preexistent rivalry [71] and thus involve strategic/diplomatic issues, as pointed out by the model.

Concerning the victims’ perspective, results are consistent except for IRN, RUS, and VNM; the model points out that the main motivation is the economy, but the targeted sectors are mainly strategic in nature. Nonetheless, in line with the model, the relevance of economic sectors is notable in these cases, which may indicate that their attackers aim to steal information from economy-unrelated sectors that can later be transformed into economic assets.
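The alignment rule can be expressed as a comparison of the top-ranked factor in both sources. In the sketch below, the function name and dictionary layout are hypothetical; for IRN only the strategic values (0.58 and 57.01%) come from the text, while every other number is a placeholder made up for illustration.

```python
def is_aligned(sector_share, model_r2):
    """True when the factor with the largest sector share also has the highest R2."""
    top_sector = max(sector_share, key=sector_share.get)
    top_model = max(model_r2, key=model_r2.get)
    return top_sector == top_model

# IRN as attacker: strategic values from the text, the rest are placeholders.
irn_sectors = {"strategic": 57.01, "economic": 30.00, "warfare": 12.99}
irn_model = {"strategic": 0.58, "economic": 0.40, "warfare": 0.10}
print(is_aligned(irn_sectors, irn_model))

# Placeholder case in which the model ranks the economic factor first.
other_model = {"strategic": 0.45, "economic": 0.60, "warfare": 0.05}
print(is_aligned(irn_sectors, other_model))
```

A stricter variant could compare the full ranking of the three factors instead of only the top one, which would also expose close calls between strategic and economic motivations.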

4.2.3. Motivations per Attacker on Each Victim

To complete the analysis of the motivations for each country, it is also necessary to study their attacks against other target countries. For this purpose, models are developed for pairs of attackers and victims. Results are presented in Table 9, where suffixes C1 and C2 represent attacker and victim-related variables, respectively. For the sake of soundness, only those attacker-victim pairs with more than 15 used STs have been considered.

On the one hand, the situation between IND and CHN has recently been highlighted, although their tensions are long-standing [72]. Our results show that there is some support between GPE issues and cyberattacks in their case. On the other hand, it is noteworthy that all studied countries attack USA, and most of them show remarkable support considering the GPE factors. USA itself already pointed out that CHN, RUS, and IRN were the three main actors leveraging STs for cyberespionage with economic interests [73]. Our results show that, although strategic factors seem to prevail, economic issues are at stake in most countries. This is consistent with the previous models (recall Tables 7 and 8).

4.3. Summary and Limitations

In light of the results achieved from the models, and in line with the research questions, it can be concluded that there is an undeniable relationship between GPE factors and cyberattacks (RQ1). Moreover, it has been shown that strategic issues are the most relevant GPE factor for launching cyberattacks, closely followed by economic ones (RQ2). Our results are mostly in line with prior works that addressed the motivation of the studied countries.

Beyond qualitative statements on the motivation of APTs, which are quite common (e.g., Threat Group Cards produced by Thailand’s Computer Emergency Response Team [74]), our work is the first to provide quantitative measurements in this regard. This is beneficial for CTI for two reasons. The first reason is that it expands the horizon when it comes to solving the attribution of a cyberattack; GPE factors may serve as a hint to differentiate between candidate attackers. The second reason is that monitoring GPE factors may help to better predict future APT-related cyberattacks.

Despite the relevance of these results, it must be noted that our findings may be limited for several reasons. On the one hand, only a subset of the most representative APT groups have been analysed. Therefore, cyberattacks launched by other groups could alter the results.

A second limitation is related to the number of countries at stake. Our sample is representative as it covers the most active countries in terms of APT-based cyberattacks. However, the inclusion of additional countries is left for future work. Thirdly, the considered period of activity for each group and the current status of the media coverage as gathered by GDELT may impact the model. Indeed, a sensitivity analysis would be beneficial to assess the long-term stability of our findings.

A fourth limitation is related to our consistency analysis. It relies upon a set of sector-motivation associations that have been proposed in this paper. Therefore, different associations (e.g., including secondary motivations) could impact the degree of consistency.

Last but not least, our models do not capture possible indirect cyberattacks, in which a country targets another one by attacking some of the target’s allies, or in which the attack is carried out by a country acting as a proxy of the actual attacker. Nevertheless, including these events could decrease the strength of our models, since the attribution and intent of cyberattacks are not straightforward. Additional assumptions would have to be made to determine whether a cyberattack was directed against the actual victim or against a third party. In this work, we have opted for sticking to the evidence provided by the studied reports. The only assumption made concerns the connection between sectors and GPE factors, which we believe is reasonable and carries an acceptable margin of error.

5. Conclusions

In recent years, the influence of international relations on nation-state cyberattacks has been pointed out, but it has not previously been characterized. Similarly, the underlying intentions behind these cyberattacks have been suggested, but no actual evidence of the strength of these attributions has been given. To overcome these limitations, this paper has proposed a method to jointly analyse a particular type of cyberattack (APTs) and a set of geopolitical and economic (GPE) factors that help to understand international relations. We have used linear regression models to identify the relationship between GPE factors and the incidence of APTs, allowing us to identify the key factors related to the existence of such attacks depending on the attacker and the victim. These results, along with the theoretical starting point discussed in the introduction (the hypothesis that the studied factors are an important driver of APTs), allow us to conjecture that there is indeed a relationship between cyberattacks and international relations. This direction of influence is also the more plausible one: it is difficult to see how APTs could drive military expenses or HDI, to name a few indicators. Moreover, it is hard to point at any confounding factor that could be responsible for a noncausal correlation between such a variety of indicators and the APTs. Finally, our detailed analyses of each pair of countries involved also suggest that these cyberattacks can be explained in light of economic, strategic, and cyberwar factors. All these considerations reinforce our conclusion that there is a likely cause-effect relationship between international relations (particularly GPE relevant indicators) and APTs.
To the best of the authors’ knowledge, this is the first work addressing both issues together and, thus, it offers a useful tool to help cyberthreat intelligence (CTI) teams understand the studied relationships. Indeed, CTI teams may leverage these results for enhanced attribution and even prediction of cyberattacks.
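
As an illustration of the kind of linear regression mentioned above, the following minimal sketch fits a single-predictor ordinary least squares model and reports its coefficient of determination. It is not the models used in this work: the function name `fit_simple_ols`, the variable names, and the toy values are all hypothetical and were chosen only to show the mechanics of regressing an APT incidence measure on one GPE indicator.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y ~ intercept + slope * x by ordinary least squares.

    Returns a tuple (intercept, slope, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of the predictor.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    # Sum of cross-deviations between predictor and response.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return intercept, slope, r_squared


# Toy, made-up values (NOT data from this study): one hypothetical GPE
# indicator per period and a hypothetical count of APT-related events.
gpe_indicator = [1.0, 2.0, 3.0, 4.0, 5.0]
apt_count = [1.0, 3.0, 2.0, 5.0, 4.0]

intercept, slope, r_squared = fit_simple_ols(gpe_indicator, apt_count)
print(f"intercept={intercept:.2f} slope={slope:.2f} R^2={r_squared:.2f}")
```

In such a sketch, the sign and size of the slope indicate how the incidence measure moves with the indicator, while R^2 summarizes how much of its variation the indicator accounts for. A model with several GPE factors would require multiple regression, which this single-predictor example does not cover.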

Several directions for future work can be devised. For example, the relationship found here may be a steppingstone to build predictive models that leverage the status of international relations, so that potential cyberattacks may be identified beforehand, which would be especially useful for cyberthreat intelligence processes. Moreover, our models can be enriched with other remarkable groups. This will also be helpful to determine the long-term stability of the relationship between GPE indicators and APTs. Finally, our models may be enriched by considering indirect effects between countries, thus characterizing the influence of the so-called cyberproxies.

Data Availability

Data will be released on GitHub if the paper is accepted.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Madrid Government (Comunidad de Madrid-Spain) under the multiannual agreement with UC3M (“fostering young doctor research”, CAVTIONS-CM-UC3M, DEPROFAKE-CM-UC3M) and in the context of the V PRICIT research and technological innovation regional program; by CAM by grant CYNAMON P2018/TCS-4566-CM, co-funded with ERDF; by Min. of Science and Innovation of Spain by grant ODIO PID2019-111429RB-C21 (AEI/10.13039/501100011033); and by the Spanish Ministerio de Ciencia, Innovación and Universidades-FEDER funds of the European Union support, under project BASIC (PGC2018-098186-B-I00).