Abstract

Wi-Fi networks cover almost all areas of daily activity and, especially in densely populated regions, Wi-Fi signals overlap heavily. This broad and overlapping coverage brings much convenience at the cost of great security risks. Conventionally, a worm virus can infect a router and then attack other routers within its signal coverage. Nowadays, artificial intelligence enables us to solve problems efficiently from available data with computer algorithms. In this paper, we endow the virus with additional abilities and present a dedicated worm virus that uses a kernel density estimation (KDE) algorithm to pick susceptible routers and carries out the attacking tasks automatically. This virus also attacks lower-encryption-level routers first and thereby acquires a fast-growing number of infected routers in the initial stage. We simulate the epidemic behavior on the collected spatial coordinates of routers in a typical area in Beijing City, where % of routers are infected in hours. This dramatic result stems from the correct selection of the infection seed and the priority given to low-encryption-level routers. This work provides a framework for exploring computer-algorithm-enhanced viruses and gives some insights on offence and defence to both hackers and computer users.

1. Introduction

Wi-Fi technology, dating from the , has developed explosively. Nowadays, almost all electronic devices are equipped with Wi-Fi modules, and the number of routers has increased rapidly alongside them. The Wi-Fi signals of widely deployed routers, each with a broad radiating area, overlap in free space; at the same time, people often set weak passwords for convenience. Wi-Fi networks therefore provide cybercriminals with a target-rich platform for epidemic spread.

Traditional attacks plant viruses or worms with malicious or fraudulent intent on personal computers [1–3]. In fact, Wi-Fi routers are ideal targets because they are always on and connected to the Internet, usually with a low security level and sometimes with no firewall software at all. Routers emit Wi-Fi signals in all directions within a range of tens of meters. In a densely populated environment, routers are therefore close enough to one another for an attack to be carried out [4]. In the infection dynamics, an end user is first infected and serves as a seed, or zombie host (a “broiler”), from which the worm virus analyzes the Wi-Fi router and probes potential devices within its coverage. In this manner, worm viruses can spread through the router network [5].

Flaws in wireless protocols or misconfiguration of access devices [6] can be exploited to take control of Wi-Fi routers. A well-known example is a public trap Wi-Fi router set up to offer free Internet access and attract connections. Once a user falls into the trap, several types of attack, including man-in-the-middle and denial-of-service attacks, can be conducted through the infected router [7]. In late , the four-way handshake, previously believed to be free from attacks, was shown to be vulnerable to a key reinstallation attack [8].

The explosive growth in the number of Wi-Fi routers and mobile terminals has been followed by growing interest in epidemic spread models on networks. A mobile phone virus spreading via multimedia messaging services has been studied, and it is predicted that viruses will break out when the mobile phone market reaches a certain threshold [9]. A Susceptible-Infected-Recovered (SIR) model has been constructed to simulate the spread of hypothetical Wi-Fi malware over real-world router locations, demonstrating that Wi-Fi networks are vulnerable platforms [10, 11]. In the network [12], the scale of access point connectivity in the victim population is a more important factor than others [4]. An enhanced model takes the vulnerabilities in Wi-Fi routers and protocols into consideration but makes no major progress on the dynamics [6]. Inclusion of end terminals leads to a different epidemic spread model [13].

Recently, a variety of interesting results on this epidemic spread model [14–21] have been demonstrated; however, much interesting work remains to be explored, including the barely studied case of viruses endowed with additional abilities. Thanks to great advances in algorithms and computing capability, artificial intelligence is developing rapidly. Viruses are likely to gain the ability to identify the properties of their environment from acquired data and make optimal decisions. Specifically, a virus chooses an appropriate router as the seed to begin its infection process according to the local router information. The victim candidate it prefers is a router located in a crowded region (or a hub) with an outward-spreading potential. The kernel density estimation (KDE) algorithm estimates the probability density distribution directly from a set of spatial data without a prior distribution assumption [22]. This algorithm offers a simple, visual approach to selecting the infection seed.

2. Methods

2.1. Epidemic Spread Model Illustration

The epidemic model is established based on the following simplification. As shown in Figure 1(a), a router with no encryption can be infected directly in , whereas routers with encryption are divided into two types, WEP and WPA/WPA2. A WEP-encrypted router can be broken when it is attacked for , after which the password-cracking dynamics follow. The attacker first attempts to crack the router with the simple password library in . There are two cases after that: either the router is infected with the probability (), or, with probability , the attempt fails and the attacker has to switch to the complex password library in . Then again there are two cases: either the router is infected in () or it is not infected in (immune to attackers). WPA/WPA2 encryption had long been thought of as immune to attackers until the work in 2017 appeared [8]. By analogy, in our model the WPA/WPA2 encryption breaks down in and the password is cracked in with the probability of a successful infection ().
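The branching process described above can be sketched as a small sampling routine. This is only an illustration under stated assumptions, not the authors' implementation: every time scale and probability in `PARAMS` is a hypothetical placeholder standing in for the Table 1 values, and the function name `attack` is introduced here for illustration.

```python
import random

# Hypothetical placeholder values (arbitrary time units); NOT the Table 1 values.
PARAMS = {
    "t_open": 5.0,         # time to infect a router with no encryption
    "t_wep_break": 20.0,   # time to break WEP encryption
    "t_simple": 10.0,      # time to try the simple password library
    "t_complex": 120.0,    # time to try the complex password library
    "p_simple": 0.25,      # success probability with the simple library
    "p_complex": 0.10,     # success probability with the complex library
    "t_wpa_break": 60.0,   # time to break WPA/WPA2 encryption
    "t_wpa_crack": 120.0,  # time to crack the password after the break
    "p_wpa": 0.10,         # success probability for WPA/WPA2
}


def attack(encryption, rng, params=PARAMS):
    """Simulate one attack on one router; return (infected, elapsed_time)."""
    if encryption == "none":
        # Unencrypted routers are infected directly.
        return True, params["t_open"]
    if encryption == "wep":
        # Break the encryption, then try the simple password library.
        elapsed = params["t_wep_break"] + params["t_simple"]
        if rng.random() < params["p_simple"]:
            return True, elapsed
        # Fall back to the complex library; failure here means immune.
        elapsed += params["t_complex"]
        return rng.random() < params["p_complex"], elapsed
    if encryption == "wpa":
        elapsed = params["t_wpa_break"] + params["t_wpa_crack"]
        return rng.random() < params["p_wpa"], elapsed
    raise ValueError(f"unknown encryption type: {encryption!r}")
```

A call such as `attack("wep", random.Random(0))` returns whether the router ended up infected and how long the attempt took; a router for which the call returns `False` is treated as immune in this sketch.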

The typical time scales shown in Table 1 follow the previous literature from the year and also take into account the leap in computing capacity in recent years [10]. The probability distribution of encryption types (, , and shown in Table 1) is normalized from the data displayed in Figure 1(b) without the category “others.”

2.2. Data Acquisition and Processing

We collect pieces of raw data in a region in Beijing City (roughly latitude: N; longitude: E) from the website wigle.net. We clean these data in two steps. First, records with clearly unrealistic location information are deleted (e.g., Router ). Second, we delete duplicate records, which we identify by their Media Access Control (MAC) address rather than by location, because distinct routers may share the same location information. After cleaning the acquired data, we label each router with an encryption type according to the ratio the website provides. Subsequently, for each router we collect the routers within its radiating radius to construct that router’s infection candidate set. Finally, the candidates are sorted by encryption type, from no encryption to low-encryption type to high-encryption type, to determine the infection order.
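The cleaning and candidate-set steps above can be sketched as follows. This is a minimal illustration under assumptions of our own: the record fields (`mac`, `lat`, `lon`, `enc`), the bounding-box test for unrealistic locations, and the function names are all hypothetical, and the great-circle (haversine) distance is one reasonable choice for measuring router separation.

```python
import math

# Lower value is attacked first: no encryption, then WEP, then WPA/WPA2.
ENCRYPTION_PRIORITY = {"none": 0, "wep": 1, "wpa": 2}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def clean(records, bbox):
    """Drop records outside bbox and later records that repeat a MAC address.

    bbox is (lat_min, lat_max, lon_min, lon_max) for the study region.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    seen, kept = set(), []
    for rec in records:
        if not (lat_min <= rec["lat"] <= lat_max and lon_min <= rec["lon"] <= lon_max):
            continue  # unrealistic location
        if rec["mac"] in seen:
            continue  # duplicate MAC address
        seen.add(rec["mac"])
        kept.append(rec)
    return kept


def candidate_sets(routers, radius_m):
    """Map each router's MAC to its in-range neighbors, weakest encryption first."""
    out = {}
    for a in routers:
        near = [
            b for b in routers
            if b is not a
            and haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= radius_m
        ]
        near.sort(key=lambda b: ENCRYPTION_PRIORITY[b["enc"]])
        out[a["mac"]] = [b["mac"] for b in near]
    return out
```

The quadratic neighbor search is adequate for a few thousand routers; a spatial index would be the natural replacement for larger data sets.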

3. Results

In this paper, we assume a dedicated worm virus that can pick a more susceptible router region from the router network with a KDE algorithm and carry out the whole attacking procedure automatically, from searching for victims to installing the malware. The infection model follows the previous literature, in which a malicious worm spreads directly from one wireless router to another via free-space wireless propagation [10]. This virus can also query the encryption information, such as No Encryption, Wired Equivalent Privacy (WEP), or Wi-Fi Protected Access (WPA/WPA2), and, other conditions being equal, attacks routers with a relatively low encryption level first. To improve task efficiency, the virus stops an attack once the attacking duration reaches a predetermined threshold.

3.1. Real-World Network Characterization

We sample the real-world geographic location data for wireless routers from the wireless network mapping site (wigle.net). The detailed data acquisition and processing are described in the Methods section. For notational convenience, each router is labelled with an identifier number.

In the Wi-Fi network, the radiating coverage of a router, ranging from tens of meters to more than a hundred meters [23], depends strongly on both internal factors (such as the radiating power and the antenna orientation) and external factors (such as local barriers and signal interference). For simplicity, we keep the radiating radius constant and consider four different values of the maximum radiating radius, which are , and , to analyze the degree distribution [24] in the real-world router network. Figure 2(a) describes the probability distribution that there are other routers located within the range . The Wi-Fi network, whose degree distribution follows an exponential decay, is a scale-free network when is set to [25]. Note that, with an increased , the distribution becomes flatter, which indicates that a larger radiating radius overcomes geographic obstacles.
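The degree distribution for a fixed radiating radius can be computed with a few lines of code. The sketch below assumes, for simplicity, that router positions have already been projected to planar coordinates in meters; the function names and this projection are our assumptions, not part of the original analysis.

```python
import math
from collections import Counter


def degrees(points, radius):
    """Number of other routers within `radius` of each router."""
    n = len(points)
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                deg[i] += 1
                deg[j] += 1
    return deg


def degree_distribution(points, radius):
    """Fraction of routers having each degree k, keyed by k in ascending order."""
    deg = degrees(points, radius)
    counts = Counter(deg)
    return {k: c / len(deg) for k, c in sorted(counts.items())}
```

Calling `degree_distribution` once per radius value gives the curves that a plot such as Figure 2(a) would compare.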

Local interconnectedness can be characterized by the clustering coefficient where represents the fraction of the neighbors of Router that are also interconnected. It is mathematically expressed as , with indicating the number of links bridging neighbors of Router and representing the number of all possible connections between these neighbors. In Figure 2(b), we compare the clustering coefficients of a randomly generated graph and the real-world region in Beijing City; the shaded areas indicate the corresponding error bars. The results show that the real-world network has a stronger clustering property than a random one. It makes sense that a network with a larger clustering coefficient is more vulnerable to an epidemic virus.
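As an illustration of the definition above, the following sketch computes the local clustering coefficient on a graph in which two routers are linked when they lie within a fixed radius. In the standard notation, which we assume here, the coefficient of router i with k_i neighbors is the number of links among those neighbors divided by k_i(k_i - 1)/2; routers with fewer than two neighbors are assigned 0 by convention. Planar coordinates and all function names are assumptions made for the example.

```python
import math
from itertools import combinations


def build_graph(points, radius):
    """Adjacency sets linking routers that lie within `radius` of each other."""
    adj = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= radius:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def local_clustering(adj, i):
    """Fraction of pairs of neighbors of router i that are themselves linked."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention for routers with fewer than two neighbors
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """Mean local clustering coefficient over all routers."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)
```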

3.2. Algorithm-Enhanced Hunt for Infectious Source

With knowledge of the network properties, we can inject the worm virus into the network deliberately. Our dedicated virus first acquires the spatial distribution of routers from some interface and then analyzes it to find an appropriate injection candidate [26]. The preferred target is a router located in a crowded region (or a hub) with an outward-spreading potential. A simple, visual approach to finding these routers is the KDE algorithm, which, from a macroscopic perspective, estimates the probability density distribution directly from a set of given location data without a prior distribution assumption [22].
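A minimal sketch of KDE-based seed selection follows. It assumes an isotropic Gaussian kernel with a fixed, hand-chosen bandwidth and planar coordinates; the kernel, the bandwidth, and the function names are our assumptions, since the text does not specify them.

```python
import math


def kde_density(points, query, bandwidth):
    """2D Gaussian kernel density estimate at `query` from the given points."""
    h2 = bandwidth * bandwidth
    norm = 1.0 / (2.0 * math.pi * h2 * len(points))
    total = 0.0
    for x, y in points:
        d2 = (query[0] - x) ** 2 + (query[1] - y) ** 2
        total += math.exp(-d2 / (2.0 * h2))
    return norm * total


def densest_router(points, bandwidth):
    """Index and density of the router sitting at the highest estimated density."""
    scores = [kde_density(points, p, bandwidth) for p in points]
    best = max(range(len(points)), key=scores.__getitem__)
    return best, scores[best]
```

Evaluating the density only at router positions keeps the sketch short; evaluating it on a regular grid instead would produce the kind of density map that can be visualized directly.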

In Figure 3(a), we plot all router positions directly on the map, with brighter colors indicating more routers and darker colors indicating fewer, and in Figure 3(b) we generate a two-dimensional matrix sized . Each matrix element gives the number of routers in the corresponding area of about , and three levels of router density are shown in different colors. The virus learns the network from this reduced matrix. It finds a giant connected component and checks the neighboring environment. The right candidate lies in the largest connected component, has an outward-spreading potential, and itself has enough neighbors to attack. In Figure 3(b), the identifier numbers and the locations of some routers are labelled in blue and green, respectively. We choose six different sets of infection seeds, and the relation between the attack rate and the evolution time is shown in Figure 3(c). If the virus is seeded in only a single router among the four routers (11, 60, 814, and 2229), Router shows the highest attack rate within the shortest time. Note that Routers and are in isolated regions, whereas Routers and are in the same largest cluster. This also explains the different attack rates. The curve for Router is flat in the initial stage, because a seed with few neighbors does not accumulate enough zombie hosts (“broilers”). From this plot, we conclude that an appropriate selection of the initial seed can dramatically influence the attack rate and efficiency. Specifically, a randomly selected infective seed will probably fall in an isolated region and have a restricted spreading region. There is also the possibility that the seed, though in a large cluster, has only a few neighbors and spreads slowly in the initial stage.
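The reduction to a count matrix and the search for the largest connected region can be sketched as below. The cell size, the occupancy threshold, the use of 4-neighbor connectivity, and the function names are assumptions made for this illustration; the text does not state how connectivity between matrix cells is defined.

```python
from collections import deque


def density_grid(points, cell, nx, ny):
    """Count routers per square cell; grid[row][col] with ny rows and nx columns."""
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        cx, cy = int(x // cell), int(y // cell)
        if 0 <= cx < nx and 0 <= cy < ny:
            grid[cy][cx] += 1
    return grid


def largest_component(grid, threshold=1):
    """Largest 4-connected set of cells holding at least `threshold` routers."""
    ny, nx = len(grid), len(grid[0])
    seen, best = set(), []
    for r in range(ny):
        for c in range(nx):
            if grid[r][c] < threshold or (r, c) in seen:
                continue
            # Breadth-first search over occupied neighboring cells.
            comp, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                comp.append((cr, cc))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < ny and 0 <= nc < nx
                            and (nr, nc) not in seen
                            and grid[nr][nc] >= threshold):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            if len(comp) > len(best):
                best = comp
    return best
```

A seed candidate would then be drawn from routers inside the returned cells, preferring those with many in-range neighbors.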

3.3. Visualization of the Epidemic Dynamics

To visualize the time evolution of the epidemic behavior, we set the radiating radius to and let the virus pick Routers 60, 814, and 2229 as the initial infection seeds. Figures 4(a)–4(f) present the epidemic behavior, where infected routers are labelled yellow and the infection time and attack rate are displayed at the top left. The overall attack rate is shown in Figure 4(g). We see that % of routers are infected in just hours and % are infected in hours. We also introduce the concept of threads, defined as the number of viruses that are attacking routers concurrently at a given time. The gray curve in Figure 4(g) shows how the number of threads changes with infection time. A sharp exponential increase from to more than in the first half hour is attributed to the dedicated virus attacking routers with no encryption within its range as the first priority and to a good choice of the initial seed (zombie host). This rapid expansion in the beginning lays the foundation for a high infection efficiency.
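The interplay between encryption priority and the number of concurrent threads can be illustrated with a small event-driven simulation. This is a deliberately simplified sketch, not the simulation used in the paper: attack durations in `ATTACK_TIME` are hypothetical, every attack is assumed to succeed (no immune routers), each infected router attacks one neighbor at a time, and a router already under attack is not targeted again.

```python
import heapq

# Hypothetical attack durations per encryption type (arbitrary time units).
ATTACK_TIME = {"none": 1.0, "wep": 5.0, "wpa": 20.0}
PRIORITY = {"none": 0, "wep": 1, "wpa": 2}  # lower value is attacked first


def simulate(adj, enc, seeds, t_max):
    """Return a list of (time, infected_count, active_threads) snapshots.

    adj maps router id -> set of in-range router ids; enc maps id -> type.
    """
    infected = set(seeds)
    targeted = set(seeds)  # routers already infected or under attack
    events = []            # heap of (finish_time, attacker, victim)

    def start_next(attacker, now):
        # Pick the untargeted neighbor with the weakest encryption.
        cands = [v for v in adj[attacker] if v not in targeted]
        if not cands:
            return
        victim = min(cands, key=lambda v: (PRIORITY[enc[v]], v))
        targeted.add(victim)
        heapq.heappush(events, (now + ATTACK_TIME[enc[victim]], attacker, victim))

    for s in sorted(infected):
        start_next(s, 0.0)
    history = [(0.0, len(infected), len(events))]
    while events and events[0][0] <= t_max:
        now, attacker, victim = heapq.heappop(events)
        infected.add(victim)
        start_next(attacker, now)  # the attacker moves on to its next target
        start_next(victim, now)    # the new host opens a thread of its own
        history.append((now, len(infected), len(events)))
    return history
```

In this sketch the number of pending attacks on the heap plays the role of the thread count: it grows quickly when seeds have many unencrypted neighbors, because each newly infected router starts attacking almost immediately.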

4. Discussion

This paper provides insights on the offence and defence of a dedicated virus. For a hacker, developing an efficient algorithm and choosing an appropriate initial seed make the infection efficient. For a user, using a router with a higher encryption level can dramatically reduce the risk. We simulate this process in Figure 5, where and increase with a strengthened password. The upper dashed line shows the current situation ( and ). When and are gradually increased until and , the attack rate is reduced to the lower dashed line. From this set of curves, we see a distinct suppression with every increase in password strength.

5. Conclusion

In conclusion, we collect raw router location data in a region in Beijing City and analyze properties of the Wi-Fi network, such as the degree distribution and the clustering coefficient, which indicate that the real-world Wi-Fi network is vulnerable to infection. The main contribution of this paper is that we endow the virus with additional abilities and present a dedicated worm virus that can pick a more susceptible router region with the KDE algorithm and perform the attacking tasks automatically. This virus can also query the encryption types of routers within its range and attack lower-encryption-level routers first. In this way, the virus gains a rapid expansion in the beginning, and % of routers are infected in only hours. We also introduce the concept of threads to explain why our dedicated virus achieves a higher infection efficiency than an ordinary one.

Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Authors’ Contributions

Yi-Hong Du conceived the work, collected the raw data, and designed the flow diagram. Shi-Hua Liu analyzed the data and performed the simulations. Shi-Hua Liu and Yi-Hong Du both wrote the paper.

Acknowledgments

The research leading to the results reported here was supported by the Scientific Research Project of Zhejiang Provincial Education Department (No. Y201329845).