Abstract

To measure the risk of application-layer collusion privilege escalation attacks in Android apps, this paper proposes a risk measurement method based on feature weight and behavior determination. The analytic hierarchy process (AHP) is used to calculate the weight of each feature in the feature set extracted from an app. App behavior and attack behavior are modeled by process algebra. Weak equivalence and nonequivalence are introduced to determine the behavior of apps, and a measurement function is constructed to calculate the app risk measurement value. In an experiment with three known apps, the measurement values are 0.629, 1, and 0.976. These results are consistent with reality, which verifies the effectiveness and feasibility of the proposed method. Benchmark and test set experiments show that the measurement values of apps that are weakly equivalent to the attack behavior are distributed between 0.0468 and 1 and that this distribution is reasonable, which verifies the accuracy and rationality of the method.

1. Introduction

With the rapid development of the mobile Internet, the Android platform is widely used in all aspects of people’s work, study, and life. Therefore, the Android platform has become a research hotspot [1]. At the same time, more and more personal privacy information is saved in smart devices, the value of devices and information is consequently increasing, and attacks against the Android system and Android apps occur frequently [2]. Reference [3] indicated that Android devices accounted for 47.15% of the monthly infection rate of mobile networks in 2018. In the third quarter of 2019, 365,000 new malicious programs were added to the Android platform, of which privacy theft accounted for 17.9%, and unauthorized actions by apps are increasing [4, 5]. This paper discusses privilege escalation attacks among multiple applications that steal users’ privacy information through gradual refinement of permissions. Such attacks are more covert, difficult to detect by examining a single application, and more threatening [6].

At present, researchers use static, dynamic, and hybrid methods to detect privilege escalation attacks. Protection against privilege escalation attacks relies on improving the Android permission mechanism, access control mechanism, and security framework. However, for the detection of and protection against collusion privilege escalation attacks constituted by multiple applications, there is a lack of analysis and detection of communication between multiple applications and, in particular, a lack of effective detection and protection at the level of a single application. Worse yet, most existing security solutions become costly or rigid in the recent Android dynamic permission environment [7]. Therefore, the risk measurement of privilege escalation attacks becomes a new solution. Although researchers have put forward many classic solutions to the risk measurement of software and apps, the following problems remain: ① risk assessment is conducted from the overall ability or third-party evaluation of the app, but the detailed features of the app are not considered, and ② risk assessment of app behavior is based on only one feature, such as permission or API call, rather than on multiple features.

Therefore, it is especially important to measure the risk of a single application that can take part in a multiapplication collusion privilege escalation attack. Given the limitations of current research on the risk measurement of collusion privilege escalation attacks, this paper proposes a risk measurement method for application-layer collusion privilege escalation attacks using feature weight and behavior determination:
(1) Construct the feature set of application-layer collusion privilege escalation attacks: based on an analysis of the privilege escalation attack model, a feature set is constructed that is composed of six features: application’s dangerous permission, component’s dangerous permission, sensitive API call, sensitive data acquisition, component Intent communication, and sensitive data transition.
(2) Feature weight calculation: AHP is used to calculate the weight of each feature in the feature set, which provides the basic data for app risk measurement.
(3) App behavior modeling and equivalence relationship determination: process algebra is introduced to model the behavior of the app and the attack behavior. The concepts of weak equivalence and nonequivalence are used to judge the behavior of the app.
(4) Calculate the risk value of the app: based on the constructed risk measurement function, risk measurement is carried out for each app that is weakly equivalent to the attack behavior.

The proposed method can effectively measure the degree of risk of apps that are weakly equivalent to the attack model. The main contributions are as follows:
(1) The attack behavior feature set is constructed, and the weight of each feature is given. Static extraction methods are used to extract six features of the app (application’s dangerous permission, component’s dangerous permission, sensitive API call, sensitive data acquisition, component Intent communication, and sensitive data transition), and the weights of the six features are calculated by AHP. The weights are 0.033, 0.313, 0.107, 0.089, 0.150, and 0.308, respectively. This overcomes the limitation of existing methods, which evaluate the overall behavior of the app without considering detailed features or consider only one feature.
(2) Behavior modeling and determination: process algebra is used to model the application behavior and attack behavior, and the concept of weak equivalence is used to decide the equivalence relationship. Our method only measures the risk of apps that are weakly equivalent to the attack behavior, which saves measurement cost.
(3) Feature weight and behavior determination are combined for the first time to measure the risk of collusion privilege escalation attacks, and the risk measurement function is constructed and applied to the case study, the benchmark test, and an actual APK test set. Through experiments and comparison, the accuracy and validity of the method are verified.

2. Related Work

Android malware detection and protection has always been a hot research topic. Because the early signature-based method for detecting malicious code [8] cannot detect unknown malicious behavior, detection is at present mainly based on app behavior [9–13]. These methods mainly detect malware through system calls, control flow, data flow, permission information, sensitive APIs, and other behavior features, and multidimensional feature behavior portraits are used to detect malware. The authors of [14–17] proposed machine learning, deep learning, and unstructured user input techniques for detecting malicious Android apps. Because privilege escalation attacks are characterized by strong concealment and camouflage, detecting the malicious behavior of an app does not detect privilege escalation attacks very well. Therefore, scholars have also proposed many theories and methods for better detection and protection. For kernel-level privilege escalation attacks, scholars have enhanced the access control framework and monitored permission information, among other methods, to prevent attacks [18, 19]. For the application-layer collusion attack discussed in this paper, the authors of [7, 20–23] proposed extending the mandatory access policy, monitoring system calls, and restricting dangerous interapplication communication to prevent the attacks. Although researchers have achieved good results in the detection of and protection against malware and privilege escalation attacks, collusion privilege escalation attacks involve collusion among multiple applications, and malicious behavior detection on a single application cannot detect them very well. Therefore, risk measurement becomes a new way to address collusion privilege escalation attacks.

Meanwhile, scholars have achieved many research results in the measurement of software security and credibility. Software credibility analysis, modeling, and measurement have been carried out from the perspectives of software requirements [24], behavior [25, 26], capability [27], and the software development process [28]. Multiattribute decision making is widely used in engineering, technology, and economics and is also used in software credibility analysis [29, 30]. Drawing on the research results of software credibility measurement, research on the risk measurement and credibility of Android apps has made remarkable progress. The study in [31] proposed a risk assessment method based on permission and functional relationship analysis that standardizes Android developers’ permission requests. It can recommend reasonable permission configurations for users in real time, reducing the risk of permission abuse. However, it only considers the risk of using permissions in function implementation. Reference [32] defined the criticality of Android permissions according to the abuse of permissions by malware and put forward standards for measuring application security risks. Although this standard has advantages, it is too simple to measure the risk of collusion privilege escalation attacks because it is primarily concerned with permission risk. Xu et al. [33] proposed an Android application security credit index measurement method based on AHP combined with the certification strength of Android software and violation records in the third-party application market. This method considers the violation records of the third party, so its credibility evaluation is more objective. Different from the measurement of certification strength and violation records in reference [33], Li et al. [34] analyzed the behavior features of the app itself and proposed a sandbox-based risk assessment method that dynamically monitors and records the app’s activity on Android. This method, however, only focuses on the API calls of the app. The study in [35] proposed a framework layer integrity measurement method for the representation and usage of the framework layer of the Android system. This method can effectively guarantee the integrity of the framework layer code and the runtime integrity of the Android system. However, because the apps that can constitute a collusion privilege escalation attack all appear secure when measured alone, the method is not effective for collusion privilege escalation attacks.

Existing risk measurement of software and apps focuses only on macro-level violations or on a single feature, so it does not work well for application-layer collusion privilege escalation attacks. The main problems are as follows:
(1) The multiple features of the attack are not fully considered. Existing app risk measurement methods mainly focus on the overall behavior or a single feature of the app (such as permission or API call). The collusion attack discussed here is covert, so many features must be considered when evaluating the risk of a collusion privilege escalation attack.
(2) There are few measurement methods for collusion privilege escalation attacks. Because every app that takes part in a collusion privilege escalation attack appears normal and safe on its own, existing measurement methods fail to measure those attacks, so it is necessary to build a dedicated measurement method for collusion privilege escalation attacks.

Therefore, it is particularly important to measure the risk of a single application that takes part in a collusion privilege escalation attack. The semantics of process algebra is used to describe and determine the behavior of the app [24], AHP is adopted as a decision-making method that combines qualitative and quantitative analysis [36], and a risk measurement method for collusion privilege escalation attacks based on feature weight and behavior determination is then constructed.

3. Attack Model Analysis and Feature Extraction

3.1. Attack Model Analysis

By exploiting weaknesses in the permission mechanism and interapplication communication of the Android system architecture, multiple applications can collude to attack. This problem of the Android platform has been described by the model of multiapplication collusion in the application layer [37].

In the multiapplication collusion privilege escalation attack, an application component with less permission can access the components that have more permissions than it by using interapplication communication and permission refinement. In this way, the application can obtain permissions that it does not have in order to steal private data. In order to study privilege escalation attacks, three normally independent apps have been designed to construct the case of privilege escalation attacks [38]. The key codes for the three applications are presented in Table 1.

The diagram of the privilege escalation attack case, obtained from Table 1, is shown in Figure 1.

As can be seen from Table 1 and Figure 1, a multiapplication collusion privilege escalation attack is composed of six typical features: dangerous permission of the app, dangerous permissions of components, sensitive API calls, sensitive data acquisition, component Intent communication, and sensitive data transition.

3.2. Construction and Extraction of Feature Set

The attack behavior feature set AF is composed of six typical features of a privilege escalation attack.

AF = {AF1, AF2, AF3, AF4, AF5, AF6} where AF1 denotes a dangerous permission of the app; AF2 denotes a dangerous permission of the component; AF3 denotes a sensitive API call; AF4 denotes a sensitive data flow acquisition; AF5 denotes a component Intent communication; and AF6 denotes a sensitive data transition.

Android APKs were decompiled by Apktool to smali files. Smali files can be analyzed to obtain information, such as class, inheritance, interface, and function call. Meanwhile, the feature set can be extracted with the information description file AndroidManifest.xml of the app.

3.2.1. Dangerous Permission of the Application

Extract the permission according to the tag <uses-permission> in the AndroidManifest.xml. Then, summarize the dangerous permissions of the app according to the dangerous permission list provided by Google.

3.2.2. Dangerous Permission of the Component

Extract the registered component information and each component’s <intent-filter> information according to the component tags (<activity>, etc.) in the AndroidManifest.xml. Then, extract the permission applied by each component according to its android:permission="XXXX" attribute, and finally, summarize the dangerous permissions applied by the components according to the dangerous permission list.
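The two permission-extraction steps of Sections 3.2.1 and 3.2.2 can be illustrated with a minimal Python sketch. It assumes the manifest has already been decoded to plain XML (for example, by Apktool); the `DANGEROUS` set is only an illustrative subset of the dangerous permission list, and the function name, sample manifest, and package name are hypothetical.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Illustrative subset of dangerous permissions (assumption); a real tool
# would load the complete list from the Android documentation.
DANGEROUS = {
    "android.permission.READ_CONTACTS",
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_PHONE_STATE",
}

COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def extract_permissions(manifest_xml):
    """Return (app-level, per-component) dangerous permissions from decoded manifest text."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"
    perm_attr = f"{{{ANDROID_NS}}}permission"

    # AF1: dangerous permissions requested through <uses-permission>
    app_perms = sorted(
        p.get(name_attr)
        for p in root.iter("uses-permission")
        if p.get(name_attr) in DANGEROUS
    )
    # AF2: dangerous permissions declared on individual components
    comp_perms = {}
    for tag in COMPONENT_TAGS:
        for comp in root.iter(tag):
            perm = comp.get(perm_attr)
            if perm in DANGEROUS:
                comp_perms[comp.get(name_attr)] = perm
    return app_perms, comp_perms


# Hypothetical manifest used only to exercise the function.
SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
  <uses-permission android:name="android.permission.READ_CONTACTS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <service android:name=".SmsService" android:permission="android.permission.SEND_SMS"/>
    <activity android:name=".MainActivity"/>
  </application>
</manifest>"""

app_perms, comp_perms = extract_permissions(SAMPLE)
print(app_perms)
print(comp_perms)
```

In this sketch, a nonempty first result would mark AF1 as present and a nonempty second result would mark AF2 as present.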

3.2.3. Sensitive API Call

Extract system call sequence with strace of Android SDK. Then, combine it with conclusions from reference [39] to obtain sensitive API calls.

3.2.4. Sensitive Data Flow Acquisition

Obtain sensitive information source <source> by using FlowDroid [40], app resource files, and smali files.

3.2.5. Component Intent Communication

Extract the action name of components according to <intent-filter> in 3.2.2, and obtain component Intent communication information according to the function calls (e.g., getExtras and putExtras) in smali file.
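As a rough illustration of the smali scan described above, the following Python sketch counts invocations of `android.content.Intent` methods in the text of one smali file. The regular expression, function name, and sample smali lines are assumptions made for illustration and are not the authors’ extraction tool.

```python
import re

# Matches smali invocations of android.content.Intent methods, e.g.
#   invoke-virtual {v0, v1}, Landroid/content/Intent;->putExtras(...)...
INTENT_CALL = re.compile(
    r"invoke-\S+\s+\{[^}]*\},\s*Landroid/content/Intent;->(\w+)\("
)


def intent_calls(smali_text):
    """Count Intent method invocations found in the text of one smali file."""
    counts = {}
    for name in INTENT_CALL.findall(smali_text):
        counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical smali fragment: two Intent calls and one unrelated call.
SAMPLE_SMALI = """
    invoke-virtual {v0, v1}, Landroid/content/Intent;->putExtras(Landroid/os/Bundle;)Landroid/content/Intent;
    invoke-virtual {p1}, Landroid/content/Intent;->getExtras()Landroid/os/Bundle;
    invoke-static {}, Landroid/telephony/SmsManager;->getDefault()Landroid/telephony/SmsManager;
"""

print(intent_calls(SAMPLE_SMALI))
```

A component whose smali code contains such calls, together with a matching <intent-filter> action, would be a candidate for feature AF5.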

3.2.6. Sensitive Data Transition

Extract the flow path from source to sink by FlowDroid [40], and obtain sensitive data transition. The source acquires sensitive information while the sink sends sensitive information.

4. Feature Weight Calculation Based on AHP

AHP [41] can analyze the weight of the factors related to decision-making and can effectively distinguish the influence degree of different features in the process of constructing the pairwise comparison matrix A. AHP can be well applied to the analysis of the features of collusion privilege escalation attacks with complex attack features and strong concealment. At the same time, reference [33] shows that AHP can effectively analyze the influencing factors.

4.1. AHP Model Introduction

Two important steps in this method are the construction of the pairwise comparison matrix and consistency test.

4.1.1. Construction of Pairwise Comparison Matrix

(1) Suppose there are $n$ factors at a certain level, $X = \{x_1, x_2, \ldots, x_n\}$. In order to compare the influence of the factors in $X$ on a certain objective, the degrees of influence of the $n$ factors on that objective are ranked. The quantified relative weight $a_{ij}$ is introduced to describe the comparison result of factors $i$ and $j$ for the objective. Then, the pairwise comparison matrix can be described as follows:

$$A = (a_{ij})_{n \times n} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}, \quad a_{ij} > 0, \quad a_{ji} = \frac{1}{a_{ij}}, \quad a_{ii} = 1, \tag{1}$$

where $a_{ij}$ takes values from 1 to 9 and the corresponding reciprocals. ① $a_{ij} = 1$ means that factor $i$ is equal to factor $j$ in influence; ② $a_{ij} = 3$ means that factor $i$ is slightly stronger than factor $j$ in influence; ③ $a_{ij} = 5$ means that factor $i$ is stronger than factor $j$ in influence; ④ $a_{ij} = 7$ means that factor $i$ is obviously stronger than factor $j$ in influence; ⑤ $a_{ij} = 9$ means that factor $i$ is absolutely stronger than factor $j$ in influence; and ⑥ $a_{ij} = 2k$, $k = 1, 2, 3, 4$, means the influence of factor $i$ relative to factor $j$ is between $a_{ij} = 2k - 1$ and $a_{ij} = 2k + 1$.

(2) The maximum eigenvalue and the corresponding eigenvector can be obtained by column-normalizing the pairwise comparison matrix. The weight of the factors can be determined after the consistency test.
4.1.2. Consistency Test

The consistency index, the random consistency index, and the consistency ratio are used to test the matrix A.

(1) Consistency index (CI)

$$CI = \frac{\lambda_{\max} - n}{n - 1}, \tag{2}$$

where $n$ is the sum of the diagonal elements of the matrix A (which equals its order, because every diagonal element is 1) and $\lambda_{\max}$ is the maximum eigenvalue.

(2) Random consistency index (RI)

The values of RI are presented in Table 2.

(3) Consistency ratio (CR)

$$CR = \frac{CI}{RI}. \tag{3}$$

If $CR < 0.1$, the degree of inconsistency of the pairwise comparison matrix A is within the acceptable range, and its normalized eigenvector represents the weight vector. If $CR \geq 0.1$, the pairwise comparison matrix A needs to be reconstructed.
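The AHP steps of this section (column normalization, eigenvalue estimation, and the consistency test) can be sketched in plain Python as follows. The sketch uses the common column-normalization approximation of the principal eigenvector and the standard definitions CI = (λmax − n)/(n − 1) and CR = CI/RI. The 3 × 3 matrix is a hypothetical example, not the paper’s matrix A, and the RI values are the commonly cited ones (only RI = 1.24 for n = 6 is confirmed by this paper).

```python
# Commonly cited random consistency index values for orders 1..9 (assumption,
# except n = 6 -> 1.24, which this paper states).
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(A):
    """Return (weights, lambda_max, CI, CR) for a pairwise comparison matrix A."""
    n = len(A)
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
    # Normalize each column, then average across each row to estimate the weights.
    w = [sum(A[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]
    # Estimate the maximum eigenvalue as the mean of (A w)_i / w_i.
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam = sum(Aw[i] / w[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1)
    cr = ci / RI[n] if RI[n] > 0 else 0.0
    return w, lam, ci, cr


# Hypothetical pairwise comparison matrix on the 1-9 scale.
EXAMPLE = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

w, lam, ci, cr = ahp_weights(EXAMPLE)
print([round(x, 3) for x in w], round(lam, 3), round(ci, 3), round(cr, 3))
print("consistent" if cr < 0.1 else "reconstruct the matrix")
```

The row-average approximation is used here because it needs no linear algebra library; an exact eigenvector computation could be substituted without changing the consistency test.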

4.2. Feature Weight Calculation
4.2.1. Construction of Pairwise Comparison Matrix

According to the method introduced in Section 4.1, the scale comparisons of features in AF are presented in Table 3.

According to Table 3 and formula (1), the pairwise comparison matrix can be created as follows:

4.2.2. Consistency Verification and Weight Calculation

By normalizing the column vectors of formula (4), the following results were obtained.
(1) Maximum eigenvalue
(2) Maximum eigenvector
(3) Consistency verification

According to formulas (2) and (5) with n = 6, CI is calculated as follows:

Then, from Table 2, it can be seen that RI = 1.24. According to formulas (3) and (7), CR is calculated as follows:

According to formula (8), it is verified that A is consistent.
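As a short worked check, assuming the standard AHP definitions $CI = (\lambda_{\max} - n)/(n - 1)$ and $CR = CI/RI$, the acceptance condition for the six-feature matrix ($n = 6$, $RI = 1.24$) can be restated as a bound on the maximum eigenvalue:

```latex
\begin{align*}
CR < 0.1
  &\iff CI < 0.1 \times 1.24 = 0.124 \\
  &\iff \frac{\lambda_{\max} - 6}{6 - 1} < 0.124 \\
  &\iff \lambda_{\max} < 6 + 5 \times 0.124 = 6.62 .
\end{align*}
```

Under these definitions, any $6 \times 6$ pairwise comparison matrix whose maximum eigenvalue is below 6.62 passes the consistency test.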

Therefore, weight vector ω is obtained according to formula (6), and the weight of each feature in the feature set is presented in Table 4.

5. Behavior Modeling Based on Process Algebra

Process algebra [42] can effectively describe the interaction between two systems and determine the equivalence between them. In process algebra, the behavior of the system can be represented by actions and events. Therefore, it can effectively describe the permissions and message communication of the Android app. The Android app is composed of components; the behavior of components constitutes the behavior of the app. Therefore, according to the features of components, the behavior model of components and the behavior of the app can be built. At the same time, references [24, 25] show that process algebra can effectively determine software behavior.

5.1. Application Behavior Model

The Android app is composed of a series of components that help users implement functions and services, such as interface building, data storage, and background running. Android provides four major component types, namely, Activity, Service, BroadcastReceiver, and ContentProvider. Activities present functions and the user interface. Services provide background running services. BroadcastReceiver is used to receive broadcasts, and ContentProvider supports data storage and reading. Therefore, the behavior of the application consists of the behavior of all of the above components. The syntax and semantic specifications of process algebra based on the behavior features of components are defined as follows:

In the formula,
(1) means that each component has a unique process identifier id that can be omitted according to actual condition.
(2) is a summation, where represents the number of features in AF, represents the number of permissions, and represents the permission for the app. indicates that Featurej is protected by Pi, and Featurej must occur before Pi can start activity.
(3) means the component has features.
(4) , where means the action of message sending and means the action of message receiving. represent n sensitive data.
(5) means that the application behavior represented by Feature is under P protection.
(6) means the can be reused.

5.2. Privilege Escalation Attack Model

This paper focuses on the multiapplication collusion privilege escalation attack. Dangerous behavior occurs when dangerous components apply for dangerous permissions. Therefore, the multiapplication collusion privilege escalation attack must be based on a certain dangerous permission and be completed by a component in the application. According to formula (9), the component attack behavior under permission P1 can be defined as follows:

Here,
(1) denotes the attack behavior set under P1 and represents the number of features in AF, .
(2) means that the component has multiple features.
(3) , where denotes the action of message sending of the component and denotes the action of message reception of the component. represent sensitive data.
(4) means that the behavior represented by AF in the application component is protected by permission P1.

6. Risk Measurement Method

According to the equivalence relationship of states in process algebra, two behavioral equivalence relationships are defined: behavior weak equivalence relationship and behavior nonequivalence relationship.

6.1. Behavior Transition

Let $A$ be a set of actions and $S$ a set of states, and let there exist a subset $T$ of $S \times A \times S$ called the transitions. That is, if $(P, a, Q) \in T$, then it is called a transition from $P$ to $Q$, written as $P \xrightarrow{a} Q$.

6.2. Behavior Weak Simulation

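Suppose $S$ is a binary relation over the state space of a component and $a$ represents any action by a feature of a component. $S$ is a weak simulation if and only if, whenever $P\,S\,Q$: if $P \xrightarrow{\tau} P'$, then there exists $Q'$ such that $Q \Rightarrow Q'$ and $P'\,S\,Q'$; if $P \xrightarrow{a} P'$, then there exists $Q'$ such that $Q \stackrel{a}{\Rightarrow} Q'$ and $P'\,S\,Q'$. Here $\tau$ denotes an internal transition, and $P \Rightarrow Q$ means that state $P$ has passed through a finite number of state transitions (possibly none) to reach state $Q$.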

Definition 1. (Risk behaviors’ weak equivalence). The binary relation $S$ over the process space is a weak bisimulation if and only if $S$ and its converse are both weak simulations. $P$ and $Q$ are said to be weakly equivalent (weakly bisimilar) if a weak bisimulation $S$ exists such that $P\,S\,Q$, written as $P \approx Q$. This means that there is a certain equivalence between the component behavior in the application and the attack behavior. Therefore, it is necessary to establish a measurement function for risk assessment.

Definition 2. (Risk behaviors’ nonequivalence). If, for some transition of $P$, no matching transition of $Q$ can be found, that is, there is no strong (weak) simulation $R$ relating them, then $P$ and $Q$ are said to be not equivalent, written as $P \not\approx Q$. This means that there is no equivalence between the component behavior of the application and the attack behavior. Therefore, its risk measurement value is 0.
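To make Definitions 1 and 2 concrete, the following Python sketch checks weak equivalence between two small finite transition systems by computing the largest weak bisimulation with a naive fixed-point iteration. It is an illustration only and not the Mobility Workbench procedure used later in the paper; the label `"tau"` for internal steps, the action names `send` and `recv`, and the example systems are assumptions.

```python
TAU = "tau"  # label assumed here for the internal (silent) action


def tau_closure(trans, s):
    """States reachable from s by zero or more tau steps."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        for (src, act, dst) in trans:
            if src == cur and act == TAU and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def weak_succ(trans, s, act):
    """States reachable from s by tau* act tau* (only tau* when act is tau)."""
    before = tau_closure(trans, s)
    if act == TAU:
        return before
    after = set()
    for (src, a, dst) in trans:
        if src in before and a == act:
            after |= tau_closure(trans, dst)
    return after


def weakly_equivalent(trans_p, p0, trans_q, q0):
    """True if p0 and q0 are related by some weak bisimulation."""
    states_p = {s for (a, _, b) in trans_p for s in (a, b)} | {p0}
    states_q = {s for (a, _, b) in trans_q for s in (a, b)} | {q0}
    # Start from all pairs and discard pairs that violate either simulation clause.
    rel = {(p, q) for p in states_p for q in states_q}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            forward = all(
                any((p2, q2) in rel for q2 in weak_succ(trans_q, q, a))
                for (src, a, p2) in trans_p if src == p
            )
            backward = all(
                any((p2, q2) in rel for p2 in weak_succ(trans_p, p, a))
                for (src, a, q2) in trans_q if src == q
            )
            if not (forward and backward):
                rel.discard((p, q))
                changed = True
    return (p0, q0) in rel


# Hypothetical component behavior: an internal step, then a send action.
BEHAVIOR = {("p0", TAU, "p1"), ("p1", "send", "p2")}
# Hypothetical attack behavior: a send action only.
ATTACK = {("q0", "send", "q1")}
# Hypothetical changed behavior that receives instead of sending.
CHANGED = {("p0", "recv", "p1")}

print(weakly_equivalent(BEHAVIOR, "p0", ATTACK, "q0"))
print(weakly_equivalent(CHANGED, "p0", ATTACK, "q0"))
```

In the first pair the internal step is absorbed by the weak transition, so the two systems are weakly equivalent; in the second pair the `recv` action has no match, which corresponds to the nonequivalent case whose measurement value is 0.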

Definition 3. (Risk behaviors’ measurement function).

Here,
(1) $V$ indicates whether each feature exists in the feature set. The feature corresponding to $v_i$ is AFi, $1 \leq i \leq 6$. $v_i = 1$ indicates that the i-th feature exists, and $v_i = 0$ indicates that the i-th feature does not exist.
(2) $W$ denotes the weights of the features in the feature set, $W = (w_1, \ldots, w_6)$, and $\sum_{i=1}^{6} w_i = 1$.

Algorithm 1: app behavior determination and risk measurement. AFi denotes the feature set of the i-th app, $1 \leq i \leq n$, n denotes the number of apps in the test set, and P1 denotes a dangerous permission.

Input: AFi, P1
Output: measurementValue
Assumption: BM points to comBehaviorModel; AM points to comAttackModel
Initialization: BM ⟶ comBehaviorModel1, AM ⟶ comAttackModel1
For each component that has AFi
    Construct comBehaviorModeli, comAttackModeli
    If BM ≈ AM Then
        call f (V, W)
        print measurementValue = f (V, W)
        BM ⟶ Next, AM ⟶ Next
    ElseIf BM ≉ AM Then
        print measurementValue = 0
        BM ⟶ Next, AM ⟶ Next
    EndIf
EndFor
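The control flow of Algorithm 1 can be sketched in Python as below. The sketch takes the equivalence check and the measurement function as parameters. Because the paper’s measurement function is not reproduced in this text, the default `weighted_sum` is only a stand-in (a plain weighted sum of present features) and is not expected to reproduce the paper’s measurement values. The component names, feature vectors, and the string-equality oracle are hypothetical; the weights are those reported for the six features.

```python
# Feature weights for AF1..AF6 as reported in the paper.
WEIGHTS = [0.033, 0.313, 0.107, 0.089, 0.150, 0.308]


def weighted_sum(v, w):
    """Stand-in measurement function (assumption): NOT the paper's function f."""
    return sum(vi * wi for vi, wi in zip(v, w))


def measure_components(components, weights, is_weakly_equivalent, f=weighted_sum):
    """Apply Algorithm 1's branching to every component.

    components: list of dicts with keys 'name', 'features' (0/1 vector),
    'behavior' (behavior model), and 'attack' (attack model).
    """
    results = {}
    for comp in components:
        if is_weakly_equivalent(comp["behavior"], comp["attack"]):
            # Weakly equivalent to the attack behavior: compute the risk value.
            results[comp["name"]] = f(comp["features"], weights)
        else:
            # Nonequivalent: the measurement value is 0 by Definition 2.
            results[comp["name"]] = 0.0
    return results


# Hypothetical components; models are plain strings and the oracle is string
# equality, standing in for a real weak-equivalence check.
COMPONENTS = [
    {"name": "ComA", "features": [1, 1, 1, 1, 1, 1], "behavior": "m1", "attack": "m1"},
    {"name": "ComB", "features": [1, 0, 1, 0, 1, 0], "behavior": "m2", "attack": "m3"},
]

results = measure_components(COMPONENTS, WEIGHTS, lambda b, a: b == a)
print(results)
```

Passing the function as a parameter keeps the branching logic of the algorithm separate from the choice of measurement function, so the paper’s function can be substituted for the stand-in.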

7. Experiment for Case

7.1. Feature Set Extraction

According to the attack case given in Section 3.1, the AF of App1, App2, and App3 can be obtained by using the feature extraction method from Section 3.2, as presented in Table 5. T means AFi exists, F means AFi does not exist, and $1 \leq i \leq 6$.

7.2. Risk Behavior Modeling and Risk Measurement
7.2.1. Behavior Modeling

According to formula (9), the behavior models of components Com1, Com2, and Com3 can be obtained.
(1) The behavior model of the component Com1 of App1 is presented in formula (13), where data1 denotes sensitive data:
(2) The behavior model of the component Com2 of App2 is presented in formula (14), where data1 denotes sensitive data and P1 denotes that the application has dangerous permissions:
(3) The behavior model of the component Com3 of App3 is presented in formula (15), where data1 denotes sensitive data, P1 denotes that the application has dangerous permissions, and ComP1 denotes that the component has dangerous permissions:

7.2.2. Attack Modeling

The attack behavior models of components Com1, Com2, and Com3 can be obtained using formula (10).
(1) The attack behavior model of the component Com1 of App1 is presented in the following formula:
(2) The attack behavior model of the component Com2 of App2 is presented in the following formula:
(3) The attack behavior model of the component Com3 of App3 is presented in the following formula:

7.2.3. Equivalence Relationship Verification

The behavior of the app can be composed of its feature actions, feature states, and state transitions. For example, suppose the app has the behavior of message sending. The feature is the message sending permission of the app. The actions of the feature are waiting to send a message and sending a message. The states are “SMS not sent” and “SMS sent.” The transition is the confirmation to send. Based on the definition of behavior transition, it can be seen that ① Psms represents the message sending permission, ② $a_1$ waits to send a message under Psms, ③ $\bar{a}_1$ sends a message under Psms, ④ $(a_1, P_{sms})$ represents the action of waiting to send a message, ⑤ $(\bar{a}_1, P_{sms})$ represents the action of sending a message, and ⑥ $\lambda$ represents the state transition. Then,

$$(a_1, P_{sms}) \xrightarrow{\lambda} (\bar{a}_1, P_{sms}). \tag{19}$$

According to the definition of weak simulation, the state set of the component is composed of state sets similar to the left and right sides of formula (19). Let P and P′ be behavior state sets and Q and Q′ be attack behavior state sets; the attack behavior state set of a component must be included in its behavior state set. Based on the above theory, the weak equivalence between formulas (14) and (17) is verified by derivation.

For the convenience of description, Tables 6 and 7 are used to simplify the states represented by related features.

Construct the state transition diagram of formulas (14) and (17), as shown in Figures 2(a) and 2(b), where a, b, c, and d represent migrations.

Suppose that . Then,
(1) Verification that p0 weakly simulates q0: , each migration of the first element q can be matched by a migration of the second element p (or by a series of migrations, or even by no migration). For example, for , q1 has , so , using matching. Therefore, S is a weak simulation; that is, p0 weakly simulates q0.
(2) In the same manner, the weak simulation relationship of each state in the two state graphs can be verified.
(3) According to the concept of risk behaviors’ weak equivalence (Definition 1), S−1 can be verified to be a weak simulation in the same way as in (1).
(4) Therefore, the weak equivalence between the states in Figures 2(a) and 2(b) can be verified. Then, the weak equivalence between formulas (14) and (17) is verified.

To make the verification more automated, the Mobility Workbench (MWB) was chosen as the verification tool. After standardization according to the MWB grammar [43], the above application model was transformed into the MWB language. The behavior model and the attack model of App3Com3 (represented in MWB by AC) were proved to have a weak equivalence relationship. Figure 3 shows the verification process of weak equivalence between the behavior and attack behavior of App3Com3 implemented by MWB. Similarly, the following two weak equivalence relationships were also verified: (1) App1Com1 and App1Com1Attack and (2) App2Com2 and App2Com2Attack.

7.2.4. Extended Attack Modeling

In order to illustrate the presented method, it is assumed that there is no sensitive data flow for Com1 of App1.
(1) The behavior model of App1Com1 with the changed feature is presented in the following formula:
(2) The nonequivalence relationship is verified. According to the concept of risk behaviors’ nonequivalence (Definition 2), Figure 4 shows the verification process of nonequivalence between the changed behavior (represented by ACC) and the attack behavior (represented by ACA) of App1Com1 implemented by MWB.

Therefore, according to Definition 3, the App1 measurement value of the changed feature is 0.

7.3. Risk Measurement for Case

It can be seen from the result in Section 7.2 that App1, App2, and App3 are weakly equivalent to the attack model. Therefore, according to the measurement method, formulas (11) and (12), and the feature set of Table 5, it is possible to obtain the feature existence matrix, which records the existence of features in the feature sets of Com1, Com2, and Com3.

The result of multiplying formula (21) by weight vector ω is as follows:

Based on formulas (11) and (22), the measured results of App1, App2, and App3 are shown in Table 8.

Therefore, we have the following:
① The measured results of App1, App2, and App3 are 0.629, 1, and 0.976.
② App2 and App3 are more dangerous than App1. The measurement value is consistent with the actual risk of the app.
③ App1 is used to collect sensitive information in a collusion privilege escalation attack. As a single app, its risk is indeed lower than that of App2 and App3.

8. Method Evaluation and Validity Analysis

8.1. Method Evaluation

The proposed method can be divided into three steps: construction and extraction of the behavior feature set together with feature weight calculation; modeling and determination of app dangerous behavior; and risk measurement. Among them, feature extraction is the basis of the method, and its key steps are sensitive data flow detection and sensitive data transition detection. The time complexity of the dangerous behavior determination algorithm is O(n). The calculation of the measurement function mainly involves multiplication, addition, and logarithm. For each app that is weakly equivalent to the attack behavior, the computational cost of risk measurement mainly comes from matrix operations. Fifty-three Android APKs were tested, and the time and space cost analysis data are shown in Table 9. All experiments were implemented on a PC platform with 8 GB of memory and an Intel(R) Core(TM) i5-2520M CPU.

8.2. Benchmark Test and Method Comparison

DroidBench [44] is a set of open source real-life Android applications to be used as a testing ground for static and dynamic security and measurement methods. FieldAndObjectSensitivity, InterAppCommunication, and InterComponentCommunication are test sets in DroidBench. FieldAndObjectSensitivity covers sensitive data leakage, InterAppCommunication covers information leakage in the communication between apps, and InterComponentCommunication covers information leakage between components. The features of collusion privilege escalation attacks that may exist in the three test sets are shown in Table 10, where T represents the existence of the feature and F represents its absence.

Table 10 shows that the test sets cover all attack features in AF. Different test objects in the test sets have different attack features. For example, some of the test objects in InterComponentCommunication do not have the AF4 and AF5 features. The method in Sections 3–6 and formula (11) are used to verify the behavior equivalence and measure the risk. Table 11 shows the AF sets of each test object obtained by this method, the equivalence relationship, , and the measurement results, where T represents the existence of a feature and F represents its absence. At the same time, the method proposed in reference [45] is used to measure the test objects, and its measurement results are also shown in Table 11.

The measurement values of the 22 benchmark objects show the following: (1) The measurement values lie between 0.8875 and 1, their distribution is reasonable, and they match the actual risk degree of the apps. (2) The three test sets have information leakage problems involving sensitive data, communication between apps, and communication between components. The measurement values show that apps with information leakage risk pose a greater risk of privilege escalation attacks. It can be verified that it is necessary to calculate the weight of different features by the relationship of and measurement value.

Comparison with the measurement results in reference [45] shows the following: (1) The distribution of measurement values in our method is more reasonable and better reflects the actual risk degree of apps. The measurement results of reference [45] are concentrated between 0.9975 and 0.9999, a range too narrow to distinguish the degree of danger among these high-risk apps. (2) The measurement results of our method are consistent with the attack behavior features. In our method, sensitive data transition is added to the attack behavior feature set as an attack behavior feature, which ensures that apps with the same attack behavior features receive the same risk measurement value. In reference [45], however, sensitive data transition was not included in the attack behavior feature set but was taken as a parameter of the measurement function, so apps with the same attack behavior features can receive different measurement results, for example, objectsensitivity 1 and objectsensitivity 2 in Table 11. (3) Both methods ensure the effective measurement of high-risk apps. The measurement values of the 22 benchmark test objects show that both methods effectively reflect the risk degree when measuring high-risk apps.
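The difference in concentration can be made concrete by comparing the widths of the two reported ranges. The following Python sketch uses only the endpoints quoted above; the helper name `spread` is introduced here for illustration and is not part of either method.

```python
def spread(low, high):
    """Width of the interval covered by a set of measurement values."""
    return round(high - low, 4)


# Endpoints reported for the 22 DroidBench benchmark objects.
proposed_spread = spread(0.8875, 1.0)      # proposed method
reference_spread = spread(0.9975, 0.9999)  # method of reference [45]

print(proposed_spread, reference_spread)
```

The proposed method spreads the benchmark objects over an interval of width 0.1125, whereas the values from reference [45] fall within an interval of width 0.0024, which is why the latter offers little room to rank these high-risk apps against each other.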

8.3. Test Set Composition

In order to ensure the diversity and comprehensiveness of the test set, 50 apps from the Android app market and 3 apps developed by the research team were selected for testing the feasibility and effectiveness of the proposed method [46]. There are 21 categories of samples, and each category contains 1–3 typical applications. The sample composition is presented in Table 12.

Because apps from the same company or developer share development methods and coding habits, they tend to share the same defects, which creates convenient conditions for attack behavior among those apps. Therefore, the test set contains 13 apps developed by the same company or developer, as shown in Table 13.

8.4. Validity Analysis

The method presented in Section 3.2 was applied to extract the feature sets of the APKs in the test set. The behavioral modeling method presented in Section 5 was then used to obtain the equivalence relationship of each application. The feature set descriptions and results for some of the APKs in the test set are provided in Table 14.

According to the measurement algorithm and formula (11), the measurement values of the APKs in Table 14 can be obtained. They are listed in descending order in Table 15.

The equivalence relationship of the application in the test set can be obtained by employing the behavior modeling method presented in Section 5. Its distribution is presented in Table 16.

The results of the equivalence relationship are consistent with the conclusions of [20, 47].

Risk measurement is carried out with the proposed method for the APKs in Table 16 that have a weak equivalence relationship. Figure 5(a) shows the relationship between the measurement value and the number of features in AF, and Figure 5(b) shows the measurement values of the 39 APKs.

The method proposed in reference [45] is also used to measure those 39 APKs, and the measurement results are shown in Figure 6. The measurement values are mainly concentrated in the range of 0.8647–1. Compared with the proposed method, this distribution of measurement values is too concentrated.

Among the 13 APKs listed in Table 13, 10 are weakly equivalent to the attack behavior. Among them, the measurement value of 3 APKs is 1 and that of 1 APK is 0.967. The detailed results are shown in Figure 7.

Figures 5–7 show the following: (1) The range of measurement values is reasonable and conforms to the risk degree of the APKs. The comparison between Figures 5(b) and 6 shows that the measurement values of APKs weakly equivalent to the attack behavior range from 0.0468 to 1 with the proposed method, whereas they are distributed between 0.8647 and 1 with the method in reference [45]. Therefore, the measurement value distribution of the proposed method is more reasonable. It also shows that weakly equivalent APKs can have either high or low risk measurement values, which is consistent with the actual risk of the APKs. (2) Considering feature weight is important. Figure 5(a) shows that, in general, the more features in AF, the higher the risk value of the APK. The exceptions in this figure also demonstrate the importance of feature weight: some APKs with many features have lower measurement values than APKs with fewer features. Therefore, for the features in AF, not only their number but also the weight of each feature should be considered. Reference [45], however, does not consider feature weight and only considers the number of features, and the measurement values in Figure 6 show that its differentiation of risk degree is not obvious. (3) Apps developed by the same company or developer are more threatening. According to Figure 7 and the conclusion of reference [45], 76.9% of the APKs developed by the same company or developer (10 of the 13 are weakly equivalent) carry a collusion privilege escalation attack risk. They account for 25.6% of the risky APKs in the test set (10 of the 39 weakly equivalent APKs), and their measurement values are high. Therefore, companies and developers that publish multiple apps need to strengthen code management and security education to prevent collusion privilege escalation attacks caused by development loopholes or intentional acts. (4) The method of combining feature weight and behavior determination is more accurate and reasonable. Figures 5 and 7, compared with Figure 6, show that the measurement value is reasonably distributed between 0 and 1 and is higher for APKs with more features and higher weights.
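The two percentages in item (3) follow directly from the counts reported in this section. The short Python check below reproduces them; the variable names are introduced here for illustration only.

```python
# Counts taken from the text of Section 8.4.
same_developer_apks = 13       # APKs from the same company or developer (Table 13)
same_developer_weak = 10       # of those, weakly equivalent to the attack behavior
weakly_equivalent_total = 39   # all weakly equivalent APKs in the test set

# Share of same-developer APKs that carry collusion attack risk.
risk_share_same_developer = round(100 * same_developer_weak / same_developer_apks, 1)

# Share of all risky (weakly equivalent) APKs that come from the same developer.
share_of_risky_apks = round(100 * same_developer_weak / weakly_equivalent_total, 1)

print(risk_share_same_developer, share_of_risky_apks)
```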

9. Conclusion

In this paper, a risk measurement method based on feature weight and behavior determination is proposed. The method is verified by experiments on an attack case, a benchmark test set, and an actual APK test set. The main lessons learnt from the proposed method are as follows: (1) AHP is used to calculate the weight of the attack features. This effectively overcomes the defect of ignoring the weight of attack features, which leads to the concentration of measurement results. (2) The method uses process algebra modeling and behavior determination. This effectively completes the preprocessing work before measurement, so that only the APKs that are weakly equivalent to the attack behavior need to be measured, and it makes up for the lack of research on preprocessing in current measurement methods. (3) The effectiveness of the method is verified by the experimental results on the attack case, the benchmark test sets, and the actual APK test set. Compared with reference [45], the distribution of measurement values of this method is more reasonable and indicates the risk degree of an APK more clearly.

The contribution of this paper can be divided into two aspects. On the one hand, it improves the attack behavior feature set and uses AHP to calculate the feature weights, which solves the problems of incomplete analysis and unweighted features of collusion privilege escalation attacks. On the other hand, process algebra is used to determine the behavior of each APK so that APKs that do not need to be measured are filtered out, and a measurement function that fully considers the feature weights is constructed. The method ensures a reasonable distribution of measurement results and the consistency of the risk measurement value with the actual risk.

In future research, an interesting direction is to apply the theories and methods of multiattribute decision making and fuzzy AHP to our work. In addition, we can continue to refine the features in AF and construct a more complete measurement function so as to further improve the applicability of the method.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was partly financially supported through grants from the National Natural Science Foundation of China (no. 61772450), University Science and Technology Research Project of Hebei Province (no. ZD2018219), Key R&D Projects of Hebei Province (no. 20375001D), Hebei University Humanities and Social Sciences Research Youth Top Talent Project (no. BJ2020064), and Marine Science Research Project of Hebei Normal University of Science and Technology (no. 2018HY013).