Abstract

To tackle the issues in deep crowd sensing, a Time and Location Correlation Incentive (TLCI) scheme is proposed for deep data gathering in crowdsourcing networks. In the TLCI scheme, a metric named “Quality of Information Satisfaction Degree” (QoISD) is introduced to quantify how well the collected sensing data satisfies the application’s QoI requirements, mainly in terms of data quantity and data coverage. Two incentive algorithms are proposed to satisfy QoISD from different perspectives. The first algorithm ensures that the application obtains the specified sensing data while maximizing the QoISD; thus, the reward for data sensing is set to maximize the QoISD. The second algorithm minimizes the cost of the system while meeting the sensing data requirement and maximizing the QoISD; thus, the reward for data sensing is set to maximize the QoISD per unit of reward. Finally, we compare our proposed scheme with existing schemes via extensive simulations. The simulation results justify the effectiveness of our scheme: the QoISD can be optimized by 81.92%, and the total cost can be reduced by 31.38%.

1. Introduction

Internet of Things (IoT) [1–5] and cloud computing [6–13] take advantage of the ubiquity of smart sensor-equipped devices such as smartphones, iPads, and vehicle sensor devices. These devices collect information at low cost and provide a new paradigm for complex data sensing based applications driven by the significant demands of critical infrastructure, such as surveillance systems [14–16], remote patient care systems in healthcare, intelligent traffic management and automated vehicles in transportation, and environmental [17–19] and weather monitoring systems. Because such applications need to collect a wide range and large amount of data, sometimes continuously over a long period, the traditional method of deploying sensing devices or specialized employees to collect the data is so costly that it limits the development of applications [20, 21]. Hence, a new method of data collection named crowd sensing (or participatory sensing) is adopted [21–26]. First, a large number of users can participate in data collection, because smartphones have become very popular in the past few years [27]. In the final quarter of 2010, smartphone sales passed PC sales for the first time [28]. Smartphones are equipped with rich sensors that are cheap but useful, such as accelerometer, digital compass, GPS, microphone, gyroscope, and camera. These sensors can monitor a diverse range of human activities and collect large amounts of useful information [23, 28]. The people who sense data through their smartphones or smart sensing devices are collectively referred to as users. Second, the data collected in this way covers a broad range at low cost, which can provide effective data for many applications. This data collection is generally carried out as shown in Figure 1. In such a crowdsourcing sensing system, there are four main components.
(a) Users (also known as crowdsourcing participants or reporters): the smart sensing devices, or the people who hold them, which sense data samples, report them to applications, and gain the reward.
(b) Applications (or task publishers): the application is the demander of data. It publishes the data requirements and pays a certain reward to motivate users to collect sensing data. After users collect and report data, the application provides customers with advanced services based on the collected sensing data and charges a certain fee to make up for the cost of data contribution.
(c) Customers: the people who use the application’s services and pay a certain fee to applications.
(d) The third-party processing platform: the interaction between users, applications, and customers is not necessarily direct and may be carried out through a trusted third-party platform. Such a platform makes the interaction between the parties more effective and convenient and has been widely used in research and practice.

The main subject of this paper is the incentive mechanism and the interaction between users and applications [23]. The interactive process of a crowdsourcing sensing system is as follows: the applications publish the task attributes, such as area, time period, storage quality, and the reward for reporting data samples [22]. The users in the target area collect the sensing data, report it to applications, and gain the reward. Applications process the collected data, form services, and provide these services to customers. Because a large number of users are involved in collecting the sensing data, applications that could not be realized in the past can be completed quickly, at low cost, and with high quality.
For example, in a wide-area bird ecology system, obtaining the biological habits of the entire bird ecosystem, migration patterns, and food chain relationships requires collecting a large amount of data over a wide area and a long time in order to provide effective service. In such large data services, because of the large amount and range of data, the long time period, and even the high data acquisition frequency, the cost is very high if the application deploys and maintains the data acquisition equipment (or employees) by itself. Instead, after the application publishes the required elements of bird data samples, the times, the locations, and the reward for the data, a large number of users volunteer to collect the data and report it to the application, which can then form services that are of low cost and high quality and have richer sampling data than other methods.

It is foreseeable that, with the development of smart sensing devices, the sensing devices will become more powerful and have a wider sensing scope. Crowd sensing based applications will play an increasingly important role in every aspect of human life, providing a more convenient and better user experience. However, crowd sensing depends on high-quality data samples to process the data, provide customers with advanced services, and obtain the payoff that sustains investment in data collection and ensures the sustainable development of applications. For data collection, rewards must be introduced into the crowdsourcing sensing system because participating in a sensing task may incur monetary costs, network bandwidth usage, and smartphone power consumption. Rewards motivate users to tolerate these costs and contribute to the sensing task. Moreover, smartphones are personal devices that are not controlled by others: the application cannot decide when and how sensor data is collected, so rewards are needed to influence user behavior. As a general incentive mechanism, the users who participate in sensing receive a monetary reward for the sensed data they provide.

There are many existing studies [29–32] that focus on the incentive mechanism. This paper summarizes the existing research into the following categories.

Incentive mechanisms aimed at a certain number of data samples: the main objective of this type of incentive mechanism is to obtain a sufficient number of data samples. There are many approaches within this strategy, such as game based mechanisms and market incentive mechanisms based on the demand and supply model.

However, the main consideration in this type of study is to collect a sufficient number of data samples at the least cost, with little or no consideration of data quality. This obviously no longer meets the demands of today’s rapidly developing applications.

Incentive mechanism based on quality: this kind of research considers not only the quantity and cost of data collection but also its quality. For example, Tham and Luo [30] propose a metric called Quality of Contributed Service (QCS), and Reddy et al. [31] propose a metric named Quality of Information (QoI) to evaluate the quality of the data samples. This type of incentive mechanism achieves efficient data collection by selecting reporters with high QCS. Data coverage is also studied by Song et al. [32], who recruit the best-matching participants to maximize coverage. However, the actual performance of these strategies is not satisfactory for three reasons. First, these studies tend to assume that the participants can be controlled by the applications, so that the applications can select or recruit certain participants to meet their QoS, QoI, or coverage requirements. In practice, whether users participate in a sensing task is their own decision, so the participants cannot be chosen by the application. Second, these studies assume that the number of users involved in sensing is so large that the applications only need to consider how to select the users that meet their needs, which is not the case in many situations. Even in a downtown area rich in sensing devices, if the incentive is insufficient, so few users participate that the application has little to choose from. In this case, the focus should be on how to set a suitable reward to motivate users to participate rather than on how to select certain users. Last but not least, even if data with high Quality of Information (QoI) is selected with priority, high-quality applications cannot be obtained from the overall perspective. The reason is that the data quality is high only from a local perspective. From the overall perspective, collecting more data samples in an area where enough data has already been collected does not improve the quality of applications, even if the data is of high quality. At the same time, data from an area where very little data has been collected can efficiently improve the quality of applications, even if the data is of low quality. Therefore, data collection that depends only on QoI achieves local optimization but not overall optimization.

In deep crowd sensing (DCS), there are more detailed requirements for data collection [33–38]. First of all, with respect to data quality, the application specifies a demand for sensing data in the target area, which means that the application is of high quality if the collected data is not less than this demand. Any sensing data beyond the demand is redundant. Hence, the system should collect just the amount of sensor data that meets the demand, and, for that total amount, it is important that the collected data has good coverage. Coverage generally refers to data being distributed evenly over the data collection range. If the target area is divided into many small subareas, the best coverage is achieved when the amount of data collected in each subarea is equal. When the total amount of collected data is fixed, collecting more data than needed in some subareas leaves other subareas short of data. As a result, the QoI is low in the subareas where the collected data is less than needed. In the bird migration monitoring task mentioned above, data must be collected throughout the specified monitoring area; if no data is collected in some subareas, the monitoring of bird migration is not comprehensive. Similarly, in haze monitoring, if no data is collected in some subareas, haze in these subareas is missing from the data report.

On the other hand, the data demand of the system changes over time. To reduce the cost of data collection, the data demand should be set to just ensure that the collected data reflects the actual situation, so the demand often differs from case to case. In the haze monitoring example mentioned above, if the weather is stable, the haze over the whole city is stable, and reducing the total amount of collected data has little impact on the quality of the services provided by applications while saving considerable cost. When the weather changes substantially, data must be collected at high frequency and in large amounts to ensure that applications meet the requirements of customers. Obviously, this issue also exists in other applications, and previous studies have not considered these cases.

Users also differ across subareas, which seriously affects the effectiveness of previous studies. Specifically, the number of users differs significantly between subareas, which exposes a serious shortcoming of incentive mechanisms that offer all users the same reward. There are two main reasons. (a) The impact on application quality. Generally speaking, the probability that a user participates in sensing is higher when the reward is higher, and lower when the reward is lower. Thus, under the same reward, subareas with fewer users yield fewer data samples, which cannot meet the requirements of applications, while subareas with more users yield so many data samples that the data is redundant. This seriously affects the quality of services provided by applications and has a negative impact on their development. (b) The cost differs between subareas, which further damages the quality of services provided by applications. The general rule is that the cost of sensing data is low in subareas with adequate infrastructure, because convenient communication bandwidth, network speed, communication costs, and physical traffic reduce the cost in these subareas. In contrast, in edge subareas where network, communication, and traffic conditions are inconvenient, the cost is high. Combined with the previous reason, the number of users is also large in the low-cost subareas. Thus, under a uniform reward strategy, the rich get richer and the poor get poorer, which further deteriorates the performance of the applications. It can be seen that previous studies cannot solve this problem well, which motivates the study of deep crowd sensing.

To tackle these issues in deep crowd sensing, this paper proposes a Time and Location Correlation Incentive (TLCI) scheme for deep data gathering in crowdsourcing networks. The goal of the TLCI scheme is to collect the appropriate data and to satisfy the Quality of Information (QoI) requirements of applications as far as possible with minimum budget. In the TLCI scheme, data sensing is fine-grained and the reward for data samples changes dynamically with the time and location of data sensing, which can effectively improve the system performance. The main contributions of this paper are summarized as follows.

A Time and Location Correlation Incentive (TLCI) scheme is proposed for deep data gathering in crowdsourcing networks. The TLCI scheme is a fine-grained data gathering mechanism that provides high-quality data gathering at lower cost than previous strategies, because it takes into consideration the number of users at different times and locations, the cost of sensing data, and the users’ willingness to participate in sensing.

A concept named “QoI satisfaction degree” (QoISD) is introduced in this paper to quantify the degree to which the collected sensing data satisfies the overall requirements of applications, mainly in terms of data quantity and data coverage. Applying it in our proposed algorithms shows that the “QoI satisfaction degree” is more effective in evaluating the contribution of the data samples to the overall performance of the system.

Two incentive algorithms are proposed to satisfy QoISD from different perspectives. The first algorithm ensures that the application obtains the specified sensing data while maximizing the QoISD, so the reward for data sensing is set to maximize the QoISD. The second algorithm minimizes the cost of the system while meeting the sensing data requirement and maximizing the QoISD, so the reward for data sensing is set to maximize the QoISD per unit of reward.

Finally, we compare our proposed scheme with existing schemes via extensive simulations. The results show that our distributed incentive mechanism attains the aim of this work and is more suitable for the real world: the QoISD can be optimized by 81.92%, and the total cost can be reduced by 31.38%.

The rest of the paper is organized as follows. We review related work in Section 2. In Section 3, we describe the system model and formulate the problem of our incentive mechanism. Section 4 presents the details of our distributed algorithm for achieving the aim. We evaluate the proposed algorithm via simulations in Section 5. Our paper is concluded in Section 6.

2. Related Work

Crowd sensing (CS) refers to sensing a wide range of human activities and their surrounding environment with various kinds of sensing devices such as smartphones, iPads, or sensor nodes [32, 39–42]. Because the number of sensing devices is so vast, the resulting network has been given a new name, crowdsourcing networks (CN) [43]. With the rapid development of mobile sensing devices, the data collected by these devices is so rich, real-time, and low-cost that the traditional way of data collection has fundamentally changed [29–32, 43]. As a result, the network itself has changed greatly [43]. The center of network computing is migrating from the network center to the network edge [44], that is, from cloud computing to edge computing [44], and then to big data networks [45]. The root of these technological developments lies in the collection of high-quality data [29–32]. Thus, how to obtain high-quality data has become the focus of research.

Because sensing, processing, and reporting data incur costs such as storage, energy, and communication, incentive mechanisms are a key part of these systems and have been studied extensively [29–32, 43]. Incentive mechanisms in crowd sensing have some characteristics not found in other applications [43]. First, in CS the number of reporters is large, and most of them have no identity authentication. Thus, CS lacks the verification functionality that is available in most interactive systems. In traditional crowdsourcing, such as Amazon Mechanical Turk [44], the crowdsourcing platform publishes tasks and the reward for each subtask (or microtask). The worker is required to submit proof after completing the task, and the crowdsourcing platform then pays the worker the promised reward. In this way, a huge task can be broken down into many microtasks and completed by a large number of workers. However, this method is difficult to implement in CS for the following reason. In CS, each data collection task is a microtask in which a single data sample is so small that the cost and time consumed by a complex authentication mechanism would far exceed the cost of the task itself. Thus, such a certification mechanism is not suitable for CS, and an effective incentive mechanism to motivate the users to collect data samples is very important for CS [29].

There are many incentive mechanisms that are designed to attract users to participate in sensing tasks. This paper divides them into the following categories.

Incentive mechanisms aimed at a certain number of data samples: this type of incentive mechanism motivates the users to collect data by adjusting the reward. The general process is as follows. We assume that a certain amount of data is expected to be collected. When the actual collected data exceeds the expected amount, the system decreases the reward for each data sample to reduce the cost of applications. On the contrary, if the collected data is less than the expected amount, the current incentive reward is not enough and the users’ enthusiasm for sensing data is low. In that case, the system raises the reward for data samples to increase enthusiasm until the collected data reaches the expected amount. The details differ between specific applications. A game based services price decision (GSPD) model has been proposed to depict the process of price competition between Services Organizers (SOs, i.e., applications) and entities (users), as well as among the entities (users) themselves, which leads to a Pareto-optimal equilibrium point. The auction is another incentive mechanism applied in CS. A reverse auction-based dynamic price (RADP) incentive mechanism was proposed by Lee and Hoh [46]. The service provider that needs sensing data chooses a subset of users with the lowest bid prices and purchases their sensor data at those prices. It is shown that the dynamic price incentive mechanism can reduce the incentive cost compared with a fixed price. Another reverse auction-based incentive mechanism, designed by Jaimes et al. [47], considers the locations of the users, the budget constraints, and the sensed coverage; their incentive schemes can improve the covered area. A mechanism combining a Stackelberg game with an auction is also proposed by Yang et al. [48].
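The general reward adjustment process described above can be sketched as a simple feedback rule. This is only an illustration, not the method of any cited work: the function name `adjust_reward`, the multiplicative `step`, and the lower bound `min_reward` are assumptions introduced here.

```python
def adjust_reward(reward, collected, expected, step=0.1, min_reward=0.0):
    """Return the next per-sample reward given the data collected so far.

    Hypothetical rule: scale the reward down by `step` when there is a
    surplus of data and up by `step` when there is a shortfall.
    """
    if collected > expected:
        # Surplus: lower the reward to reduce the cost of the application.
        return max(min_reward, reward * (1.0 - step))
    if collected < expected:
        # Shortfall: raise the reward to increase users' enthusiasm.
        return reward * (1.0 + step)
    # Demand met exactly: keep the current reward.
    return reward


# Example: the platform re-prices after each collection round.
reward = 1.0
for collected in (60, 80, 130):
    reward = adjust_reward(reward, collected, expected=100)
```

A multiplicative step keeps the reward positive; an additive step or a step proportional to the gap would fit the same description equally well.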

Incentive mechanism based on quality: this kind of research considers not only the quantity and cost of data collection but also its quality. In this type of research, evaluation criteria are proposed to assess the quality of data samples. In studies that collect data samples for service composition, Tham and Luo [30] propose the concept of Quality of Contributed Service (QCS) to measure the contribution of collected data to the combined service. In addition, for applications such as noise mapping, traffic condition reporting, and environmental impact monitoring, Wang et al. [29] propose the metric named Quality of Information (QoI) to evaluate the quality of the data samples. Such studies generally assume that the number of participants involved in sensing is very large. Hence, the goal of the incentive strategy design is to select, from these users, the participants with the highest performance assessment (e.g., QCS and QoI). At the same time, these studies focus on optimizing the overall cost, profit, or energy consumption while selecting high-quality participants, as shown by Liu et al. [49–52].

In addition to the above two types of research, recent studies argue that a purely reward based incentive mechanism does not necessarily achieve good results. One important reason is that earlier studies did not consider the credibility of the users and of the collected data. Thus, recent researchers have proposed jointly considering the QoI of the data and its credibility. These studies take the users’ reputation to represent the credibility of the collected data. Therefore, applications that give priority to users with high reputation can improve the performance of the system. Such reputation-based incentive mechanisms can be found in Zhang and Van Der Schaar [53] as well as Wang et al. [54].

3. The System Model and Problem Statement

3.1. System Model

We consider a crowdsourcing sensor data gathering system made up of a platform in the cloud and a large number of smartphone users connected to the platform. The applications (data requesters) publish the task on the platform to collect the sensor data. In addition, the smartphone users who may participate in the sensing task are denoted as follows:

The participants are the users who are satisfied with the reward for their sensed data contribution and take part in the sensing task. Thus, the set of participants is a subset of the set of users.

The participants incur a cost when they use their smartphones to collect data, such as the electricity consumed by the sensors or the time spent, which affects whether they participate in the task. In order to formulate the users’ behavior realistically, it is necessary to model the user’s cost and the probability of participating in the sensing task.

Definition 1 (participants cost). The sensing time represents the time a participant spends on the sensing task, during which the participant collects data continuously. The cost of a user consists of the consumption of time and of phone power, and it differs across circumstances and times. For example, the cost is lower in cities than in mountains and lower in the daytime than at night. The cost can also differ at the same location and time even if the sensing time is the same, mainly because the power consumption differs between smartphones and sensors. Thus, the cost function of participants depends on the sensing time and the sensing circumstance.

We assume that sensor data is needed in a target area and over a task period. If the sensed data is evenly distributed over the target area and time, it is representative and of high quality. The target area and the task period are divided equally into subareas and subperiods, which form many grids. The size of the grid depends on the application. In general, the number of grids should be large enough, and each grid area small enough, that the data collected from participants can represent the data situation for the grid. A subperiod is a short period of time during which the monitored value of the object is stationary, so the data collected by sensor nodes during the subperiod can represent the physical values for that subperiod. The granularity of time division also depends on the application. If the monitored value changes substantially, the subperiod must be short; if it changes within a narrow range, the subperiod can be longer. For example, in smooth weather the temperature hardly changes within an hour, so the time granularity for sensing temperature can be larger, and any data collected during the subperiod can represent the temperature during that subperiod. When the weather changes substantially, the subperiod should be shorter so that the temperature values sensed within it are basically equal and can represent the temperature in that subperiod. With a longer subperiod, the values sensed at different times differ greatly and cannot represent the temperature in the subperiod.
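The equal division of the target area and task period can be illustrated with a small helper that maps a data sample to its grid. This is a minimal sketch under assumptions not stated in the paper: the target area is taken to be a rectangle of size `width` by `height`, and the function and parameter names are hypothetical.

```python
def grid_index(x, y, t, width, height, duration, n_cols, n_rows, n_periods):
    """Map a sample at position (x, y) and time t to (column, row, subperiod).

    Assumes a rectangular target area [0, width] x [0, height] and a task
    period [0, duration], each divided into equal parts.
    """
    if not (0 <= x <= width and 0 <= y <= height and 0 <= t <= duration):
        raise ValueError("sample lies outside the target area or task period")
    # Samples exactly on the upper boundary fall into the last cell.
    col = min(int(x / width * n_cols), n_cols - 1)
    row = min(int(y / height * n_rows), n_rows - 1)
    period = min(int(t / duration * n_periods), n_periods - 1)
    return col, row, period


# A 100 x 100 area split into 10 x 10 subareas, and a 60-minute task
# split into 6 subperiods of 10 minutes each.
cell = grid_index(25, 55, 30, 100, 100, 60, 10, 10, 6)
```

Choosing a larger `n_periods` corresponds to a finer time granularity, which the text recommends when the monitored value changes quickly.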

We use a cost term to represent the cost, in a given subperiod and subarea, that is caused by circumstances related to location and time. Collecting these terms gives the cost matrix over the entire sensing time and area.

We assume that the cost of power consumption has a linear relationship with the sensing time. The coefficient is a user parameter, which depends on the participant’s phone and differs between users. Hence, the cost function of sensing time for a participant is formulated as

Definition 2 (the probability of participation). The probability of participation of a user is the probability that the user participates in the sensing task for a given reward and sensing time. We assume that the users are rational and strategic, so that whether they participate in the task depends on the reward and the cost. The probability of participation increases as the ratio of economic benefits increases. We use a power function of the ratio of economic benefits to represent the probability that a user participates in data collection for a given reward. The parameter of the power function is to be measured.
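Definitions 1 and 2 can be made concrete with a small sketch. The exact formulas are assumptions made here for illustration: the cost is taken as the circumstance cost of the grid plus a per-user power rate times the sensing time, and the ratio of economic benefits is taken as (reward - cost) / reward. The names `env_cost`, `power_rate`, and `exponent` are hypothetical.

```python
def participant_cost(env_cost, power_rate, sensing_time):
    """Cost of one participant: circumstance cost plus power cost.

    Assumes the two parts add up; the power cost is linear in sensing time.
    """
    return env_cost + power_rate * sensing_time


def participation_probability(reward, cost, exponent=1.0):
    """Probability that a rational user participates for a given reward.

    Assumes the ratio of economic benefits is (reward - cost) / reward and
    that the probability is this ratio raised to a positive `exponent`.
    """
    if reward <= 0 or reward <= cost:
        # No net benefit: a rational user does not participate.
        return 0.0
    return ((reward - cost) / reward) ** exponent


cost = participant_cost(env_cost=1.0, power_rate=0.5, sensing_time=4.0)
p = participation_probability(reward=6.0, cost=cost, exponent=2.0)
```

With these assumptions the probability stays in [0, 1) and increases with the reward for any positive exponent, which matches the monotonic behavior stated in Definition 2.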

The set of users in a given subarea and subperiod is denoted as follows, where each element is a smartphone user. In addition, we use a separate symbol to represent the number of users in that subperiod and subarea.

The set in (6) contains all users in the subperiod and subarea, but not all users in a grid necessarily participate in the sensing task. Hence, we denote the users who actually participate in sensing as follows, where each element is a participant. In addition, we use a separate symbol to represent the number of participants in that subperiod and subarea.

We assume the reward provided by applications is the matrix shown in the following, where each entry represents the reward of participants in the corresponding subperiod and subarea.

Equation (6) describes the users in each subperiod and subarea, but not all users in a grid necessarily participate in the sensing task. Under the incentive reward mechanism shown in (8), and according to (5), the estimated participation is shown in the following.
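One way to read the estimated participation is as the expected number of participants per grid, obtained by summing the participation probabilities of the users in that grid. The sketch below follows this reading; it is an assumption, and the names `expected_participants`, `expected_participation_matrix`, and `linear_prob` are hypothetical.

```python
def expected_participants(user_costs, reward, prob):
    """Expected number of participants in one grid.

    user_costs: costs of the users located in the grid.
    prob: function (reward, cost) -> participation probability.
    """
    return sum(prob(reward, cost) for cost in user_costs)


def expected_participation_matrix(cell_costs, rewards, prob):
    """Apply expected_participants to every (subperiod, subarea) grid.

    cell_costs[k][i] lists the user costs in subperiod k and subarea i;
    rewards[k][i] is the reward published for that grid.
    """
    return [
        [expected_participants(cell_costs[k][i], rewards[k][i], prob)
         for i in range(len(rewards[k]))]
        for k in range(len(rewards))
    ]


def linear_prob(reward, cost):
    # Toy probability model used only for this example.
    return max(0.0, (reward - cost) / reward)


cell_costs = [[[1.0, 2.0], [4.0]]]   # one subperiod, two subareas
rewards = [[4.0, 4.0]]
estimate = expected_participation_matrix(cell_costs, rewards, linear_prob)
```

Under a uniform reward, the subarea whose single user has a high cost contributes no expected participants, which is the imbalance a location-dependent reward is meant to correct.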

The amount of data expected to be collected is shown in the following.

The actual amount of data obtained is shown in the following.

Definition 3 (QoI satisfaction degree, QoISD). In the crowdsourcing gathering system, the data requester (i.e., the application) publishes the task on the platform to collect the sensor data. Thus, the quality of the sensed data is very important for the data requester. For the sensed data to be of high quality, it should be well distributed over locations and time so that it is more representative.

In this paper, a metric named “QoI satisfaction degree” (QoISD) is introduced to quantify how well the collected sensing data satisfies the application’s QoI requirements, mainly in terms of data coverage.

We assume that the data demand of the application is shown in (12). If the actual collected data is exactly the same as the data demand, the incentive mechanism is efficient, which means there is no redundant data. This is the best situation, in which the QoISD is optimal.

However, in practice, it is difficult for the incentive mechanism to make the collected data exactly equal to the demand. Therefore, in this paper, we use the quantity in (13) to represent the difference between the collected data and the data demand, which is the QoISD. Obviously, the best situation is when the QoISD is zero.
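A deviation measure of this kind can be sketched as follows. The formula is an assumption chosen for illustration, namely the mean absolute relative deviation between collected data and demand over all grids; the paper's own expression in (13) may differ. The function name `qoisd` is hypothetical.

```python
def qoisd(actual, demand):
    """Mean absolute relative deviation between collected data and demand.

    actual[k][i] and demand[k][i] are the collected and required amounts
    in subperiod k and subarea i. Every demand entry must be positive.
    A value of 0 means the collected data matches the demand exactly.
    """
    total = 0.0
    cells = 0
    for actual_row, demand_row in zip(actual, demand):
        for a, d in zip(actual_row, demand_row):
            total += abs(a - d) / d
            cells += 1
    return total / cells


demand = [[10, 10], [10, 10]]
even = qoisd([[10, 10], [10, 10]], demand)    # matches the demand
uneven = qoisd([[5, 15], [10, 10]], demand)   # same total, worse coverage
```

The second call shows why total quantity alone is not enough: both collections contain 40 samples, but only the evenly covered one reaches the optimum of zero.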

If the application requires the same constant amount of data for each grid, then (13) is transformed into the following:

3.2. Problem Statement

(1) Sensor Data to Meet the Requirement. Sensing data that meets the application’s requirement has two aspects: (a) Each subperiod has a total data requirement, and the total amount of data collected in the target area during that subperiod must be greater than this requirement, as in (16):

(b) The amount of data collected for each grid should be greater than the amount of data required by the application, which is shown in

(2) Maximize the Quality of Data. As mentioned above, in order to obtain high-quality sensor data, the distribution of the data over time and locations should be as good as possible, which means optimizing the QoISD, as follows:

(3) Minimize the Cost of Data Collection. The aim of the incentive mechanism is to minimize the cost of the system and optimize the quality of the collected data. Therefore, an important goal of the Time and Location Correlation Incentive (TLCI) scheme is to minimize the cost of data collection.
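The two data requirements and the cost objective for one subperiod can be checked with a short sketch. Several details are assumptions made here: the comparisons use "at least" rather than strictly "greater than", and the cost is taken as the reward paid per participant summed over subareas. All function and parameter names are hypothetical.

```python
def meets_requirements(actual, cell_demand, total_demand):
    """Check the total and per-grid data requirements for one subperiod.

    actual[i] and cell_demand[i] are the collected and required amounts
    in subarea i; total_demand is the requirement for the whole area.
    """
    total_ok = sum(actual) >= total_demand
    per_cell_ok = all(a >= d for a, d in zip(actual, cell_demand))
    return total_ok and per_cell_ok


def total_cost(rewards, participants):
    """Total reward paid in one subperiod.

    Assumes each participant in subarea i is paid rewards[i].
    """
    return sum(r * n for r, n in zip(rewards, participants))


ok = meets_requirements([5, 6], cell_demand=[5, 5], total_demand=10)
cost = total_cost(rewards=[2.0, 3.0], participants=[5, 4])
```

A reward allocation is then preferable when it passes `meets_requirements`, keeps the QoISD close to its optimum, and yields a lower `total_cost`.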

Thus, the overall goal of TLCI scheme is as follows:

4. Incentive Mechanism Design

To state the parameters of this paper clearly, the main notations introduced in this paper are listed in Parameters’ Description.

As shown in Figure 2, we propose an incentive mechanism that gathers enough sensor data while optimizing the QoISD. In our incentive mechanism, the platform sends the task information to every user, and whether a user participates in the sensing task depends on the user’s own cost and the published reward. We assume that the total duration of the sensing task is divided into several subperiods. In every subperiod, the target area where data is expected to be obtained is divided into several subareas, and the platform publishes a new reward for participants’ contributions; the reward differs across subareas so as to optimize the QoISD. This process is repeated until the sensing task is completed.

In the incentive mechanism, we use two reward allocation algorithms to optimize the QoISD and reduce the cost of the system; both are based on greedy strategies. The first algorithm optimizes the QoISD while attaining the task’s data requirement. The second algorithm builds on the first: it spends less on rewarding the participants and runs faster than the first. The two incentive algorithms are discussed as follows.

4.1. Distributed Algorithm for Optimizing QoISD

The aim in this work is to find a better way of reward allocation so that the system could gain enough high-quality sensed data. The aim in first algorithm is to optimize the QoISD while meeting data requirements that the sensed data in subperiod reach the data demand , which is formulated as

The algorithm mainly motivates the smartphone users to participate in sensing task through adjusting the reward allocation for users in every subarea . We first analyze the optimal results of sensed data. If the total amount of sensor data needed for applications is , and the amount of sensed data in every subarea is equal which is formulated as , it requires . In this case, the optimal incentive mechanism is to adjust the reward vector so that making the amount of sensed data in every subarea is exact and the value of QoISD is zero which is optimal. Similarly, if there is no requirement that , and the data demand is , we assume the actual sensed data is . The optimal incentive strategy is to adjust the reward vector to optimize the QoISD the same as above. Obviously, it should increase the reward in the subarea where sensed data is less than demand and the difference of both is rather large to motivate the users’ participation and makes the QoISD optimal. On the contrary, it optimizes the QoISD decreasing the reward in the subarea where sensed data is more than demand. As shown in (21), represents that deviation rate between actual sensed data and demand in subarea and subperiod .

Obviously, when is positive it means the actual sensed data is less than demand and it needs to raise the reward . In addition, the larger , the more amount the lacked data and the increased reward . On the contrary, when is negative, it means the actual sensed data is more than demand and it should decrease the reward to reduce the cost of the system.
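The sign convention described above can be sketched as follows. The formula in (21) is not reproduced here, so the definition `(demand - actual) / demand` is an assumption that merely matches the stated behavior (positive when data is lacking), and the proportional update with a hypothetical `step` parameter is one possible way to make the reward change grow with the deviation.

```python
def deviation_rate(actual, demand):
    """Assumed form: positive when sensed data falls short of demand."""
    return (demand - actual) / demand


def adjust_reward(reward, actual, demand, step=0.1):
    """Raise the reward on a deficit, lower it on a surplus.

    The change is proportional to the deviation rate (an assumption);
    the reward is never allowed to become negative.
    """
    return max(0.0, reward + step * deviation_rate(actual, demand))


print(adjust_reward(5.0, actual=40.0, demand=50.0))  # deficit: reward rises
print(adjust_reward(5.0, actual=60.0, demand=50.0))  # surplus: reward falls
```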

Our incentive algorithm is to find the optimal reward allocation for every subarea. The mechanism algorithm iteratively raises the reward to the users in the subarea that could optimize QoISD until the sensed data is enough to the sensing system’s demand and the QoISD is optimal. The detailed descriptions are provided as follows.

Step 1. First, according to the information of the sensing task, we initialize the reward of every subarea to and calculate the new amount of sensor data .

Step 2. Second, we iteratively adjust the reward vector to optimize the QoISD . In each iteration, the algorithm adjusts the reward in a subarea which makes QoISD optimal. We use to mark the target subarea and to mark the increased reward, which is negative when the actual sensed data in target area is more than demand. In addition, represents the optimal QoISD currently and the algorithm updates the value of in every iteration. Hence, in the end of every iteration, it updates the reward in target subarea and the amount of sensed data. In addition, the algorithm ignores the subarea where there is no increased amount of sensor data, as shown in Steps -.

Step 3. The sensed data is updated in every iteration, and the algorithm ends when reaches the data demand . In the end of this incentive algorithm, it returns the reward vector, which makes the QoISD optimal.

The proposed algorithm is given in Algorithm 1.

Input: The data demand ;
Output: The new vector of reward .
   For each
     =   // to initialize the reward of every sub-area to
   End for
    
   While Do
      ;
    // is the QoISD before the update
      //to mark the selected grid which is not determined
        //to store the minimum QoISD
     For = 1 to Do //each grid
      If < 0
       
        //to increase the reward when the sensed data does not meet the demand
      Else
       
      //to decrease the reward when the sensed data exceeds the demand
      End if
      
      //the sensed data in grid after updating the reward
      
      
      // is the updated QoISD
      If   //if there is no improved QoISD
        Goto Step
      End if
      If
       
       ; //to select the that makes QoISD optimal
       
      End if
     End for
     
       //to update the sensed data
   End while
   Return the new reward

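
The greedy loop of Algorithm 1 can be sketched in runnable form as below. This is a sketch under stated assumptions, not the authors' implementation: the response model `sensed` (data grows linearly with the reward up to a cap `r_max`), the squared-deviation form of `qoisd`, the fixed `step`, and all parameter values are hypothetical stand-ins for the formulas that are not reproduced here.

```python
def sensed(users, reward, r_max=10.0):
    """Hypothetical response model: data grows linearly with reward."""
    return users * min(1.0, max(0.0, reward) / r_max)


def qoisd(data, demand):
    """Assumed QoISD: mean squared relative deviation (0 is best)."""
    return sum(((d - q) / q) ** 2 for d, q in zip(data, demand)) / len(demand)


def greedy_rewards(users, demand, r0=1.0, step=0.1, max_iter=10000):
    """Adjust one subarea's reward per iteration, picking the best QoISD."""
    rewards = [r0] * len(users)
    data = [sensed(u, r) for u, r in zip(users, rewards)]
    for _ in range(max_iter):
        if sum(data) >= sum(demand):      # total demand reached
            break
        best_q, best_i, best_r = qoisd(data, demand), None, None
        for i, u in enumerate(users):
            # Raise the reward on a deficit, lower it on a surplus.
            delta = step if data[i] < demand[i] else -step
            trial_r = max(0.0, rewards[i] + delta)
            trial = data[:i] + [sensed(u, trial_r)] + data[i + 1:]
            q = qoisd(trial, demand)
            if q < best_q:                # keep the best improving move
                best_q, best_i, best_r = q, i, trial_r
        if best_i is None:                # no subarea improves the QoISD
            break
        rewards[best_i] = best_r
        data[best_i] = sensed(users[best_i], best_r)
    return rewards, data


rewards, data = greedy_rewards(users=[80, 60, 100], demand=[50.0, 50.0, 50.0])
print([round(r, 2) for r in rewards])
print([round(d, 2) for d in data])
```

In this toy setting, subareas with fewer users end up with higher rewards, which mirrors the intent of paying more where data is harder to obtain.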
4.2. Distributed Algorithm for Optimizing the QoISD and Reducing the Cost

Based on Algorithm 1, in this subsection, we propose another algorithm for taking the platform’s cost and QoISD into account. The aim in Algorithm 2 is to optimize the QoISD and reduce the cost of system while meeting data requirements that the sensed data within subperiod reaches the data demand , which is shown in

Input: The data demand ;
Output: The new vector of reward .
   For each
        // to initialize the reward of every sub-area to
   End for
    
   While Do
     ;
    // is the QoISD before the update
       //to mark the selected grid which is not determined
        //to store the minimum QoISD
     For = 1 to Do //each grid
      If
         
       //to increase the reward when the sensed data does not meet the demand
      Else
       
      //to decrease the reward when the sensed data exceeds the demand
      End if
      
      //the sensed data in grid after updating the reward
      
      
      // is the updated QoISD
      If   //if there is no improved QoISD
        Goto Step
      End if
      If
       
       ; //to select the that make QoISD optimal
       
      End if
     End for
     
       //to update the sensed data
   End while
   Return the new reward

Since this is a multiobjective optimization problem, it can be solved by the weighted sum method. By selecting scalar weight parameters and for and , the target function is shown in the following: where . We can adjust the value of to achieve our purpose.
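A weighted-sum scalarization of the two objectives can be sketched as follows. The weights are assumed here to be `weight` and `1 - weight` with `weight` in [0, 1], and the function name is hypothetical; in practice the two terms would likely need to be normalized to comparable scales before being combined.

```python
def weighted_objective(qoisd_value, cost, weight):
    """Combine QoISD and cost into one value to minimize.

    weight = 1.0 considers only the QoISD; weight = 0.0 only the cost.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * qoisd_value + (1.0 - weight) * cost


# Illustrative values only, with both terms already on a 0..1 scale.
for w in (0.0, 0.5, 1.0):
    print(w, weighted_objective(qoisd_value=0.2, cost=0.6, weight=w))
```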

The idea of Algorithm 1 is to iteratively raise the reward for users in the subarea that best optimizes the QoISD until the sensed data meet the sensing system’s demand and the QoISD is optimal. To reduce the cost of the system, the reward should be allocated to cost-effective subareas, where the system can gain more sensor data at low cost. For each subarea , we use to represent how cost-effective it is, defined as the ratio of increased data to increased cost and formulated as

Hence, there is one main alteration to Algorithm 1:

The proposed algorithm is given in Algorithm 2.
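The cost-efficiency selection that distinguishes Algorithm 2 can be sketched as below. The ratio follows the verbal definition (increased data divided by increased cost); the function names, the tuple layout of `candidates`, and the handling of a non-positive cost increase are assumptions made for illustration.

```python
def cost_efficiency(data_before, data_after, cost_before, cost_after):
    """Ratio of increased data to increased cost for one subarea."""
    extra_cost = cost_after - cost_before
    if extra_cost <= 0:
        # Assumed convention: free extra data is infinitely efficient.
        return float("inf") if data_after > data_before else 0.0
    return (data_after - data_before) / extra_cost


def pick_subarea(candidates):
    """Return the index of the most cost-efficient candidate.

    Each candidate is (index, data_before, data_after,
    cost_before, cost_after).
    """
    return max(candidates, key=lambda c: cost_efficiency(*c[1:]))[0]


# Illustrative values only.
candidates = [
    (0, 40.0, 44.0, 100.0, 110.0),  # 4 units of data for 10 units of cost
    (1, 30.0, 36.0, 90.0, 100.0),   # 6 units of data for 10 units of cost
]
print(pick_subarea(candidates))
```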

5. Performance Analysis and Optimization

5.1. Methodology and Setup

In this section, we compare our two algorithms, Time and Location Correlation Incentive Algorithm 1 (TLCI1) and Time and Location Correlation Incentive Algorithm 2 (TLCI2), with the pricing-based fixed price algorithm (PPA). The main characteristic of PPA is that participants get the same reward for the same amount of sensor data. We assume the smartphone users are rational and strategic, so whether they participate in a sensing task depends on their cost and reward. To show how the algorithms are similar or different, the same parameters are used for all of them.

We first study the sensed data and the system’s cost for rewarding the participants at different locations in Section 5.2. Then, in Section 5.3, we study the sensed data and the cost at different times in two situations and compare the performance of the three algorithms. In addition to the performance comparison based on different times and locations, we also study the quality of sensed data and the stability of the algorithms in Sections 5.4 and 5.5, respectively.

The default values of all parameters in our model are set as follows. The default units of reward and sensing time are dollar and minute, respectively. And the user-specific parameters are uniformly distributed random values within . In addition, according to the data obtained from the questionnaire survey, the probability parameter is 1.745. All simulations under the same setting are repeated ten times to get the average values.

5.2. The Performance of Sensed Data and Cost at Different Locations

We first study the performance of the three algorithms under different situations. We mainly study two scenarios, the active and the inactive period of users. In the active period, a large number of users are distributed in the target area. In contrast, fewer people are involved in the sensing task in the inactive period. The two scenarios are studied as follows.

5.2.1. The Active Period of Users

In the active period, a large number of users can participate in the sensing task and have plenty of free time to collect sensor data. We assume that the platform’s demand is 4000 (Mb) and that the number of users in every subarea is a uniformly distributed random value within (50, 100). The results are as follows. As shown in Figures 3, 4, and 5, the ranges of sensed data across the subareas in TLCI1 and TLCI2 are similar: they vary from 45 to 55, which is narrower than the range in PPA, which varies from 40 to 75 and has a negative impact on the quality of sensed data. The results show that the variance of sensed data can be reduced by 68.85% and 52.50% in TLCI1 and TLCI2, respectively. We also simulate the participants’ rewards under our two TLCI algorithms in Figures 6 and 7. In contrast to the identical reward for every user in PPA, the reward in our two TLCI algorithms differs among users, varying from 3.7 to 7.
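A variance reduction of this kind can be computed from per-subarea data as sketched below. The helper name `variance_reduction` and the sample values are hypothetical and are not the data behind the reported figures; population variance is assumed.

```python
from statistics import pvariance


def variance_reduction(baseline, candidate):
    """Percentage by which candidate's variance is below baseline's."""
    base_var = pvariance(baseline)
    if base_var == 0:
        raise ValueError("baseline variance is zero")
    return 100.0 * (base_var - pvariance(candidate)) / base_var


# Illustrative per-subarea sensed data (not from the paper).
fixed_price = [40.0, 75.0, 52.0, 61.0, 47.0]
adaptive = [45.0, 55.0, 50.0, 52.0, 48.0]
print(round(variance_reduction(fixed_price, adaptive), 2))
```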

5.2.2. The Inactive Period of Users

In the inactive period, users do not have enough time to participate in the sensing task, so the number of users is smaller than in the active period. We assume that the number of users in every subarea is a uniformly distributed random value within . As shown in Figures 8, 9, and 10, we can draw the same conclusion as in the active period. In the inactive period, the variance of sensed data can be reduced by 90.85% and 79.06% in TLCI1 and TLCI2, respectively. In addition, Figures 11 and 12 show that the reward allocations of the two algorithms are similar.

5.3. The Performance of Sensed Data and Cost at Different Time

The number of users over a whole day is considered to follow a regular pattern, which is shown in Figure 13. Users prefer to participate in the sensing task in the daytime, and the number of users at night is far smaller than in the daytime. For a more comprehensive evaluation, we consider the overall QoISD, which involves the data at different times, rather than the QoISD of a single time period. Therefore, when the impact of the incentive reward on the overall QoISD is too small, the data demand of that period is reduced and shifted to other time periods according to the number of users. This ensures that the data collected over the whole day meet the demand.

In this subsection, we study the performance of the three algorithms in two situations: the random distribution of users and the concentrated distribution of users. In the random distribution, the users are evenly distributed across the subareas. We assume that the total users at a time is , and the users at a subarea are uniformly distributed random values within (−30) and (30) when there are subregions at the time. In the concentrated distribution, 50% of the users are concentrated in the central 30% of the area.

5.3.1. The Random Distribution of Users

In the random distribution, the users are evenly distributed across the locations at any given time. In this subsection, we simulate the average reward for users and the total reward budget for users’ contributions at different times, as shown in Figures 14 and 16. Compared with the fixed reward in PPA, our two TLCI algorithms adapt better, adjusting the users’ reward according to the number of users. The overall trend is that the average reward declines as the number of users grows, as shown in Figure 14, which can be interpreted as the payment in exchange for participation. The number of participants and the total reward at different times are shown in Figures 15 and 16: the total reward and the total number of participants increase when there are few users, so that more sensor data can be collected. In addition, the total cost of the system is reduced by 19.97% and 42.79% in TLCI1 and TLCI2, respectively. Figure 17 shows the cumulative total reward over time, which demonstrates the effectiveness of our two algorithms and shows that TLCI2 can reduce the cost. In addition to the total reward, we also study the sensed data over time in Figure 18. Figure 19 shows the redundancy of sensed data over time, defined as the difference between actual and expected sensed data as a percentage of the expected data. Its trend is similar to that of the sensed data over time. In Figure 20, we calculate the cumulative sensed data over time, and we can see that the sensed data in PPA exceed the demand. This shows that PPA adapts poorly compared with the TLCI algorithms.
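The redundancy measure and the cumulative series discussed above can be sketched as follows. The function names and the sample values are hypothetical; only the definition of redundancy (difference between actual and expected data as a percentage of expected data) comes from the text.

```python
from itertools import accumulate


def redundancy_percent(actual, expected):
    """Difference between actual and expected data, in percent of expected."""
    return 100.0 * (actual - expected) / expected


def cumulative(values):
    """Running total of a per-period series."""
    return list(accumulate(values))


# Illustrative per-period values (not from the paper).
actual = [180.0, 150.0, 210.0]
expected = [160.0, 150.0, 200.0]
per_period = [redundancy_percent(a, e) for a, e in zip(actual, expected)]
whole_day = redundancy_percent(cumulative(actual)[-1], cumulative(expected)[-1])
print(per_period)
print(whole_day)
```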

5.3.2. The Concentrated Distribution of Users

In the concentrated distribution, 50% of the users are concentrated in the central 30% of the area. In this subsection, we simulate the average reward of users and the total reward budget for users’ contributions at different times, as shown in Figures 21 and 23. Compared with the fixed reward in PPA, our two TLCI algorithms adapt better, adjusting the users’ reward according to the number of users. The overall trend is that the average reward declines as the number of users grows, as shown in Figure 21, which can be interpreted as the payment in exchange for participation. The number of participants and the total reward at different times are shown in Figures 22 and 23: the total reward and the total number of participants increase when there are few users, so that more sensor data can be collected. In addition, the total cost of the system is reduced by 34.20% and 14.89% in TLCI1 and TLCI2, respectively. Figure 24 shows the cumulative total reward over time, which demonstrates the effectiveness of our two algorithms and shows that TLCI2 can reduce the cost. In addition to the total reward, we also study the sensed data over time in Figure 25. Figure 26 shows the redundancy of sensed data over time, defined as the difference between actual and expected sensed data as a percentage of the expected data. Its trend is similar to that of the sensed data over time. In Figure 27, we calculate the cumulative sensed data over time, and we can see that the sensed data in PPA exceed the demand, because the fixed reward in PPA adapts poorly compared with our two TLCI algorithms.

5.3.3. The Comparison of Three Algorithms

In this subsection, we compare the performance of three algorithms in two situations mentioned above. The results are shown as follows.

We use situation A and situation B to denote the random distribution and the concentrated distribution, respectively. We first calculate the total reward and the total sensed data over the whole day and compare them in Figures 28 and 29, respectively. According to Figure 28, our two TLCI algorithms reduce the cost of rewarding the participants compared with the fixed reward, and TLCI2 performs better. Averaged over the two TLCI algorithms, the cost of the system is reduced by 37.07% in situation A and 24.54% in situation B. Figure 30 shows that the TLCI algorithms spend the cost on useful data rather than on collecting redundant data.

Then, we compare the total redundancy of sensed data over the whole day in the two situations, and the result is shown in Figure 30. It shows the effectiveness of our two TLCI algorithms and that TLCI1 performs better than TLCI2. In addition, we compare the three algorithms on optimizing the QoISD, and the result is shown in Figure 31. Averaged over the two situations and the two TLCI algorithms, the QoISD can be optimized by 81.92%, and the total cost can be reduced by 31.38%.

5.4. The Quality of Sensed Data

In this subsection, we use the difference between the actual and expected amounts of sensed data to evaluate the performance of our two TLCI algorithms and PPA. As above, we evaluate it in two situations, the active and inactive periods of users, and the results are as follows. Figures 32, 33, and 34 show the difference between actual and expected sensor data for the three algorithms in the active period. According to the figures, the difference in the two TLCI algorithms is smaller than that in PPA, and the two TLCI algorithms perform similarly: the difference varies from −2.1 to 2.45 in TLCI1, from −3.34 to 3.12 in TLCI2, and from −9.12 to 9.27 in PPA. Figures 35, 36, and 37 lead to the same conclusion. Finally, we calculate the variance of sensed data at different times in the situation of the distribution of users, which is shown in Figure 38. The variance of sensed data in the two TLCI algorithms is smaller than that in PPA, and the two TLCI algorithms perform similarly in optimizing the quality of data. This shows the advantage of our two TLCI algorithms: they make the sensed data evenly distributed across locations.

5.5. The Stability of TLCI Algorithms

In this subsection, we mainly study the stability of the TLCI algorithms. We first study the number of iterations by changing the value of . The result is shown in Figure 39, and we can see that the number of iterations declines with increasing in TLCI2, which demonstrates that TLCI2 is faster than TLCI1. Then we study the stability of the TLCI algorithms. Our algorithms reach a steady state after a limited number of iterations.

In Figures 40 and 41, we compare the total reward and the sensed data over the iterations of our two algorithms; the results show that TLCI2 performs better in terms of both algorithm speed and cost reduction. The total cost of the system after stabilization in TLCI2 is lower than that in TLCI1. At the end of this subsection, we calculate the variance of sensed data over the iterations, as shown in Figure 42. The variance of sensed data across subregions stabilizes after a limited number of iterations, and the stable variance in TLCI2 is smaller than that in TLCI1, which is consistent with the conclusion of the previous experiment.

6. Conclusion

This paper has focused on the problem of optimizing the quality of sensor data while collecting enough sensor data in a sensor data gathering system. In such a system, the cost of each user depends on the user’s circumstances, which affects whether the user participates in the sensing task. The platform sets different rewards for task participants in different regions and time periods so that the sensed data are distributed as well as possible over location and time. Sensing systems of this kind have been developed for real applications, such as NoiseTube [55].

In this paper, we have proposed two effective incentive mechanism algorithms for motivating smartphone users to participate in smartphone sensing. Both algorithms are based on greedy strategies. The first algorithm optimizes the QoISD. The second algorithm builds on the first and spends less on rewarding the participants. Extensive simulations have been performed, and the results confirm that the two algorithms achieve our goal and that the latter is faster than the former. The QoISD can be optimized by 81.92%, and the total cost can be reduced by 31.38%.

Parameters’ Description

,:The parameter in the cost
:The smartphone users
:The participants in sensing task
:The cost of user when sensing time is
:The reward of participant
:The participation probability of a user participating in sensing task
:The number of smartphone users in subperiod and subarea
:The number of task participants in subperiod and subarea
:The reward for every participant in subperiod and subarea
:The number of participants in subperiod and subarea based on probability estimate
:The estimated amount of sensor data in subperiod and subarea
:The actual amount of sensor data in subperiod and subarea
:The actual demand of sensor data in subperiod and subarea
:QoISD in subperiod
:The cost of system for rewarding participants in subperiod
:The total demand of sensor data in subperiod
:The parameter in the target function
:The cost-efficiency of subarea .

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (61772554, 61370229, 61572526, and 61572528), the National Basic Research Program of China (973 Program) (2014CB046305), the Science and Technology Projects of Guangdong Province, China (2016B010109008 and 2016B030305004), and the Science and Technology Projects of Guangzhou Municipality, China (201604010003 and 201604010054).