
International Journal of Distributed Sensor Networks

Volume 2013 (2013), Article ID 527965, 14 pages

http://dx.doi.org/10.1155/2013/527965

## Node Selection Algorithms with Data Accuracy Guarantee in Service-Oriented Wireless Sensor Networks

College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China

Received 20 December 2012; Accepted 26 February 2013

Academic Editor: Neal N. Xiong

Copyright © 2013 Hongju Cheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The service-oriented architecture is an emerging trend for the future of wireless sensor networks, in which different types of sensors can be deployed in the same area to support various service requirements. The accuracy of the sensed data is one of the key criteria because sensed data is generally a noisy version of the physical phenomenon. In this paper, we study the node selection problem with data accuracy guarantee in service-oriented wireless sensor networks. We exploit the spatial correlation between the service data and aim at selecting the minimum number of nodes to provide services with data accuracy guaranteed. First, we formulate this problem as an integer nonlinear programming problem to illustrate its NP-hard property. Second, we propose two heuristic algorithms, namely, the Separate Selection Algorithm (SSA) and the Combined Selection Algorithm (CSA). The SSA selects nodes for each service separately, and the CSA selects nodes according to their contribution increment. Finally, we compare the performance of the proposed algorithms through extensive simulations. The results show that CSA performs better than SSA.

#### 1. Introduction

Sensing is considered one of the most important technologies, especially in the emerging big data era. Nodes with sensing ability can be deployed almost anywhere, including airspace, ground, and underwater environments, because they are cheap, simple, and small. Moreover, wireless radio allows these sensor nodes to be organized into a network, generally called a Wireless Sensor Network (WSN) [1], in which local information about the environment is sensed and reported to the base station periodically. Wireless sensor networks will clearly generate a huge amount of data as time goes on and as the number of networks increases. Accordingly, one challenging issue is how to utilize these various wireless sensor networks in the future big data era.

Current wireless sensor networks are generally data-centric or application-centric, which means that each sensor node serves one specific application. However, as the number of applications increases rapidly, heterogeneous wireless sensor networks appear, and they might be located in the same physical area while providing different data-collection functions. How to connect these heterogeneous wireless sensor networks efficiently is still an open problem in the future ubiquitous computing environment, for several reasons. First, two different applications may be interested in the same collected data, and it is unnecessary to place two separate nodes with identical sensing devices but different tasks. Second, when one application is concerned with several different types of sensed data simultaneously, such a requirement cannot be guaranteed since current solutions work only separately. Finally, the emergence of powerful sensors, which can sense different types of data, has introduced new issues in the research of wireless sensor networks since they can support different applications simultaneously.

Accordingly, the service-oriented architecture has emerged as a trend for the future of wireless sensor networks, in which sensor networks are considered the service providers and sensors the data sources for these services [2]. Users and programmers can access service-oriented wireless sensor networks through a simple service-oriented interface that encapsulates the low-level implementation details. Services in wireless sensor networks may be sensing capabilities, for example, temperature and humidity, or software components provided by nodes, for example, in-network data aggregation, time synchronization, and data processing [2–5]. Similarly, the data sensing of nodes can be defined as sensing services, and the sensed data as service data. Furthermore, sensor nodes can be equipped with multiple types of sensing units to collect environmental information. For example, the MICA2 mote [6] can provide services such as light, sound, and vibration.

In the heterogeneous applications scenarios, different types of sensors can be deployed in the same area to support various service requirements. Figure 1 shows an example of a service-oriented wireless sensor network including eight sensor nodes. The sensed sources are assumed to be located at positions and , and each sensed target is assumed one different service. Nodes , , and can only support service , and nodes , , and support service . However, nodes and can support service and simultaneously. In this example, there are several ways to select nodes while providing the services and by assuming that at least two nodes are required for each service, that is, nodes , for service and nodes , for , or nodes , for both of them simultaneously. Note that sensor networks are generally deployed in dense manner and a huge number of nodes can be provided to collect data from the environment. It leads to the problem of how to select nodes in an efficient way to provide the required services.

The accuracy of the sensed data is one of the key criteria when choosing nodes to provide a required service. Sensor nodes are generally designed to be cheap and simple, and they are mostly deployed in dynamic and rough terrain for continuous environmental monitoring. The sensing equipment on sensor nodes is therefore expected to be unreliable and the collected data to be distorted. Furthermore, some physical attributes exhibit a gradual and continuous variation over the two-dimensional Euclidean space due to the diffusion property, and, thus, each observer obtains differently distorted data. Collecting all related data around the same sensing source can help to eliminate or minimize the distortion. However, it might lead to heavy energy consumption in the network. Some applications are concerned only with approximate observations rather than exact results [7, 8], and in this case it is unnecessary to gather data from all nodes in the network. It should also be mentioned that the diffusion property of physical attributes results in spatial correlation among the sensed data observed by nodes close to the same sensing source. Exploiting this spatial correlation can help to improve the network performance by selecting a subset of nodes to provide the required service with data accuracy guaranteed [9, 10].

Although the service-oriented architecture for wireless sensor networks has been introduced in related works [4, 5, 11–14], most of them are concerned with the practical architecture framework, such as middleware and platforms. Some works [13, 14] considered node scheduling to support service queries while prolonging network lifetime, and others are concerned with the spatial correlation among the sensed data without considering services [15–17]. Providing accurate service to users is an important issue, and accuracy is generally one criterion for applications. In this paper, we focus on the heterogeneous service supporting problem in future wireless sensor networks and aim at providing efficient node selection algorithms with data accuracy guarantee for service-oriented sensor networks. Different from previous works, we consider the data accuracy for services by following the observation that sensed data is a noisy version of the physical phenomenon, and we explore the data accuracy according to the spatial correlation of data for the same sensing source.

Due to the inaccuracy and spatial correlation of the sensed data, it is a new and challenging issue to provide the required services with data accuracy guarantee in an *energy-efficient* way for service-oriented wireless sensor networks. As far as we know, this is the first paper concerned with both the data inaccuracy and the spatial correlation in the sensor networks, and we aim at providing node selection algorithms that improve the network performance. The main contributions of this paper are summarized as follows. We propose the node selection problem with data accuracy guarantee for service-oriented wireless sensor networks via a bipartite graph and formulate it as an integer nonlinear programming problem to illustrate its NP-hard property. We also present two efficient heuristic algorithms for this problem with low time complexity, namely, the Separate Selection Algorithm (SSA) and the Combined Selection Algorithm (CSA).

The rest of this paper is organized as follows. In Section 2, we summarize the related works. Section 3 describes the system model and the problem formulation. Sections 4 and 5 introduce the integer nonlinear programming formulation and the two heuristic selection algorithms. Section 6 describes and analyzes the simulation results. We conclude this paper in Section 7.

#### 2. Related Works

This paper focuses on the node selection problem in service-oriented wireless sensor networks. Several works have developed service-oriented architectures specific to sensor networks, and the architecture has shown many advantages in heterogeneous application scenarios. Gračanin et al. [2] proposed a service-centric model that focuses on the services provided by a wireless sensor network and views the wireless sensor network as a service provider. This model consists of mission, network, region, sensor, and capability layers. Within each layer, there are four planes or functionality sets: communication, management, application, and generational learning. Rezgui and Eltoweissy [4] introduced the service-oriented architecture as an approach for building a new generation of open, efficient, interoperable, scalable, application-aware sensor-actuator networks. In this vision, sensor-actuator networks would not be deployed to provide sensing and actuation capabilities to a specific application but, rather, to provide sensing and actuation services to any application. King et al. [5] developed a service-oriented sensor and actuator platform called Atlas, which enables self-integrative, programmable pervasive spaces. This platform has shown the advantage of improving communication and interoperability between heterogeneous devices in pervasive computing environments. The authors of [11] proposed TinySOA, a service-oriented architecture that allows programmers to access wireless sensor networks from their applications by using a simple service-oriented API via the language of their choice. The main advantage of TinySOA is that it relieves application developers from dealing with the low-level technical details of the wireless sensor networks to obtain sensor data. Corchado et al. [12] proposed a service-oriented telemonitoring system for healthcare using heterogeneous wireless sensor networks, which aimed at improving healthcare and assistance for dependent people.

It is an important issue to provide services efficiently with resource-constrained sensor nodes. Node scheduling is considered an efficient technique to implement service-supporting schemes, in which sensor nodes are selected to provide the requested services. Recently, Wang et al. [13] investigated the service-availability-aware sleep scheduling design in service-oriented wireless sensor networks. The purpose of this study is to minimize the energy consumption and guarantee that enough sensors are active to ensure service availability at all times. The authors proved this problem to be NP-hard and presented heuristic linear-programming-based solutions. However, they assumed that each service has a known requirement on the number of active sensors based on the historical service composition requests in the system, which may not be the case in practice. Furthermore, they only consider the sleep scheduling design for the sensors in the service provider overlay network and neglect the routing cost of service data. The authors of [14] try to identify the service composition that is less likely to become invalid in the near future due to nodes going into sleep mode. The goal is to minimize the recomposition cost. They use dynamic programming to reduce the total service composition cost when the minimum number of required service composition solutions is derived. However, dynamic programming is unsuitable for large-scale problems.

The distributed nature of wireless sensor networks results in spatial correlation among the sensed data, and data accuracy is accordingly influenced by this spatial correlation. Under different assumptions, researchers have proposed several mathematical models for spatial correlation in wireless sensor networks. Some [18] assume that the sensed data follow the diffusion property, and some [17] use an empirically obtained approximation function for the joint entropy of sensed data. The most commonly used model is the jointly Gaussian model [9, 19, 20], which assumes the data to be jointly Gaussian with the correlation being a function of the distance. The jointly Gaussian model is easy to use and analyze. However, its chief limitation is that it forces the joint probability density function of the data values to be jointly Gaussian. Some researchers [21] use variograms to analyze spatial correlation in wireless sensor networks. The proposed model is Markovian in nature and can capture correlation in data irrespective of the node density, the number of source nodes, or the topology. Furthermore, this model derives the data value at a node from other correlated nodes whose data values have already been derived. However, it is not always the case that a given spatial process will be Markovian. Others proposed correlation models for specific applications, such as soil moisture measurement in wireless underground sensor networks [10]. The presence of spatial correlation among sensor network data has been exploited for solving different problems. The authors in [15] proposed a traffic model for wireless sensor networks, which takes into account the statistical patterns of node mobility and spatial correlation. In [16, 17], spatial correlation was used to design energy-efficient data aggregation algorithms. Ma et al. [16] proposed a distributed clustering algorithm based on dominating set theory to choose the cluster head nodes and construct clusters by measuring the spatial correlation between sensors. Pattem et al. [17] studied the correlated data gathering problem and followed the idea of using an empirically obtained approximation function for the joint entropy of sources.

It is important and challenging to provide different services with data accuracy guaranteed through unreliable sensor nodes. Fault tolerance is one of the most important techniques and has been taken into consideration in many works [22–27]. Han et al. [22] addressed the problem of deploying the minimum number of relay nodes to achieve diverse levels of fault tolerance with higher network connectivity in the context of heterogeneous wireless sensor networks. However, they adopted a network model in which nodes possess different transmission radii, while all of the relay nodes use an identical transmission radius. Banerjee et al. [23] investigated an event detection scheme with fault tolerance for multiple events occurring simultaneously. They proposed a polynomial-based scheme that addresses the problem of event region detection by using an aggregation tree of sensor nodes. However, their work is limited to static sensors, and the network topology cannot adapt to the dynamic nature of simultaneous events with varied priorities.

Our work in this paper is concerned with node selection algorithms and is therefore related to works that deal with node selection and assignment problems. Cai et al. [28] addressed the multiple directional cover sets problem of organizing the directions of sensors into a group of nondisjoint cover sets to extend the network lifetime. Directional sensors differ from common sensors in that they have a limited angle of sensing range. The authors proved that this problem is NP-complete and presented three heuristic approaches. Lin et al. [29] proposed an adaptive energy-efficient multisensor scheduling scheme for collaborative target tracking in wireless sensor networks. The challenging issue of this problem is how to achieve energy efficiency and track reliability while satisfying the tracking accuracy requirement. In their algorithm, a number of sensors are selected to form a temporary tasking cluster, and the optimal sampling interval is determined to satisfy the given tracking accuracy. Johnson et al. [30] considered the sensor-mission assignment problem in wireless sensor networks. In this problem, multiple missions compete for sensor resources. They showed that this problem is NP-hard even to approximate and presented several heuristic algorithms. Liu et al. [31] studied the topology control problem using a probabilistic network model. They attempted to find a minimal transmission range for each node while the global network reachability satisfies a certain threshold. Different from these previous works, we aim at providing efficient node selection algorithms with data accuracy guaranteed for service-oriented wireless sensor networks by exploring the spatial correlation among the sensed data and the advantages of diverse services provided by different sensor nodes.

#### 3. System Model and Problem Formulation

In this section, we first introduce the network model for service-oriented wireless sensor networks. Second, we describe the spatial correlation model for single as well as multiple services in the network. Finally, we define the node selection problem with data accuracy guaranteed in service-oriented wireless sensor networks.

##### 3.1. Network Model

We consider a wireless sensor network in the plane with stationary nodes , which are built to provide a series of services . Each node can provide one or more service , which is a subset of ; that is, . One service, for example, , can be provided by a group of nodes , and . It is obvious that the set size demonstrates the number of nodes in the network which can provide service . The relationship between nodes and services can be further described as a bipartite graph , where denotes the set of nodes, denotes the set of services, and denotes the set of edges. There is an edge between and in case that can provide service ; that is, . Figure 2 has shown an example of the proposed model for service-oriented sensor network with five nodes and each service is supported by three distinct nodes.
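The node-service relationship described above can be sketched as a simple mapping. The following is a minimal illustration, not the paper's data structure: the node names `v1`..`v8`, the service names `s1` and `s2`, and the helper functions are all hypothetical, and the capability layout only mirrors the eight-node example from the introduction (three nodes per single service, two nodes supporting both).

```python
from collections import defaultdict

# Hypothetical capability map: node id -> set of services it can provide.
node_services = {
    "v1": {"s1"}, "v2": {"s1"}, "v3": {"s1"},
    "v4": {"s2"}, "v5": {"s2"}, "v6": {"s2"},
    "v7": {"s1", "s2"}, "v8": {"s1", "s2"},
}


def service_providers(node_services):
    """Invert the map: service -> set of nodes that can provide it."""
    providers = defaultdict(set)
    for node, services in node_services.items():
        for service in services:
            providers[service].add(node)
    return dict(providers)


def bipartite_edges(node_services):
    """List the (node, service) edges of the bipartite graph."""
    return sorted((n, s) for n, svcs in node_services.items() for s in svcs)


providers = service_providers(node_services)
edge_list = bipartite_edges(node_services)
```

The size of `providers[s]` is the number of nodes able to provide service `s`, which corresponds to the set size mentioned in the text.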

For convenience, the symbols used in this work are summarized in Table 1.

##### 3.2. Spatial Correlation Model

Researchers have proposed several mathematical models for spatial correlation in wireless sensor networks under different assumptions. Pattem et al. [17] proposed to use an empirically obtained approximation function for the joint entropy of sensed data. In [18], the sensed data is assumed to follow the diffusion property, and the diffusion is formulated as a function of the distance. The jointly Gaussian model is adopted in many related works [9, 19, 20]; it assumes the data to be jointly Gaussian with the correlation as a function of the distance. This model is easy to use and analyze, at the cost of forcing the joint probability density function of the data values to be jointly Gaussian. Jindal and Psounis [21] analyzed the spatial correlation among sensed data in wireless sensor networks by using variograms. The proposed model is a special case of a Markov random field. In this model, the data value at a node is derived from other correlated nodes whose data values have already been derived. However, a given spatial process is not always Markovian. In this paper, we are concerned with the data accuracy under the spatial correlation model. The jointly Gaussian model proposed in [9] considers the measurement noise of nodes and gives a distortion function, which makes it suitable for our problem.

In sensor networks, the observation result of each node is in fact a noisy version of the physical phenomenon located at the sensing source, and it can be modeled as a Gaussian random variable of zero mean and variance ; that is, (). Similarly, the sensed data for the physical phenomenon at node can also be modeled as a Gaussian variable , . Assume that the sensed data for node is denoted as accordingly. The correlations between and , and are described as where denotes the Euclidean distance between and the sensing source , denotes the Euclidean distance between and , is the covariance function concerned with the Euclidean distance and it is formulated as where controls the correlation between the distances of sensor nodes. In addition, we can see that = 1 in case that , and in case that .
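To make the role of the covariance function concrete, here is a minimal sketch that assumes an exponential kernel, a common choice in jointly Gaussian correlation models. The exact functional form used in the paper is not recoverable from this text, so the form `exp(-d / theta)`, the function name `covariance_kernel`, and the default `theta` are assumptions for illustration only.

```python
import math


def covariance_kernel(distance, theta=50.0):
    """Assumed exponential kernel: 1 at distance 0, decaying toward 0.

    `theta` (hypothetical default) controls how quickly correlation
    falls off with distance; larger values mean slower decay.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-distance / theta)


at_source = covariance_kernel(0.0)
nearby = covariance_kernel(10.0)
far_away = covariance_kernel(1e6)
```

Under this assumption the kernel equals 1 at zero distance and vanishes as the distance grows, which matches the two limiting cases stated in the text.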

The collected data by the sensor node is often subject to noise interference originated from the environment, and it can be represented as where is the additive white Gaussian noise, (). We assume that the noise that each sensor node encounters is independent of each other.

According to [9], the distortion of the estimation for is formulated as where () is the number of sensor nodes.

We use to normalize , and the estimated data accuracy is calculated as where denotes the Signal-to-Noise Ratio (SNR).
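The distortion and accuracy computation can be sketched from first principles. This is a derivation under stated assumptions, not a transcription of the paper's equations: it assumes a zero-mean source with variance `sigma_s2`, independent noise with variance `sigma_n2` at every node, an exponential correlation kernel with hypothetical parameter `theta`, a sink that scales each reading by `sigma_s2 / (sigma_s2 + sigma_n2)` and averages the results, and an accuracy defined as one minus the distortion normalized by the source variance.

```python
import math


def correlation(d, theta=50.0):
    """Assumed exponential correlation kernel."""
    return math.exp(-d / theta)


def estimated_accuracy(source, nodes, sigma_s2=1.0, sigma_n2=0.1, theta=50.0):
    """Accuracy = 1 - E[(S - S_hat)^2] / sigma_s2 for the averaging estimator."""
    m = len(nodes)
    if m == 0:
        return 0.0  # no estimate: distortion equals the source variance
    gain = sigma_s2 / (sigma_s2 + sigma_n2)
    # Sum of correlations between the source and each selected node.
    cross = sum(correlation(math.dist(source, p), theta) for p in nodes)
    # Sum of correlations over ordered pairs of distinct selected nodes.
    pair = sum(
        correlation(math.dist(p, q), theta)
        for i, p in enumerate(nodes)
        for j, q in enumerate(nodes)
        if i != j
    )
    e_s_shat = (gain / m) * sigma_s2 * cross                       # E[S * S_hat]
    e_shat2 = (gain / m) ** 2 * (m * (sigma_s2 + sigma_n2) + sigma_s2 * pair)
    distortion = sigma_s2 - 2.0 * e_s_shat + e_shat2
    return 1.0 - distortion / sigma_s2


ideal = estimated_accuracy((0.0, 0.0), [(0.0, 0.0)], sigma_n2=0.0)
noisy = estimated_accuracy((0.0, 0.0), [(0.0, 0.0)], sigma_n2=0.5)
near = estimated_accuracy((0.0, 0.0), [(10.0, 0.0)])
far = estimated_accuracy((0.0, 0.0), [(100.0, 0.0)])
```

After normalization the result depends on the two variances only through their ratio, which is consistent with the text expressing the accuracy in terms of the SNR.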

##### 3.3. Problem Definition

In this paper, we study the problem of node selection in service-oriented wireless sensor networks with the data accuracy requirement guaranteed for the services. The number of selected nodes is taken as the optimization objective for the following reasons. First, fewer packets are transmitted in the network if we select fewer nodes to provide services, which also helps to reduce the energy consumption. Second, keeping too many nodes awake increases the collision probability in a contention-based wireless network, which causes significant retransmission cost and additional delay. Finally, allowing one node to provide multiple services simultaneously helps to reduce the overhead of data transmission. If the data of different services is correlated, it can be compressed into a smaller packet; even in the uncorrelated case, it can still be transmitted in a single packet, which reduces the overhead in the network [32].

In this paper, we aim at providing node selection strategies that minimize the number of selected nodes. Let be the data accuracy requirement of each service , one subset of nodes selected from to provide service , , and the estimated data accuracy of service when service is provided by nodes in set . The node selection problem for service-oriented wireless sensor networks can be defined as follows: given a bipartite graph , in which denotes the set of nodes, denotes the set of services, and denotes the set of edges between and , and given the data accuracy requirement of each service and the spatial correlation among these nodes and the corresponding sensing sources, the problem is to find a subgraph , , , and the objective is to minimize the number of selected nodes in under the constraint that is satisfied for each service .

#### 4. Integer Nonlinear Programming Formulation

In this section, we present an Integer Nonlinear Programming (INLP) formulation for the node selection problem. Integer programming is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. INLP is the case of integer programming in which some of the constraints or the objective function are nonlinear. Because the data accuracy constraint is nonlinear, INLP is a natural way to express the node selection problem. The objective is to minimize the total number of selected nodes under the nonlinear data accuracy constraint. We use the following sets of binary integer (0 or 1) variables and constraints in the INLP formulation.

(1) Variables for each node and service . The variable is 1 if and only if ; that is, node can provide service : The variables can be obtained when the topology graph and the set of services provided by nodes are given. Obviously, the selected nodes for services shall be selected from these nodes with .

(2) Variables for each node and service . The variable is 1 if and only if node is assigned to provide service : As we can see, the set indicates one scheduling scheme for the given wireless sensor network, in which each node is assigned to provide service if . Note that equals 0 in case that , which means that node cannot be assigned to provide service since it is not supported. In this way, we have the following constraint: Let be the estimated data accuracy of service , and can be obtained via formula (5), which can be further described as follows: where and denotes the number of selected nodes which are assigned to provide service .

In order to satisfy the required data accuracy for all services in the network, we have the following constraint:

(3) Variables . The variable is 1 if and only if node is selected to provide services required in the network: By following the above definition, equals 0 if node is not selected to provide any service, and otherwise it equals 1. As mentioned above, the variable denotes the case that node can provide service or not, and we have the following constraint: The objective of node selection problem is to minimize the total number of nodes that are selected to provide the required services, and the number of selected nodes can be calculated as .

Then the node selection problem discussed in this paper can be formulated as minimizing the number of selected nodes subject to the constraints given above.

In this section, we have introduced an Integer Nonlinear Programming (INLP) formulation for the node selection problem. The INLP gives a precise description of the problem, and it is useful for finding the optimal solution with well-known tools such as LINGO and MATLAB when the solution space is small enough. However, integer programming is NP-hard even when all constraints are linear, and the nonlinear accuracy constraint does not make the problem easier. Many related works have been proposed to find suboptimal solutions for a given INLP problem [33–35]. Although INLP performs well in practical applications, it has well-known defects such as high computation and space complexity, especially when the solution space is very large. Unfortunately, a wireless sensor network generally includes hundreds to thousands of nodes, and the number of candidate assignments of the binary variables grows exponentially with the number of nodes. Another problem is that, for a random network, it is hard to gather all the constraints mentioned above since the nodes are heterogeneous and the constraints differ from node to node. Moreover, in practical applications, it is almost impossible for the sink to obtain *accurate* information in a harsh environment, because the sensed data is generally a noisy version of the physical phenomenon. It is more practical to adopt a suboptimal solution with the data accuracy guaranteed than an optimal one that is hard to find for an NP-hard problem. It is therefore reasonable and necessary to develop heuristic algorithms for the node selection problem in service-oriented wireless sensor networks.
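For very small instances, the optimum can be found by plain enumeration, which also shows why the exact approach does not scale. The sketch below is an illustration under simplifying assumptions and does not use an INLP solver: accuracy contributions are taken to be additive (the same simplification the paper adopts for its Figure 3 example), every selected node serves all services it supports, and the instance (`gains`, `required`, node names `v1`..`v4`) is hypothetical.

```python
from itertools import combinations

EPS = 1e-9  # tolerance for floating-point comparisons


def min_node_set(gains, required):
    """Return a smallest node set meeting every requirement, or None.

    gains[node][service] is the (assumed additive) accuracy contribution;
    required[service] is the accuracy requirement. The search visits
    subsets in increasing size, so it is exponential in the node count.
    """
    nodes = list(gains)
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if all(
                sum(gains[v].get(s, 0.0) for v in subset) >= req - EPS
                for s, req in required.items()
            ):
                return set(subset)
    return None


# Hypothetical four-node, two-service instance.
gains = {
    "v1": {"s1": 0.5, "s2": 0.0},
    "v2": {"s1": 0.4, "s2": 0.4},
    "v3": {"s1": 0.4, "s2": 0.4},
    "v4": {"s1": 0.0, "s2": 0.5},
}
required = {"s1": 0.8, "s2": 0.8}
best = min_node_set(gains, required)
```

With n nodes the loop may examine up to 2^n subsets, which is manageable for a handful of nodes but not for networks with hundreds of them.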

#### 5. Heuristic Algorithms

Heuristic algorithms are generally considered an important way to solve NP-hard problems. In this section, we propose two heuristic algorithms for the node selection problem in service-oriented wireless sensor networks, namely, the Separate Selection Algorithm (SSA) and the Combined Selection Algorithm (CSA).

##### 5.1. Separate Selection Algorithm (SSA)

The basic idea of the Separate Selection Algorithm (SSA) is to select, for each required service separately, a minimum number of nodes that guarantees its data accuracy; the union of the nodes selected for all services is taken as the solution to the problem. The key step in SSA is how to select nodes for each service. Nodes are selected sequentially, and in each step we choose the node with the greatest potential to improve the data accuracy.

Assume that the current node selection solution is , in which , is the set of services, and . Let us consider the general case that one node, that is, is considered to provide one special service, that is, . Here we use to denote the data accuracy increment for service in case that node is selected to provide service , where . can be calculated as follows.

(1) In case that , we have , which means that the data accuracy requirement for service has already been satisfied that there is no more improvement once nodes and are added into the final solution.

(2) In case that , we have , which means that the data accuracy of service cannot be increased once nodes and are added into the final solution.

(3) In case that , we have , which denotes the increment of the data accuracy for service in case that nodes and are added into the final solution.

(4) In case that , we have , which means that we generally neglect the part of increment that exceeds the requirement since the solution is required only to provide the asked data accuracy.
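The four cases above can be read as a capped difference between the accuracy before and after adding the node. The following sketch encodes that reading; the function name and its three arguments (`old`, `new`, `required`) are hypothetical stand-ins for the symbols lost from the text, so this is an interpretation rather than the paper's exact definition.

```python
def accuracy_increment(old, new, required):
    """Capped data accuracy increment for one service.

    old      -- accuracy with the currently selected nodes
    new      -- accuracy after also adding the candidate node
    required -- accuracy requirement of the service
    """
    if old >= required:
        return 0.0             # case 1: requirement already satisfied
    if new <= old:
        return 0.0             # case 2: the candidate does not help
    if new <= required:
        return new - old       # case 3: full increment counts
    return required - old      # case 4: ignore the part above the requirement


case1 = accuracy_increment(0.8, 0.9, 0.8)
case2 = accuracy_increment(0.5, 0.4, 0.8)
case3 = accuracy_increment(0.5, 0.7, 0.8)
case4 = accuracy_increment(0.5, 0.9, 0.8)
```

Capping the increment keeps the selection from favoring a node merely because it overshoots a requirement that a weaker node would also meet.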

The pseudocode for SSA is listed in Algorithm 1. In Line 1, the set of selected nodes for each service is initially set as . In Lines 2–9, the algorithm tries to select nodes for each service . While the current data accuracy does not satisfy the data accuracy requirement, that is, , we first check all the candidate nodes that can improve the data accuracy. Second, we select the node with the maximum data accuracy increment as the candidate (Line 4). Finally, the selected node is added into to provide service (Line 6). This process continues until enough nodes are selected for all services.
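The per-service greedy loop described above can be sketched as follows. This is a simplified reading of SSA, not the paper's Algorithm 1: accuracy contributions are assumed to be additive, ties are broken deterministically (first by preferring a node already chosen for another service, then by input order, where the paper mentions a random choice), and the instance and all names are hypothetical.

```python
EPS = 1e-9  # tolerance for floating-point comparisons


def ssa(gains, required):
    """Select nodes service by service; returns service -> list of nodes."""
    selected = {s: [] for s in required}
    for s, req in required.items():
        acc = 0.0
        while acc < req - EPS:
            used_elsewhere = {
                v for t, vs in selected.items() if t != s for v in vs
            }

            def capped(v):
                # Increment toward the requirement, ignoring any overshoot.
                return min(acc + gains[v].get(s, 0.0), req) - acc

            candidates = [
                v for v in gains if v not in selected[s] and capped(v) > EPS
            ]
            if not candidates:
                raise ValueError(f"requirement of {s} cannot be met")
            best = max(candidates, key=lambda v: (capped(v), v in used_elsewhere))
            selected[s].append(best)
            acc += gains[best][s]
    return selected


# Hypothetical four-node, two-service instance with additive contributions.
gains = {
    "v1": {"s1": 0.5, "s2": 0.0},
    "v2": {"s1": 0.4, "s2": 0.4},
    "v3": {"s1": 0.4, "s2": 0.4},
    "v4": {"s1": 0.0, "s2": 0.5},
}
required = {"s1": 0.8, "s2": 0.8}
result = ssa(gains, required)
chosen_nodes = set().union(*result.values())
```

On this instance the sketch ends with three distinct nodes, because each service first grabs its single strongest node before a shared node is considered.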

We illustrate the SSA algorithm with the example given in Figure 3. As we can see from Figure 3(a), the network has four nodes and each service is supported by three distinct nodes. The required data accuracy for each service is listed on the right side; for example, the required data accuracy of is 0.8, and the data accuracy increment of each node when the service is provided only by this node is listed on the left side; for example, the data accuracy increment of is 0.5 for and 0.0 for . To simplify the description, we assume that the data accuracy increments of nodes can be added directly; that is, the data accuracy of provided by and is that equals 0.9. The algorithm will select nodes for and then . Node with the maximum data accuracy increment for will be selected first. Once the algorithm has selected , both node and can guarantee the required data accuracy of , and we assume that the randomly selected one is . Similarly, service will first select . Once the algorithm has selected , both and can guarantee the required data accuracy of . However, is preferred by because it is already selected by . After that, the selected nodes guarantee the required data accuracy of and , and the algorithm stops. As we can see from the final solution shown in Figure 3(b), the SSA algorithm selects three nodes to provide and with one node reduced.

##### 5.2. Combined Selection Algorithm (CSA)

The previously proposed SSA selects nodes for each service separately, based on the criterion of data accuracy increment. However, some nodes in a wireless sensor network provide several services simultaneously, and this multiservice property can improve node selection if a single multiservice node is selected to improve the data accuracy required by different services. As Line 4 of Algorithm 1 shows, SSA prefers a candidate node that has already been chosen to provide some other service, which means that multiservice nodes are favored during its node selection process. However, this separate selection process is not always efficient. For example, the sample network in Figure 3 admits a better solution than the one found by SSA. If we do not select and , which have the maximum data accuracy increment for one particular service, but instead select and , which can provide services and simultaneously, this solution still guarantees the required data accuracy of and . Moreover, it uses fewer nodes than SSA, because it selects only two of the four nodes. In this section, we introduce a new Combined Selection Algorithm (CSA) that exploits the multiservice property for the node selection problem in service-oriented wireless sensor networks.

In this paper, we aim to minimize the number of selected nodes while guaranteeing the data accuracy of all services in the network. Two factors influence the number of selected nodes: the number of services that each node can provide and the service quality of each node. Intuitively, fewer nodes need to be selected when nodes can provide more kinds of services, since more nodes are then potential candidates during the selection process. However, nodes that are far away from the sensing source might have poor data accuracy even though they can provide the required services. Therefore, the node selection process must consider the data accuracy and the number of services simultaneously.

The basic idea of the CSA is as follows. Initially, no nodes are selected and the final solution is an empty set. Then, nodes are chosen and added to the final solution one at a time. In each step, we select the node with the maximal *contribution increment* to all services in the network (discussed in detail later in this section). This process continues until the data accuracy requirement of every service is satisfied; the result is the set of selected nodes together with the services provided by each node.

Assume that is considered to provide services in the current selection bipartite graph . In case that , it is obvious that there is no benefit for node to provide , and we have . In case that , we can see that node helps to improve data accuracy of service , and we have . In this way, for a given , we can calculate the for each . Generally, we intend to choose the node that has more contribution increment to the data accuracy. Here we have introduced () as a coefficient to demonstrate the impact of current contribution increment on the final data accuracy, and it is formulated as As we can see from the above formulation, in case that , which shows that node has no contribution to the data accuracy of service . Here we adopt the power exponential function to indicate how much the data accuracy is close to the requirement value.

Let be the contribution increment of node with the current selection bipartite graph when is selected to provide services; it is formulated as where () is a coefficient and denotes the data accuracy increment for service when node is selected to provide service , which is the same as that in Section 5.1.

So far we have introduced the basic node selection process of the CSA. However, the algorithm can be further optimized. When the algorithm selects the node with the maximum contribution increment, this node provides every service whose data accuracy it helps to improve. However, a multiservice node with the maximum contribution increment might have a poor data accuracy increment for some services. Although such a node still improves the data accuracy, the increment is so small that more nodes must be selected to guarantee the required data accuracy. Therefore, we can further reduce the number of nodes by removing some already selected nodes that have a poor data accuracy increment for some services. Consider an example in which two services are supported by two nodes: can support and , can support , and the required data accuracy for both and is 0.8. Suppose that the data accuracy of provided by is 0.8, the data accuracy of provided by is 0.3, provided by is 0.8, and provided by and together is 0.7. In the first selection, the value of is larger than according to formula (15), so is selected and provides services and . In the next selection, is selected and provides service . However, the data accuracy of provided by and is lower than that provided by alone. Clearly, we can improve the data accuracy of service if does not provide . Note that such “bad” assignments (i.e., assigning nodes to provide services for which they have a poor data accuracy increment) cannot be eliminated during the selection process, because at the time they are made they still help to improve the data accuracy. After selecting a new node, we can therefore check all the selected nodes to find the “bad” assignments made in previous selections. The basic idea of the optimization process is described as follows.
When a service, for example, , has been assigned a new node, we first calculate for each node , and select the one with maximum and ; after that, we let stop providing . This process continues until no more nodes of this kind can be found in .
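The optimization pass can be illustrated with the two-node example above. The sketch below is one possible reading of the step, under stated assumptions: for each assigned node it evaluates the service's accuracy without that node, and it removes the node whose removal leaves the highest accuracy as long as the accuracy does not decrease. The names `a`, `b`, `s1`, `s2`, the lookup table, and the `prune_service` helper are hypothetical.

```python
from typing import Callable, FrozenSet, Set

AccuracyFn = Callable[[str, FrozenSet[str]], float]


def prune_service(service: str, assigned: Set[str],
                  accuracy: AccuracyFn) -> Set[str]:
    """Drop nodes from `assigned` while doing so does not lower the accuracy."""
    kept = set(assigned)
    while len(kept) > 1:
        current = accuracy(service, frozenset(kept))
        # Accuracy of the service if each node in turn stopped providing it.
        without = {n: accuracy(service, frozenset(kept - {n}))
                   for n in sorted(kept)}
        node, best = max(without.items(), key=lambda item: item[1])
        if best < current:
            break  # every removal would hurt, so nothing is left to prune
        kept.remove(node)
    return kept


# Accuracy values taken from the example in the text; node and service
# names are placeholders for the symbols lost from the original.
TABLE = {
    ("s1", frozenset({"a"})): 0.8,
    ("s2", frozenset({"a"})): 0.3,
    ("s2", frozenset({"b"})): 0.8,
    ("s2", frozenset({"a", "b"})): 0.7,
}


def table_accuracy(service: str, nodes: FrozenSet[str]) -> float:
    return TABLE.get((service, nodes), 0.0)


pruned = prune_service("s2", {"a", "b"}, table_accuracy)
print(pruned)
```

Here the first node is released from the second service, which raises that service's accuracy from 0.7 to 0.8 without selecting any additional node.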

The pseudocode for CSA is listed in Algorithm 2. In Line 1, the set of selected nodes for each service is initialized as . In Lines 2–15, nodes are selected sequentially until the data accuracy is guaranteed for all services. While the current data accuracy of some service does not satisfy its requirement, we first calculate the contribution increment of every candidate node and select the node with the maximum contribution increment as the candidate (Line 4). Next, the candidate node is assigned to provide each service whose data accuracy it helps to improve (Line 7). Finally, we check all the nodes in and remove nodes from whose removal does not decrease the data accuracy of service ; this subprocess continues until no more nodes of this kind can be found in (Lines 8–12). The node selection process continues until enough nodes have been selected for all services.
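The overall loop can be sketched as follows. This is only an approximation of the procedure described above: the contribution increment is replaced by an unweighted sum of a node's positive accuracy increments over the still-unsatisfied services, which is an assumption and not the weighted formula of the paper. The `accuracy` callback and the example numbers are likewise hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Set

AccuracyFn = Callable[[str, FrozenSet[str]], float]


def csa(providers: Dict[str, Set[str]], required: Dict[str, float],
        accuracy: AccuracyFn) -> Dict[str, Set[str]]:
    """Repeatedly add the node that helps all unsatisfied services the most."""
    selected: Dict[str, Set[str]] = {s: set() for s in providers}
    nodes = sorted(set().union(*providers.values()))

    def acc(s, extra=frozenset(), drop=frozenset()):
        return accuracy(s, frozenset((selected[s] | extra) - drop))

    def gains(n):
        # Positive increments of node n for services that are still unsatisfied.
        out = {}
        for s in providers:
            if n in providers[s] and n not in selected[s] and acc(s) < required[s]:
                g = acc(s, extra={n}) - acc(s)
                if g > 0:
                    out[s] = g
        return out

    while any(acc(s) < required[s] for s in providers):
        cand = {n: gains(n) for n in nodes}
        # Assumed stand-in for the contribution increment: unweighted sum.
        best = max(nodes, key=lambda n: sum(cand[n].values()))
        if not cand[best]:
            raise ValueError("requirements cannot be met")
        for s in cand[best]:
            selected[s].add(best)
            # Optimization pass: release nodes whose removal does not hurt.
            while len(selected[s]) > 1:
                worst = max(sorted(selected[s]), key=lambda m: acc(s, drop={m}))
                if acc(s, drop={worst}) < acc(s):
                    break
                selected[s].remove(worst)
    return selected


# Hypothetical instance: increments are simply added and capped at 1.0.
INC = {("a", "s1"): 0.5, ("b", "s1"): 0.375, ("c", "s1"): 0.375,
       ("d", "s2"): 0.5, ("b", "s2"): 0.375, ("c", "s2"): 0.375}


def additive(service: str, nodes: FrozenSet[str]) -> float:
    return min(1.0, sum(INC.get((n, service), 0.0) for n in nodes))


providers = {"s1": {"a", "b", "c"}, "s2": {"b", "c", "d"}}
solution = csa(providers, {"s1": 0.75, "s2": 0.75}, additive)
print(solution)
```

On this instance the sketch selects only the two multiservice nodes, since each of them contributes to both services at once.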

We illustrate one round of the CSA iteration process with the example given in Figure 4. In this example, two services are supported by four nodes. The node-service bipartite graph is given in Figure 4(a), and the current node-service selection bipartite graph is given in Figure 4(b). As Figure 4(b) shows, the algorithm has selected in , so the available nodes are , , and . Suppose that the contribution increment of each available node has been calculated and that the values of , , and are 0.1, 0.3, and 0.2, respectively. The relationship between available nodes and services is given in Figure 4(c), where an edge between and indicates that the data accuracy increment of is larger than 0, that is, helps to improve the data accuracy of . In the next step, the algorithm selects the node with the maximum contribution increment, that is, in this example. According to Figure 4(c), helps to improve the data accuracy of and . Then provides and , and the new solution is given in Figure 4(d). After a new node is added to the solution, the algorithm executes the optimization process. We assume that there is one “bad” assignment ; that is, the data accuracy of provided by is not less than that provided by and . The algorithm then removes the assignment (, ) from the solution. The final solution of this round of the iteration process is shown in Figure 4(e).

##### 5.3. Complexity Analysis

Lemma 1. *The time complexity of SSA is .*

*Proof.* In the outer for loop, the algorithm selects nodes for each service , and each execution of the outer for loop contains a while loop. In the while loop, the algorithm checks each node , and and selects the node with the maximum data accuracy increment. The while loop continues until the data accuracy of is satisfied. Because there are at most nodes that can provide and each execution selects one node, the execution of the while loop takes time. Hence, the time complexity of SSA is .

Lemma 2. * The time complexity of CSA is .*

*Proof.* In the outer while loop, the algorithm first checks each node , , , and calculates the data accuracy increment for each service and the node’s contribution increment. Because each node can provide at most services and each execution selects one node with the maximum contribution increment, this process takes at most time. In the next step, the for loop is executed to assign the selected node to each service whose data accuracy it helps to improve, and there are at most services. In each execution of the inner while loop, the algorithm checks each node in and tries to find one node that can stop providing without decreasing the data accuracy of . The inner while loop continues until no more nodes of this kind can be found. Because each contains at most nodes and each execution of the inner while loop selects one node, the inner while loop takes at most time and the for loop takes . Therefore, each execution of the outer while loop takes at most time. Because at most nodes are selected, the time complexity of CSA is .

#### 6. Simulation Results and Analysis

To evaluate the behavior of the above algorithms, we rely on simulation. In this section, we first describe the simulation setup and then analyze the impact of the spatial correlation parameter and the SNR on the results. Finally, we compare the performance of SSA and CSA in different environments.

##### 6.1. Simulation Setup

We use MATLAB, a platform widely used for simulating wireless networks. The scenarios are built in a square area of 500 m × 500 m. The sensor nodes and the sensing sources are placed randomly. We assume that each sensing source is dedicated to one particular service. Given the sensor nodes and the sensing sources, the next step is to decide the set of services provided by each node. We adopt a random model in which node provides service with a given probability (); that is, provides only if a random value drawn between 0 and 1 is smaller than . We also require that each service is provided by at least one node; otherwise, the scenario is rebuilt until this constraint is satisfied. The data accuracy requirement is assumed to be identical for all services. In this work, we build 100 different scenarios and compare the average performance of the proposed algorithms.
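The scenario construction can be sketched as follows. The paper's simulations use MATLAB; this Python version is only an illustrative stand-in, and the function name `build_scenario`, its parameters, and the uniform placement are assumptions. It also assumes that a node provides a service with probability `p`, so that a larger `p` yields more services per node, which matches the discussion of this parameter in Section 6.3.

```python
import random
from typing import Dict, List, Optional, Set, Tuple

Point = Tuple[float, float]


def build_scenario(num_nodes: int, num_services: int, p: float,
                   side: float = 500.0,
                   rng: Optional[random.Random] = None
                   ) -> Tuple[List[Point], List[Point], Dict[int, Set[int]]]:
    """Place nodes and sources at random and draw the node-service relation."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if rng is None:
        rng = random.Random()
    while True:
        # Uniform random placement inside the side x side square (assumed).
        nodes = [(rng.uniform(0, side), rng.uniform(0, side))
                 for _ in range(num_nodes)]
        # One sensing source per service.
        sources = [(rng.uniform(0, side), rng.uniform(0, side))
                   for _ in range(num_services)]
        # Node n provides service s with probability p (assumed reading).
        providers = {s: {n for n in range(num_nodes) if rng.random() < p}
                     for s in range(num_services)}
        # Rebuild the scenario unless every service has at least one provider.
        if all(providers.values()):
            return nodes, sources, providers


nodes, sources, providers = build_scenario(300, 10, 0.5, rng=random.Random(1))
print(len(nodes), len(sources), min(len(v) for v in providers.values()))
```

Averaging an algorithm's result over 100 such scenarios, each built with a different seed, would mirror the evaluation procedure described above.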

##### 6.2. Impact of Spatial Correlation Parameters and SNR

In this part, we analyze the impact of the spatial correlation parameter and the SNR on the performance of SSA and CSA. Both are parameters of the spatial correlation model introduced in Section 3.2. The spatial correlation parameter describes how the correlation of the sensed data depends on the distance between sensor nodes. As formula (2) shows, a larger indicates a higher degree of spatial correlation; that is, the nodes in the network provide strongly correlated services. The SNR reflects the noise strength, which affects the distortion of a service. A larger results in less distorted sensed data; that is, the services provided by the nodes are more accurate. The two parameters and thus affect the sensed data, which in turn influences the selection results.

The first set of experiments concerns the impact of the spatial correlation parameter on the number of selected nodes. The simulation is done with 300 nodes and 10 services, with the SNR set to 10 dB and set to 0.5. The spatial correlation parameter takes the values 500, 1000, 2000, and 5000, and we study how the average number of selected nodes changes as the services’ accuracy requirement increases from 0.7 to 0.97. As Figure 5 shows, the number of selected nodes is smallest in case that , and it increases with the accuracy requirement . However, the increase is not significant until reaches a certain point. For example, the average number of selected nodes is between 1.79 and 4.34 when in case that and CSA is used to find the solution; however, it increases rapidly when . Moreover, there might not be enough nodes to meet the data accuracy requirement; for example, the maximum achievable data accuracy requirement is about 0.87 in case that .

The second set of experiments concerns the impact of the signal-to-noise ratio on the number of selected nodes, as illustrated in Figure 6. The simulation is done with 300 nodes and 10 services, with the spatial correlation parameter set to 2000 and set to 0.5. The SNR takes the values 5 dB, 10 dB, 15 dB, and 20 dB, and we study how the average number of selected nodes changes as the services’ accuracy requirement increases from 0.7 to 0.96. Again, the number of selected nodes remains stable or grows linearly when is small; however, it increases rapidly once exceeds a certain point. This conclusion is similar to that drawn from Figure 5. The energy budget is an important criterion for wireless sensor networks, and involving too many nodes in the data sensing process degrades network performance. For a given application scenario, choosing a suitable accuracy requirement as a compromise therefore helps to reduce the energy consumption.

##### 6.3. Performance Comparison between SSA and CSA

To the best of our knowledge, this is the first work on node selection algorithms with guaranteed data accuracy for service-oriented wireless sensor networks. Most of the related works [28–31] focus on different research issues, such as target tracking and topology control. Wang et al. [13] proposed a scheduling algorithm for service-oriented wireless sensor networks, but it does not consider the data accuracy. In this section, we compare the performance of SSA and CSA in different scenarios with varied accuracy requirement, number of nodes , number of services , and value of , respectively.

Figure 7 shows the number of nodes selected by SSA and CSA when the accuracy requirement varies from 0.7 to 0.95. The simulation is done with 300 nodes and 10 services, with the spatial correlation parameter set to 2000, the SNR to 10 dB, and to 0.5. The experimental results show that CSA performs better than SSA in all situations.

The second set of simulations shows the impact of the network size on the number of selected nodes. The baseline setting has 300 nodes and 10 services, with the spatial correlation parameter set to 2000, the data accuracy requirement to 0.92, the SNR to 10 dB, and to 0.5; in this set, the network size varies from 100 to 500 nodes. As Figure 8 shows, CSA performs better than SSA in all cases. Furthermore, we make two observations from Figure 8. First, the average number of selected nodes is smaller when the network is larger. This is because a larger network offers more potential candidates for a given service, which helps to reduce the number of selected nodes. Second, the number of selected nodes decreases only slightly once the network size reaches a certain point, which implies that adding more nodes to the network beyond that point does little to reduce the number of selected nodes.

The third set of simulations focuses on the impact of the number of services in the network on the number of selected nodes. We use parameters similar to those of the second set of simulations. As Figure 9 shows, CSA performs better than SSA for different values of , although the difference is not significant when is close to 5.

The fourth set of simulations focuses on the impact of the probability on the number of selected nodes, with varying from 0.1 to 0.9. We again use parameters similar to those of the second set of simulations. The parameter indirectly represents the number of services provided by the nodes in the network. As Figure 10 shows, SSA and CSA select nearly the same number of nodes when is small enough. In particular, SSA is even slightly better than CSA when . However, CSA shows better performance as the value of increases. We can also draw two conclusions from this set of experiments. First, the average number of selected nodes decreases as increases. A larger means that the selected nodes can provide more services. Thus, each node contributes more to the required services, which in turn reduces the total number of selected nodes. Second, when is large enough, for example, , the average number of selected nodes decreases slowly as increases.

#### 7. Conclusion

Providing various services is an important trend for future wireless sensor networks, and the service-oriented architecture allows different services to be supported simultaneously in the same physical area, where one sensor can provide different kinds of services. Quality of service, such as data accuracy, is one of the key criteria for applications because the sensed data is generally a noisy version of the physical phenomenon. The spatial correlation among the sensed data makes it possible to select a subset of nodes to provide the required services while the data accuracy is guaranteed, which helps to improve the performance of wireless sensor networks. In this paper, we have addressed this issue and formulated the node selection problem as an Integer Nonlinear Programming (INLP) problem. We have also developed two heuristic algorithms for the problem, namely, the Separate Selection Algorithm (SSA) and the Combined Selection Algorithm (CSA). In future work, we will develop efficient scheduling schemes for the node selection process, aiming at a solution that maximizes the network lifetime of service-oriented wireless sensor networks. Because temporal correlation is also important for optimizing network performance, we also plan to explore energy-efficient scheduling schemes that consider both spatial and temporal correlation.

#### Acknowledgments

This work is supported by Fujian Provincial Natural Science Foundation of China under Grant no. 2011J01345, the Development Foundation of Educational Committee of Fujian Province under Grant no. 2012JA12027, the National Science Foundation of China under Grant no. 61103275, and the Technology Innovation Platform Project of Fujian Province under Grant no. 2009J1007.

#### References

- I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless sensor networks: a survey,” *Computer Networks*, vol. 38, pp. 393–422, 2002.
- D. Gračanin, M. Eltoweissy, A. Wadaa, and L. A. DaSilva, “A service-centric model for wireless sensor networks,” *IEEE Journal on Selected Areas in Communications*, vol. 23, no. 6, pp. 1159–1166, 2005.
- X. Chu and R. Buyya, “Service oriented sensor web,” in *Sensor Network and Configuration: Fundamentals, Techniques, Platforms, and Experiments*, N. P. Mahalik, Ed., pp. 51–74, 2007.
- A. Rezgui and M. Eltoweissy, “Service-oriented sensor-actuator networks: promises, challenges, and the road ahead,” *Computer Communications*, vol. 30, no. 13, pp. 2627–2648, 2007.
- J. King, R. Bose, I. Y. Hen, S. Pickles, and A. Helal, “Atlas: a service-oriented sensor platform hardware and middleware to enable programmable pervasive spaces,” in *Proceedings of the 31st Annual IEEE Conference on Local Computer Networks (LCN '06)*, pp. 630–638, Tampa, Fla, USA, November 2006.
- Crossbow Technology, http://www.xbow.com/.
- J. Li and S. Cheng, “(*ε*, *σ*)-approximate aggregation algorithms in dynamic sensor networks,” *IEEE Transactions on Parallel and Distributed Systems*, vol. 23, pp. 385–396, 2012.
- C. Wang, H. Ma, Y. He, and S. Xiong, “Adaptive approximate data collection for wireless sensor networks,” *IEEE Transactions on Parallel and Distributed Systems*, vol. 23, pp. 1004–1016, 2012.
- M. C. Vuran, Ö. B. Akan, and I. F. Akyildiz, “Spatio-temporal correlation: theory and applications for wireless sensor networks,” *Computer Networks*, vol. 45, no. 3, pp. 245–259, 2004.
- X. Dong and M. C. Vuran, “Spatio-temporal soil moisture measurement with wireless underground sensor networks,” in *Proceedings of the IFIP Annual Mediterranean Ad Hoc Networking Workshop (Med-Hoc-Net '10)*, pp. 1–8, Juan Les Pins, France, June 2010.
- E. Avilés-López and J. García-Macías, “TinySOA: a service-oriented architecture for wireless sensor networks,” *Service Oriented Computing and Applications*, vol. 3, pp. 99–108, 2009.
- J. M. Corchado, J. Bajo, D. I. Tapia, and A. Abraham, “Using heterogeneous wireless sensor networks in a telemonitoring system for healthcare,” *IEEE Transactions on Information Technology in Biomedicine*, vol. 14, no. 2, pp. 234–240, 2010.
- J. Wang, D. Li, G. Xing, and H. Du, “Cross-layer sleep scheduling design in service-oriented wireless sensor networks,” *IEEE Transactions on Mobile Computing*, vol. 9, no. 11, pp. 1622–1633, 2010.
- X. Wang, J. Wang, Z. Zheng, Y. Xu, and M. Yang, “Service composition in service-oriented wireless sensor networks with persistent queries,” in *Proceedings of the 6th IEEE Consumer Communications and Networking Conference (CCNC '09)*, pp. 1–5, Las Vegas, Nev, USA, January 2009.
- P. Wang and I. F. Akyildiz, “Spatial correlation and mobility-aware traffic modeling for wireless sensor networks,” *IEEE/ACM Transactions on Networking*, vol. 19, pp. 1860–1873, 2011.
- Y. Ma, Y. Guo, X. Tian, and M. Ghanem, “Distributed clustering-based aggregation algorithm for spatial correlated sensor networks,” *IEEE Sensors Journal*, vol. 11, no. 3, pp. 641–648, 2011.
- S. Pattem, B. Krishnamachari, and R. Govindan, “The impact of spatial correlation on routing with compression in wireless sensor networks,” *ACM Transactions on Sensor Networks*, vol. 4, no. 4, article 24, 2008.
- J. Faruque and A. Helmy, “RUGGED: routing on fingerprint gradients in sensor networks,” in *Proceedings of IEEE International Conference on Pervasive Services (ICPS '04)*, pp. 179–188, Beirut, Lebanon, July 2004.
- M. C. Vuran and Ö. B. Akan, “Spatio-temporal characteristics of point and field sources in wireless sensor networks,” in *Proceedings of IEEE International Conference on Communications (ICC '06)*, pp. 234–239, Istanbul, Turkey, July 2006.
- D. Zordan, G. Quer, M. Zorzi, and M. Rossi, “Modeling and generation of space-time correlated signals for sensor network fields,” in *Proceedings of IEEE Global Telecommunications Conference (GLOBECOM '11)*, pp. 1–6, Houston, Tex, USA, December 2011.
- A. Jindal and K. Psounis, “Modeling spatially correlated data in sensor networks,” *ACM Transactions on Sensor Networks*, vol. 2, no. 4, pp. 466–499, 2006.
- X. Han, X. Cao, E. L. Lloyd, and C. C. Shen, “Fault-tolerant relay node placement in heterogeneous wireless sensor networks,” *IEEE Transactions on Mobile Computing*, vol. 9, no. 5, pp. 643–656, 2010.
- T. Banerjee, B. Xie, and D. P. Agrawal, “Fault tolerant multiple event detection in a wireless sensor network,” *Journal of Parallel and Distributed Computing*, vol. 68, no. 9, pp. 1222–1234, 2008.
- N. Xiong, A. Vasilakos, L. Yang et al., “Comparative analysis of quality of service and memory usage for adaptive failure detectors in healthcare systems,” *IEEE Journal on Selected Areas in Communications*, vol. 27, no. 4, pp. 495–509, 2009.
- N. Xiong, A. V. Vasilakos, J. Wu et al., “A self-tuning failure detection scheme for cloud computing service,” in *Proceedings of IEEE 26th International Parallel & Distributed Processing Symposium (IPDPS '12)*, pp. 668–679, Shanghai, China, May 2012.
- G. Wu, C. Lin, F. Xia, L. Yao, H. Zhang, and B. Liu, “Dynamical jumping real-time fault-tolerant routing protocol for wireless sensor networks,” *Sensors*, vol. 10, no. 3, pp. 2416–2437, 2010.
- J. Bu, M. Yin, D. He, F. Xia, and C. Chen, “SEF: a secure, efficient, and flexible range query scheme in two-tiered sensor networks,” *International Journal of Distributed Sensor Networks*, vol. 2011, Article ID 126407, 12 pages, 2011.
- Y. Cai, W. Lou, M. Li, and X. Y. Li, “Energy efficient target-oriented scheduling in directional sensor networks,” *IEEE Transactions on Computers*, vol. 58, no. 9, pp. 1259–1274, 2009.
- J. Lin, W. Xiao, F. L. Lewis, and L. Xie, “Energy-efficient distributed adaptive multisensor scheduling for target tracking in wireless sensor networks,” *IEEE Transactions on Instrumentation and Measurement*, vol. 58, no. 6, pp. 1886–1896, 2009.
- M. P. Johnson, H. Rowaihy, D. Pizzocaro et al., “Sensor-mission assignment in constrained environments,” *IEEE Transactions on Parallel and Distributed Systems*, vol. 21, no. 11, pp. 1692–1705, 2010.
- Y. Liu, L. Ni, and C. Hu, “A generalized probabilistic topology control for wireless sensor networks,” *IEEE Journal on Selected Areas in Communications*, vol. 30, no. 9, pp. 1780–1788, 2012.
- E. Fasolo, M. Rossi, J. Widmer, and M. Zorzi, “In-network aggregation techniques for wireless sensor networks: a survey,” *IEEE Wireless Communications*, vol. 14, no. 2, pp. 70–87, 2007.
- W. Zhu and H. Fan, “A discrete dynamic convexized method for nonlinear integer programming,” *Journal of Computational and Applied Mathematics*, vol. 223, no. 1, pp. 356–373, 2009.
- W. Zhu and G. Lin, “A dynamic convexized method for nonconvex mixed integer nonlinear programming,” *Computers and Operations Research*, vol. 38, no. 12, pp. 1792–1804, 2011.
- M. Schlüter, J. A. Egea, and J. R. Banga, “Extended ant colony optimization for non-convex mixed integer nonlinear programming,” *Computers and Operations Research*, vol. 36, no. 7, pp. 2217–2229, 2009.