Applied Computational Intelligence and Soft Computing

Volume 2016 (2016), Article ID 3217612, 11 pages

http://dx.doi.org/10.1155/2016/3217612

## Local Community Detection Algorithm Based on Minimal Cluster

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China

Received 14 June 2016; Accepted 11 October 2016

Academic Editor: Wu Deng

Copyright © 2016 Yong Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

To discover local community structure more effectively, this paper proposes a new local community detection algorithm based on a minimal cluster. Most local community detection algorithms start from a single node. Because the agglomeration ability of a single node is generally weaker than that of multiple nodes, the proposed algorithm begins community expansion not from the initial node alone but from a node cluster that contains the initial node and whose nodes are relatively densely connected with each other. The algorithm has two phases: it first detects the minimal cluster and then finds the local community by expanding from the minimal cluster. Experimental results show that the quality of the local community detected by our algorithm is much better than that of other algorithms on both real and simulated networks.

#### 1. Introduction

Community detection on complex networks has been a hot research field. In recent years, a large number of algorithms for studying the global structure of a network have been proposed, such as modularity optimization algorithms [1, 2], spectral clustering algorithms [3–6], hierarchical clustering algorithms [7–10], and label propagation algorithms [11–14]. However, with the continuous expansion of complex networks, it has become easy to collect large network datasets with millions of nodes. Storing such a large-scale dataset in computer memory for analysis is a huge challenge, and the computation required to study the overall structure of such a network can be prohibitive. Local community detection has therefore become an appealing problem and has drawn more and more attention [15–18]. Its main task is to find a community using only local information about the network. Local community detection also extends well: if a local community detection algorithm is executed iteratively, more local communities can be found and the community structure of the whole network can be obtained. The running time and quality of this kind of global community detection depend on the efficiency and accuracy of the underlying local community detection algorithm, so research on local community detection still has a long way to go.

Several problems need to be solved in local community detection. First, we should determine the initial state and find the initial node, so as to determine the local information that is needed. Second, we need to select an objective function and, through continuous iterative optimization of that function, find a community structure of high quality. Third, we need a suitable node expansion method, so that the algorithm can extract the local community from the initial state step by step. Finally, a suitable termination condition is needed to stop the algorithm and determine the boundary of the community.
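The four ingredients above (initial node, objective function, node expansion, termination condition) can be illustrated with a minimal Python sketch of a generic single-seed greedy expansion. This is not the NewLCD algorithm: the objective used here, the ratio of internal to external edges, is an assumption chosen only for illustration, and the names `build_graph`, `quality`, `expand_from_seed`, and `demo_graph` are hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def quality(graph, community):
    """Illustrative objective (an assumption): internal edges / external edges."""
    internal = 0
    external = 0
    for u in community:
        for v in graph[u]:
            if v in community:
                internal += 1  # each internal edge is seen from both endpoints
            else:
                external += 1
    internal //= 2
    if external == 0:
        return float("inf")  # the community is a whole connected component
    return internal / external


def expand_from_seed(graph, seed):
    """Greedy expansion: initial state {seed}, stop when no candidate improves quality."""
    community = {seed}
    while True:
        # Candidate nodes: outside nodes adjacent to the current community.
        candidates = set().union(*(graph[u] for u in community)) - community
        if not candidates:
            break
        current = quality(graph, community)
        # sorted() makes tie-breaking deterministic (smallest node wins).
        best = max(sorted(candidates), key=lambda v: quality(graph, community | {v}))
        if quality(graph, community | {best}) <= current:
            break  # termination condition: no improvement
        community.add(best)
    return community


# Two 4-cliques joined by a single bridge edge (3, 4).
demo_graph = build_graph(
    list(combinations(range(0, 4), 2))
    + list(combinations(range(4, 8), 2))
    + [(3, 4)]
)
print(sorted(expand_from_seed(demo_graph, 0)))  # [0, 1, 2, 3]
```

On this toy graph the expansion stops at the bridge, because adding node 4 would lower the internal-to-external ratio.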

Most local community detection algorithms follow the process described above. By definition, local community detection finds the local community structure starting from one or more nodes, but most existing algorithms, including Clauset [15], LWP [16], and LS [17], start from only one initial node. They greedily select the optimal node from the candidate nodes and add it to the local community. The LMD [18] algorithm expands not from the initial node but from its closest and next closest local degree central nodes, and it discovers a local community from each of these nodes separately. It therefore still starts from a single node each time and yields several local communities for the initial node. In general, the aggregation ability of a single node is lower than that of multiple nodes, so we do not rely on the initial node alone as the starting point of local community expansion. Our primary goal is to find a minimal cluster closely connected to the initial node and then detect the local community based on that minimal cluster. This avoids the instability caused by excessive dependence on the initial node. In this paper, we introduce a local community detection algorithm based on the minimal cluster, called NewLCD. In this algorithm, community expansion begins not from the initial node alone but from a cluster of nodes relatively closely connected to the initial node. The algorithm consists of two parts: detection of the minimal cluster, and detection of the local community based on the minimal cluster. The algorithm can also be applied to global community detection: after finding one local community, we can repeat the process to obtain the community structure of the whole network.

#### 2. Related Works of Local Community Detection

##### 2.1. Definition of Local Community

The problem of local community detection was proposed by Clauset [15]. The local community problem is usually defined as follows: there is an undirected graph $G = (V, E)$, where $V$ represents the set of nodes and $E$ represents the set of edges in the graph. The connection information of only part of the nodes in the graph is known or can be obtained. The local community is denoted by $D$. The set of nodes outside $D$ that are connected with $D$ is denoted by $S$, and the set of nodes in $D$ connected with nodes in $S$ is defined as the boundary node set $B$. That is to say, every node in $B$ is connected to at least one node in $S$, and the rest of $D$ is the core node set $C$, as shown in Figure 1.
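The definition above can be made concrete with a short Python sketch. It assumes the graph is stored as an adjacency map from each node to its set of neighbors; the function name `partition_local_community` and the variable names `shell`, `boundary`, and `core` are hypothetical labels for the outside neighbor set, the boundary node set, and the core node set, respectively.

```python
def partition_local_community(graph, community):
    """Split the neighborhood of a local community into three node sets.

    graph:     dict mapping each node to the set of its neighbors (undirected)
    community: set of nodes forming the local community
    Returns (shell, boundary, core).
    """
    # Outside nodes that are adjacent to at least one community node.
    shell = {v for u in community for v in graph[u] if v not in community}
    # Community nodes with at least one neighbor outside the community.
    boundary = {u for u in community if any(v not in community for v in graph[u])}
    # Remaining community nodes: all of their neighbors lie inside the community.
    core = community - boundary
    return shell, boundary, core


# Toy graph: a triangle 0-1-2 with a tail 2-3-4.
toy_graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4},
    4: {3},
}
shell, boundary, core = partition_local_community(toy_graph, {0, 1, 2})
print(shell, boundary, core)  # {3} {2} {0, 1}
```

Only the neighbor lists of community nodes are read, which matches the setting in which the rest of the graph is unknown or expensive to access.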