Special Issue: Data Science and AI-based Optimization in Scientific Programming
Research Article, Open Access
Zhen Tan, Xiang Zhao, Yang Fang, Bin Ge, Weidong Xiao, "Knowledge Graph Representation via Similarity-Based Embedding", Scientific Programming, vol. 2018, Article ID 6325635, 12 pages, 2018. https://doi.org/10.1155/2018/6325635
Knowledge Graph Representation via Similarity-Based Embedding
Abstract
A knowledge graph, a typical multi-relational structure, contains large-scale facts about the world, yet it remains far from complete. Knowledge graph embedding, as a representation method, constructs a low-dimensional and continuous space to describe latent semantic information and predict missing facts. Among various solutions, almost all embedding models have high time and memory-space complexities and, hence, are difficult to apply to large-scale knowledge graphs. Some other embedding models, such as TransE and DistMult, have lower complexity but ignore inherent features and use only the correlations between different entities to represent the features of each entity. To overcome these shortcomings, we present a novel low-complexity embedding model, namely, SimEER, which calculates the similarity of entities in independent and associated spaces. In SimEER, each entity (relation) is described in two parts. In independent space, the features of an entity (relation) are those it intrinsically owns; in associated space, they are expressed through the features of the entities (relations) it connects to. The similarity between the embeddings of the same entity in the two representation spaces is high. In experiments, we evaluate our model on two typical tasks: entity prediction and relation prediction. Compared with state-of-the-art models, our experimental results demonstrate that SimEER outperforms existing competitors and has low time and memory-space complexities.
1. Introduction
Knowledge graphs (KGs), an important part of artificial intelligence, play an increasingly essential role in different domains [1]: question answering systems [2, 3], information retrieval [4], semantic parsing [5], named entity disambiguation [6], biological data mining [7, 8], and so on [9, 10]. In knowledge graphs, facts can be denoted as instances of binary relations (e.g., PresidentOf (DonaldTrump, American)). Nowadays, a great number of knowledge graphs, such as WordNet [11], Freebase [12], DBpedia [13], YAGO [14], and NELL [15], usually do not appear together. Instead, each was constructed to describe the structured information of a particular domain [16], and all of them are fairly sparse.
Knowledge representation learning [17–19] is considered an important task for extracting latent features from associated space. Recently, knowledge embedding [20, 21], an effective method of feature extraction [22], was proposed to compress a high-dimensional and sparse space into a low-dimensional and continuous space. Knowledge embedding can be used to derive new unknown facts from known knowledge bases (e.g., link prediction) and to determine whether a triplet is correct or not (e.g., triplet classification) [23]. Moreover, embedding representation [24] has been used to support question answering systems [25] and machine reading [26]. However, almost all embedding models use only the features and attributes in the knowledge graph to represent entities and relations, which overlooks the fact that entities and relations are projections of the facts in independent space. Besides, almost all of them have high time and memory-space complexities and cannot be used on large-scale knowledge graphs.
In this research, we propose a novel similarity-based knowledge embedding model, namely, SimEER, which calculates the entity and relation similarities between two spaces (independent and associated spaces). A sketch of the model framework is provided in Figure 1. The basic idea of this paper is that independent and associated spaces are used to represent the irrelevant and interconnected features of entities (relations), respectively. In independent space, the features of entities (relations) are independent and irrelevant to one another. By contrast, the features of entities (relations) in associated space are interconnected and interacting, and entities and relations can be denoted by the entities and relations connected with them. In addition, the similarity between the representations of the same entity (relation) in the two spaces is high. In Figure 1, the features in independent space are constructed only by the entities and relations themselves, whereas in associated space an entity is denoted by other entities and relations, drawn as blue points (lines). We want the features of the same entity in independent and associated spaces to be similar. Besides, vector embedding is used to represent knowledge graphs.
In associated space, take the entity Steve Jobs as an example. It appears in multiple triplets, such as (Steve Jobs, Apple Inc., FoundOf), (Steve Jobs, America, Nationality), and (Steve Jobs, Laurene Powell, CoupleOf). If we combine all corrupted triplets with the same missing entity, such as (…, Apple Inc., FoundOf), (…, America, Nationality), and (…, Laurene Powell, CoupleOf), it is easy to determine that the missing entity is Steve Jobs. Similarly, if we combine all the corrupted triplets with the same missing relation, such as (Steve Jobs, Apple Inc., …), (Jack Ma, Alibaba, …), and (Sundar Pichai, Google, …), we can infer that the missing relation is FoundOf. The scenario is shown in Figure 2. Hence, using the correlation between different entities to represent features is an effective method. In practice, however, it is unsuitable to use only the correlation between different entities and omit the inherent features of entities, such as the attributes of each entity, which are hard to represent with correlations between different entities. Therefore, we construct the independent space, which preserves the inherent features of each entity. We combine both independent and associated spaces to represent the overall features of entities and relations, which in turn represents the knowledge graph more comprehensively. The motivation for employing both types of spaces is to model correlation while preserving individual specificity.
Compared with other embedding models, vector embedding has evident advantages in time and memory-space complexities. We evaluate SimEE and SimEER on the popular tasks of entity prediction and relation prediction. The experimental results show that the proposed method is competitive with previous models.
Contributions. To summarize, the main contributions of this paper are as follows:
(i) We propose a similarity-based embedding model, namely, SimEER. In SimEER, we consider the entity and relation similarities of different spaces simultaneously, which can extract the features of entities and relations comprehensively.
(ii) Compared with other embedding models, our model has lower time and space complexity, which improves the effectiveness of processing large-scale knowledge graphs.
(iii) Through thorough experiments on real-life datasets, our approach is demonstrated to outperform the existing state-of-the-art models in entity prediction and relation prediction tasks.
Organization. We discuss related work in Section 2 and then introduce our method, along with the theoretical analysis, in Section 3. Afterwards, experimental studies are presented in Section 4, followed by conclusion in Section 5.
2. Related Work
In this section, we introduce several recent related works [19] that achieve state-of-the-art results. According to how relation features are represented, we divide embedding models into two groups: matrix-based embedding models [27] and vector-based embedding models [28].
2.1. Matrix-Based Embedding Models
In this part, matrices (tensors) are used to describe relation features.
Structured Embedding. The Structured Embedding model (SE) [29] considers that head and tail entities overlap in a relation-specific space where the triplet exists. It uses two relation-specific mapping matrices to extract features from the head and tail entities, respectively.
Single Layer Model. Compared with SE, Single Layer Model (SLM) [30] uses a nonlinear activation function to translate the extracted features and considers the features after activation to be orthogonal with relation features. The extracted features are comprised of the entities’ features after mapping and a bias of their relation.
Neural Tensor Network. Neural Tensor Network (NTN) [30, 31] is a more complex model and considers that a tensor can be regarded as a better feature extractor than matrices.
Semantic Matching Energy. The basic idea of Semantic Matching Energy (SME) [32] is that if the triplet is correct, the features of the head entity and the tail entity are orthogonal. Similar to SLM, the features of the head (tail) entity are comprised of the entity's features after mapping and a bias of their relation. There are two methods to extract features, i.e., linear and nonlinear.
Latent Factor Model. Latent Factor Model (LFM) [33, 34] assumes that the features of the head entity are orthogonal to those of the tail entity when the head entity is mapped into the relation-specific space. Its score function can be defined as $f_r(h, t) = \mathbf{h}^{\top} \mathbf{M}_r \mathbf{t}$, where $\mathbf{h}$, $\mathbf{M}_r$, and $\mathbf{t}$ denote the features of the head entity, relation, and tail entity, respectively.
2.2. Vector-Based Embedding Models
In this part, relations are described as vectors rather than matrices to improve the effectiveness of representation models.
Translation-Based Model. The basic idea of the translation-based model TransE [23, 35, 36] is that the relation is a translation vector between $\mathbf{h}$ and $\mathbf{t}$. The score function is $f_r(h, t) = \|\mathbf{h} + \mathbf{r} - \mathbf{t}\|_{L_1/L_2}$, where $\mathbf{h}$, $\mathbf{r}$, and $\mathbf{t}$ denote the head entity, relation, and tail entity embeddings, respectively. Because TransE only handles simple relations, other translation-based models [37–39] were proposed to improve it.
Combination Embedding Model. CombinE [40] describes the relation features with the plus and minus combination of each pair. Compared with other translationbased models, CombinE can represent relation features in a more comprehensive way.
Bilinear-Diag Model. DistMult [41] uses a bilinear formulation to represent entities and relations and utilizes the learned embeddings to extract logical rules.
Holographic Embedding Model. HOLE [42] utilizes a compositional vector space based on the circular correlation of vectors, which creates fixed-width representations. The compositional representation has the same dimensionality as the representation of its constituents.
Complex Embedding Model. ComplEx [43] divides entities and relations into two parts, i.e., a real part and an imaginary part. The real part denotes the features of symmetric relations, and the imaginary part denotes the features of asymmetric relations.
Project Embedding Model. ProjE [44], a shared-variable neural network model, uses two diagonal matrices to extract the entity and relation features and calculates the similarity between these features and each candidate entity. In training, the correct triplets have high similarity.
Convolutional Embedding Model. ConvE [45] transfers the features into 2D space and uses a convolutional neural network to extract the entity and relation features.
Compared with matrix-based embedding models, vector-based models have obvious advantages in time and memory-space complexities. Among these vector-based models, TransE is a classical baseline and has been used in many applications, TransR is an improved version of TransE that handles complex relation types, and DistMult and ComplEx use a probability-based method to represent knowledge and achieve state-of-the-art results.
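To make the cost argument for vector-based models concrete, the following is a minimal sketch of the TransE and DistMult score functions in plain Python. The function names and the toy embeddings are illustrative assumptions, not taken from the authors' code. Each score touches every coordinate once, so it needs a number of operations linear in the embedding dimension and stores a single vector per relation.

```python
import math

def transe_score(h, r, t, norm=1):
    """Translation distance ||h + r - t||; a lower value means a more plausible triplet."""
    diff = [hk + rk - tk for hk, rk, tk in zip(h, r, t)]
    if norm == 1:
        return sum(abs(x) for x in diff)
    return math.sqrt(sum(x * x for x in diff))

def distmult_score(h, r, t):
    """Bilinear-diagonal score sum_k h_k * r_k * t_k; a higher value means more plausible."""
    return sum(hk * rk * tk for hk, rk, tk in zip(h, r, t))

# Hypothetical 3-dimensional embeddings, chosen so that h + r equals t exactly.
h = [1.0, 0.0, 2.0]
r = [0.5, 1.0, -1.0]
t = [1.5, 1.0, 1.0]
print(transe_score(h, r, t))    # 0.0
print(distmult_score(h, r, t))  # -1.25
```

A matrix-based model would instead multiply each entity vector by a relation matrix, which is quadratic in the dimension for both time and per-relation memory.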
3. Similarity-Based Model
Given a training set $S$ of triplets $(h, r, t)$, each triplet has two entities $h, t \in E$ (the set of entities) and a relationship $r \in R$ (the set of relationships). Our model learns the entity embeddings ($\mathbf{h}_i$, $\mathbf{h}_a$, $\mathbf{t}_i$, $\mathbf{t}_a$) and relationship embeddings ($\mathbf{r}_i$, $\mathbf{r}_a$) to represent the features of entities and relations, where the subscripts $i$ and $a$ denote the independent and associated spaces, respectively. The entity and relation embeddings take values in $\mathbb{R}^d$, where $d$ is the dimension of the entity and relation embedding spaces.
3.1. Our Models
The basic idea of our model is that, for each entity (relation), the features are divided into two parts. The first part describes the inherent features of entities (relations) in independent space. The feature embedding vectors can be denoted as $\mathbf{h}_i$, $\mathbf{r}_i$, $\mathbf{t}_i$. The second part describes triplet features in associated space, and the feature embedding vectors can be denoted as $\mathbf{h}_a$, $\mathbf{r}_a$, $\mathbf{t}_a$. In independent space, the feature vectors describe the inherent features that entities (relations) have. In associated space, the features of $h$ are comprised of the other entities and relations connected with entity $h$.
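The entities (relations) in associated space are projections of entities (relations) in independent space. Hence the representation features of the same entity in independent and associated spaces are similar, while the representation features of different entities are not. Writing $\mathbf{h}_i$, $\mathbf{r}_i$, $\mathbf{t}_i$ for the independent-space embeddings and $\mathbf{h}_a$, $\mathbf{r}_a$, $\mathbf{t}_a$ for the associated-space embeddings, the formula can be described as follows:
$$\mathbf{h}_i \approx \mathbf{r}_a \circ \mathbf{t}_a, \qquad \mathbf{t}_i \approx \mathbf{h}_a \circ \mathbf{r}_a, \qquad \mathbf{r}_i \approx \mathbf{h}_a \circ \mathbf{t}_a, \tag{1}$$
where $\circ$ denotes the element-wise product. In detail, in (1), if we combine the features of $\mathbf{r}_a$ and $\mathbf{t}_a$, we can obtain part of the $\mathbf{h}_i$ features. That is to say, the $\mathbf{h}_i$ features are similar to $\mathbf{r}_a \circ \mathbf{t}_a$. In this paper, we use the cosine measure to calculate the similarity between different spaces. Taking the head entity as an example, the cosine similarity between different spaces can be denoted as
$$\cos(\mathbf{h}_i, \mathbf{r}_a \circ \mathbf{t}_a) = \frac{\mathrm{Dot}(\mathbf{h}_i, \mathbf{r}_a \circ \mathbf{t}_a)}{\|\mathbf{h}_i\|_2 \, \|\mathbf{r}_a \circ \mathbf{t}_a\|_2} = \frac{\mathrm{Sum}(\mathbf{h}_i \circ \mathbf{r}_a \circ \mathbf{t}_a)}{\|\mathbf{h}_i\|_2 \, \|\mathbf{r}_a \circ \mathbf{t}_a\|_2},$$
where Dot denotes the dot product and Sum denotes the summation over the vector elements. The numerator calculates the similarity, and the norms in the denominator constrain the length of the features. To reduce the training complexity, we consider only the numerator and use regularization terms to replace the denominator. Hence the similarity of head entity features in independent and associated spaces can be described as
$$\mathrm{Sim}(h) = \mathrm{Sum}(\mathbf{h}_i \circ \mathbf{r}_a \circ \mathbf{t}_a).$$
We expect the value of $\mathrm{Sim}(h)$ to be larger when $\mathbf{h}_i$ and $\mathbf{r}_a \circ \mathbf{t}_a$ denote the same head entity and smaller otherwise.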
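To represent entities in a more comprehensive way, we consider the similarity of head and tail entities simultaneously. The score function can be denoted as
$$f_r^{\mathrm{E}}(h, t) = \mathrm{Sum}(\mathbf{h}_i \circ \mathbf{r}_a \circ \mathbf{t}_a) + \mathrm{Sum}(\mathbf{t}_i \circ \mathbf{h}_a \circ \mathbf{r}_a).$$
The embedding model based on the similarity of head and tail entities is named SimEE.
On the basis of entity similarity, we also consider relation similarity, which can enhance the representation of relation features. The comprehensive model, which considers all the similarities of entity (relation) features in different spaces, can be described as
$$f_r^{\mathrm{ER}}(h, t) = \mathrm{Sum}(\mathbf{h}_i \circ \mathbf{r}_a \circ \mathbf{t}_a) + \mathrm{Sum}(\mathbf{t}_i \circ \mathbf{h}_a \circ \mathbf{r}_a) + \mathrm{Sum}(\mathbf{r}_i \circ \mathbf{h}_a \circ \mathbf{t}_a).$$
The embedding model based on the similarity of entity and relation is named SimEER.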
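The two score functions can be sketched in a few lines of plain Python. This is a minimal illustration under a stated assumption: each similarity term pairs one element's independent-space vector with the element-wise product of the associated-space vectors of the other two elements, and the terms are summed without weights. The dictionary keys (`h_i`, `h_a`, and so on) and the toy vectors are hypothetical names and values chosen for this example.

```python
def triple_product_sum(a, b, c):
    """Sum over coordinates of the element-wise product a * b * c."""
    return sum(x * y * z for x, y, z in zip(a, b, c))

def sim_scores(emb):
    """Return the SimEE and SimEER scores for one triplet.

    emb maps 'h_i', 'h_a', 'r_i', 'r_a', 't_i', 't_a' to equal-length lists,
    where _i marks the independent space and _a the associated space.
    """
    sim_h = triple_product_sum(emb["h_i"], emb["r_a"], emb["t_a"])
    sim_t = triple_product_sum(emb["t_i"], emb["h_a"], emb["r_a"])
    sim_r = triple_product_sum(emb["r_i"], emb["h_a"], emb["t_a"])
    return {"SimEE": sim_h + sim_t, "SimEER": sim_h + sim_t + sim_r}

# Hypothetical 2-dimensional embeddings for a single triplet.
emb = {
    "h_i": [1.0, 2.0], "h_a": [1.0, 1.0],
    "r_i": [0.0, 1.0], "r_a": [2.0, 1.0],
    "t_i": [1.0, 0.0], "t_a": [1.0, 3.0],
}
print(sim_scores(emb))  # {'SimEE': 10.0, 'SimEER': 13.0}
```

Under this reading, tying the two spaces together (using one vector per entity and relation) and keeping a single term gives the DistMult score, which is why only element-wise products and sums are needed.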
3.2. Training
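To learn the proposed embeddings and encourage discrimination between golden triplets and incorrect triplets, we minimize the following logistic ranking loss function over the training set:
$$L = \sum_{(h, r, t) \in \Omega} \log\bigl(1 + \exp(-Y_{hrt} \, f_r(h, t; \Theta))\bigr),$$
where $\Theta$ corresponds to the embeddings $\mathbf{h}_i$, $\mathbf{h}_a$, $\mathbf{r}_i$, $\mathbf{r}_a$, $\mathbf{t}_i$, $\mathbf{t}_a$, and $Y_{hrt}$ is the label of a triplet. $Y_{hrt} = 1$ denotes that $(h, r, t)$ is positive and $Y_{hrt} = -1$ denotes that $(h, r, t)$ is negative. $\Omega$ is a triplet set [28] which contains both the positive triplet set $\Omega^{+}$ and the negative triplet set $\Omega^{-}$.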
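As an illustration of the logistic ranking loss, the sketch below sums log(1 + exp(-y * score)) over labeled triplet scores, assuming labels of +1 for positive and -1 for negative triplets. The function names and the example scores are hypothetical; the softplus is written in a numerically stable form so that large scores do not overflow.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def logistic_ranking_loss(scored_triplets):
    """Sum of log(1 + exp(-y * score)) over (score, y) pairs with y in {+1, -1}."""
    return sum(softplus(-y * score) for score, y in scored_triplets)

# Hypothetical scores: a well-separated batch and a badly separated one.
good = [(4.0, 1), (-4.0, -1)]   # positive scored high, negative scored low
bad = [(-4.0, 1), (4.0, -1)]    # the reverse
print(logistic_ranking_loss(good) < logistic_ranking_loss(bad))  # True
```

The loss is small when positive triplets receive high scores and negative triplets receive low scores, which is the discrimination the training objective encourages.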
The set of negative triplets, constructed according to (9), is composed of training triplets with either the head (tail) entity or the relation replaced by a random entity or relation:
$$\Omega^{-} = \{(h', r, t) \mid h' \in E\} \cup \{(h, r, t') \mid t' \in E\} \cup \{(h, r', t) \mid r' \in R\}, \tag{9}$$
with $E$ and $R$ the sets of entities and relations. Only one entity or relation is replaced in each corrupted triplet, and each position is chosen with the same probability. To prevent overfitting, the lengths of the entity (relation) features are constrained when minimizing the loss function $L$, for both SimEE and SimEER.
We convert these constraints into soft constraints by adding a regularization term to the loss function, weighted by a hyperparameter that controls the importance of the soft constraints. We utilize the improved stochastic gradient descent method Adagrad [46] to train the models. Compared with SGD, Adagrad shrinks the learning rate effectively as the number of iterations increases, which makes training less sensitive to the choice of learning rate.
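The corruption step and the Adagrad update can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names, the choice to exclude the original item when sampling a replacement, and the default step size are assumptions made for the example. The sketch does not check whether a corrupted triplet happens to be a true fact.

```python
import math
import random

def corrupt(triplet, entities, relations, rng):
    """Replace exactly one of head, relation, tail, each slot with equal probability."""
    h, r, t = triplet
    slot = rng.choice(("head", "relation", "tail"))
    if slot == "head":
        return (rng.choice([e for e in entities if e != h]), r, t)
    if slot == "tail":
        return (h, r, rng.choice([e for e in entities if e != t]))
    return (h, rng.choice([x for x in relations if x != r]), t)

def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """In-place Adagrad update; accum holds the running sum of squared gradients."""
    for k in range(len(params)):
        accum[k] += grads[k] ** 2
        params[k] -= lr * grads[k] / (math.sqrt(accum[k]) + eps)

rng = random.Random(0)
positive = ("Steve Jobs", "FoundOf", "Apple Inc.")
negative = corrupt(positive, ["Steve Jobs", "Apple Inc.", "America"],
                   ["FoundOf", "Nationality"], rng)
print(negative)

# Two updates with the same gradient: the second step is smaller than the first.
params, accum = [1.0], [0.0]
adagrad_step(params, [2.0], accum)
first_step = 1.0 - params[0]
before = params[0]
adagrad_step(params, [2.0], accum)
print(first_step > before - params[0])  # True
```

Drawing a fresh corruption for every positive triplet in every epoch yields different negative samples across epochs, and the number of corruptions drawn per positive sets the ratio of negative to correct samples.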
3.3. Comparison with Existing Models
To compare the time and memory-space complexities of different models, we show the results in Table 1, expressed in terms of the dimension of the entity and relation embeddings, the number of tensor slices, and the numbers of entities and relations.

The comparison shows the following:
(i) Except for DistMult and TransE, the baselines use a relation matrix to project entities' features into relation space, which gives these models high memory-space and time complexities. Compared with these models, SimEE and SimEER have lower time complexity and can be used on large-scale knowledge graphs more effectively.
(ii) In comparison to TransE, SimEE and SimEER can dynamically control the ratio of positive and negative triplets, which enhances the robustness of the representation models.
(iii) DistMult is a special case of SimEE and SimEER in which only a single similarity of entity or relation is considered. That is to say, SimEE and SimEER can extract the features of entities (relations) more comprehensively.
4. Experiments and Analysis
In this section, our models SimEE and SimEER are evaluated and compared with several baselines which have been shown to achieve state-of-the-art performance. Firstly, two classical tasks are adopted to evaluate our models: entity prediction and relation prediction. Then, we use cases to verify the effectiveness of our models. Finally, according to the practical experimental results, we analyze the time and memory-space costs.
4.1. Datasets
We use two real-life knowledge graphs to evaluate our method:
(i) WordNet (https://wordnet.princeton.edu/download), a classical dictionary, is designed to describe the correlation and semantic information between different words. Entities describe the concepts of different words, and relationships describe the semantic relevance between different entities, such as instance hypernym, similar to, and member of domain topic. The data version we use is the same as in [23], where triplets are denoted as (sway_2, has_instance, brachiate_1) or (felis_1, member_meronym, catamount_1). A subset of WordNet, named WN18 [23], is adopted.
(ii) Freebase (code.google.com/p/wikilinks), a huge and continually growing knowledge graph, describes a large number of facts about the world. In Freebase, entities are described by labels, and relations are denoted by a hierarchical structure. We employ two subsets of Freebase, named FB15K and FB40K [23].
We show the statistics of the datasets in Table 2. From Table 2, we see that, compared with WN18, FB15K and FB40K have more relationships and can be regarded as typical large-scale knowledge graphs.

4.2. Experiment Setup
Evaluation Protocol. For each triplet in the test set, each item of the triplet (head entity, tail entity, or relation) is removed and replaced in turn by every item in the dictionary. The score function is used to score these corrupted triplets, the scores are sorted in ascending order, and the rank of the correct entity or relation is stored. The whole procedure is repeated for the relation in each test triplet. Some of the corrupted triplets generated by this replacement are in fact correct. Hence, we filter out of the corrupted triplets those correct triplets that actually exist in the training and validation sets. The evaluation measure before filtering is named “Raw”, and the measure after filtering is named “Filter”. Similar to [42], we use two evaluation measures:
(i) MRR is an improved version of MeanRank [23]. MeanRank calculates the average rank of all the entities (relations), whereas MRR calculates the average reciprocal rank of all the entities (relations). Compared with MeanRank, MRR is less sensitive to outliers. We report the results using both Filter and Raw rules.
(ii) Hits@n reports the ratio of correct entities among the top-n ranked entities. Because the number of entities is much larger than that of relations, we use different cutoffs n for the entity prediction task and for the relation prediction task.
A better embedding model should have higher MRR and Hits@n.
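The measures above can be sketched as follows. The helper names and the toy candidates are hypothetical; `filtered_rank` assumes its input list is already ordered from the best candidate to the worst, so it makes no assumption about the sign convention of the score function. Passing an empty set of known facts gives the Raw rank.

```python
def filtered_rank(ranked_candidates, target, known_true):
    """1-based rank of target, skipping other candidates that are known to be correct."""
    rank = 1
    for cand in ranked_candidates:
        if cand == target:
            return rank
        if cand not in known_true:
            rank += 1
    raise ValueError("target is not among the candidates")

def mean_rank(ranks):
    return sum(ranks) / len(ranks)

def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, n):
    return sum(1 for r in ranks if r <= n) / len(ranks)

ranked = ["x", "y", "z", "w"]                   # best candidate first
print(filtered_rank(ranked, "z", set()))        # Raw rank: 3
print(filtered_rank(ranked, "z", {"x", "z"}))   # Filter rank: 2

ranks = [1, 2, 100]                             # one outlier
print(mean_rank(ranks), mrr(ranks), hits_at(ranks, 10))
```

In the last line the single outlier pushes MeanRank above 34, while MRR stays close to 0.5, which illustrates why MRR is less sensitive to outliers.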
Baselines. Firstly, we compare the proposed methods with CP, which uses canonical polyadic decomposition to extract the entity and relation features; then we compare the proposed methods with TransE, which considers that tail entity features are close to the combined features of head entity and relation. Besides, TransR [47], ER-MLP [48], DistMult [41], and ComplEx [43] are also used for comparison with our methods. We train CP [49], DistMult, ComplEx, TransE, and TransR using the code provided by the authors. For each baseline, we select the embedding dimension, the weight of regularization, the learning rate, and the ratio of negative and correct samples from a set of candidate values. The negative samples differ from epoch to epoch.
Implementation. For experiments using SimEE and SimEER, we select the dimension of the entity and relation embeddings, the weight of regularization, the ratio of negative and correct samples, and the minibatch size from a set of candidate values. We utilize the improved stochastic gradient descent method Adagrad [46] to minimize the loss function. As the number of epochs increases, the learning rate in Adagrad decreases, and Adagrad is insensitive to the learning rate. The embeddings of both SimEE and SimEER are initialized randomly within a range that depends on the dimension of the feature vector. Training is stopped using early stopping on the validation set MRR (using the Filter measure), computed every 50 epochs with a maximum of 2000 epochs.
In SimEE model, the optimal configurations on validation set are(i), , , on WN18,(ii), , , on FB15K,(iii), , , on FB40K.
In SimEER model, the optimal configurations on validation set are(i), , , on WN18,(ii), , , on FB15K,(iii), , , on FB40K.
T-test. In experiments, we run each model 15 times independently and calculate the mean and standard deviation. Then we use Student's t-test to compare the performance of different models, as follows [50, 51].
Let $\mu_1$ and $\sigma_1$ be the mean and standard deviation of model 1 over $n_1$ runs, and let $\mu_2$ and $\sigma_2$ be the mean and standard deviation of model 2 over $n_2$ runs. From these quantities we construct the hypothesis, compute the t statistic, and determine the degrees of freedom of the t-distribution.
In entity and relation prediction tasks, we calculate the mean and standard deviation of MRR and Hits@n and compare the models' performance with the t-test.
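A two-sample t statistic can be computed directly from per-model means, standard deviations, and run counts. The sketch below assumes the unequal-variance (Welch) form with Welch-Satterthwaite degrees of freedom; that choice is an assumption made for illustration, and the numbers in the example are made up rather than taken from the experiments.

```python
import math

def welch_t(mean1, std1, n1, mean2, std2, n2):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    v1 = std1 ** 2 / n1
    v2 = std2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Illustrative values only: two models, 15 runs each, equal spread.
t_value, df = welch_t(1.0, 1.0, 15, 0.0, 1.0, 15)
print(round(t_value, 4), round(df, 1))  # 2.7386 28.0
```

The difference between two models is judged significant when the absolute t-value exceeds the critical value of the t-distribution with the computed degrees of freedom at the chosen significance level.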
4.3. Link Prediction
For link prediction [52–54], we tested two subtasks: entity prediction and relation prediction. Entity prediction aims to predict the missing head or tail entity of a fact triplet; similarly, relation prediction determines which relation is most suitable for a triplet whose relation is missing.
Entity Prediction. This set of experiments tests the models' ability to predict entities. Experimental results (mean plus/minus standard deviation) on both WN18 and FB15K are shown in Tables 3, 4, and 5, and we can observe the following:
(i) On WN18, a small-scale knowledge graph, ComplEx achieves state-of-the-art results on MRR and Hits@. However, on FB15K and FB40K, two large-scale knowledge graphs, SimEE and SimEER achieve excellent results on MRR and Hits@, and the values of Hits@ are up to 0.868 and 0.889, respectively. These results show that our models can represent different kinds of knowledge graphs effectively, especially large-scale knowledge graphs.
(ii) ComplEx is better than SimEER on WN18, and the reason is that ComplEx can distinguish the symmetric and antisymmetric relationships contained in the relation structure of WN18. However, on FB15K and FB40K, SimEE and SimEER are better than ComplEx. The reason is that the number of relations is much larger than in WN18, and the relation structure is more complex and harder to represent, which has an obvious influence on the representation ability of ComplEx.
(iii) The results of SimEE and SimEER are similar to each other. The largest margin is 0.013, on filtered MRR on FB15K. This demonstrates that both SimEE and SimEER can extract the entity features in a knowledge graph and predict the missing entities effectively.
(iv) Compared with DistMult, the special case of our models, SimEE and SimEER achieve better results, especially on FB15K, where the filter MRR is up to 0.740. The results indicate that our models, which use irrelevant and interconnected features to construct independent and associated spaces, can represent entity and relation features more comprehensively.



We use the t-test to evaluate the effectiveness of our models. The t-test results show that, on FB15K and FB40K, our models achieve significant improvements over the other baselines, e.g., when comparing the Hits@ results of ComplEx and SimEER.
Relation Prediction. This set of experiments tests the models' ability to predict relations. Tables 6, 7, and 8 show the prediction performance on WN18 and FB15K. From the tables, we discover the following:
(i) Similar to the results in entity prediction, on WN18, ComplEx achieves better results on MRR and Hits@, and SimEER obtains better results on Hits@ and Hits@. On FB15K, except for the value of Hits@, the results of SimEER are better than those of ComplEx and the other baselines, and the value of Hits@ is up to 0.842, which is much higher (improvement of 20.1%) than the state-of-the-art baselines. On FB40K, SimEER achieves state-of-the-art results on all the measures; in particular, the filter MRR is up to 0.603.
(ii) In the entity prediction task, the results of SimEE and SimEER are similar. However, in the relation prediction task, SimEER achieves significantly better results on Raw MRR, Hits@, and Hits@. We use the t-test to verify the results, and the differences are significant. The difference between the entity and relation tasks demonstrates that considering both entity and relation similarity extracts relation features more effectively while still ensuring the extraction of entity features.
(iii) On FB15K, the gap is significant and SimEE and SimEER outperform other models, with a MRR (Filter) of 0.593 and 0.842 of Hits@. On both datasets, CP and TransE perform the worst, which illustrates the feasibility of learning knowledge embedding in the first case and the power of using two mutually constraining parts to represent entities and relations in the second.



We also use the t-test to evaluate our model, e.g., by comparing SimEER with ComplEx on filter MRR, where the t-value exceeds the critical value. The t-test results show that SimEER performs better than the other baselines on FB15K and FB40K.
To analyze the relation features, Table 9 shows the MRR with Filter of each relation on WN18, together with the number of triplets for each relation in the test set. From Table 9, we conclude the following:
(i) For almost all relations on WN18, compared with other baselines, SimEE and SimEER achieve competitive results, which demonstrates that our methods can extract different types of latent relation features.
(ii) Compared with SimEE, the relation MRRs of SimEER are much better on most relations, such as hypernym, hyponym, and derivationally_related_form.
(iii) On almost all results of relation MRR, SimEER is better than DistMult, a special case of SimEER. That is to say, compared with a single embedding space, using two different spaces to describe entity and relation features achieves better performance.

Case Study. Table 10 shows detailed prediction results on the test set of FB15K, which illustrate the performance of our models. Given head and tail entities, the top-5 predicted relations and their scores from SimEER are listed in Table 10. From the table, we observe the following:
(i) In triplet 1, the correct relation is ranked in the top-2, and in triplet 2, it is ranked top-1. These relation prediction results illustrate the performance of SimEER. However, in triplet 1, the correct result (top-2) has a score similar to those of other predictions (top-1, top-3). That is to say, it is difficult for SimEER to distinguish similar relationships.
(ii) For any relation prediction, the top-5 predicted relations are similar; that is to say, similar relations have similar representation embeddings, which is in line with common sense.

4.4. Complexity Analysis
To compare the time and memory-space complexity of different models, we show the analytical results on FB15K in Table 11, which lists the dimension of the entity and relation space, the minibatch size of each iteration (“Minibatch”), the number of parameters of each model on FB15K (“Params”), and the running time of each iteration (“Time”). Note that all models are run on standard hardware: Intel(R) Core(TM) i7U 3.5GHz + GeForce GTX TITAN. We report the average running time over one hundred iterations as the running time of each iteration. From Table 11, we observe the following:
(i) Except for DistMult, SimEE and SimEER have lower time and memory complexities than the baselines, because SimEE and SimEER use only element-wise products between entities' and relations' vectors to generate the representation embedding.
(ii) On FB15K, the time costs of SimEE and SimEER in each iteration are 5.37s and 6.63s, respectively, which are lower than 7.53s, the time cost of TransE, even though TransE has fewer parameters. The reason is that the minibatch of TransE is 2415, which is much larger than the minibatches of SimEE and SimEER. Besides, SimEE and SimEER need 700 iterations, which take 3760 (s) and 4642 (s) in total, respectively.
(iii) Because SimEE and SimEER have low complexity and high accuracy, they can easily be applied to large-scale knowledge graphs while using less computing resources and running time.

5. Conclusion
In this paper, we propose a novel similarity-based embedding model, SimEER, that extracts features from knowledge graphs. SimEER considers that the similarity of the same entities (relations) is high in independent and associated spaces. Compared with other representation models, SimEER is more effective in extracting entity (relation) features and represents entity and relation features more flexibly and comprehensively. Besides, SimEER has lower time and memory complexities, which indicates that it is applicable to large-scale knowledge graphs. In experiments, our approach is evaluated on entity prediction and relation prediction tasks. The results show that SimEER achieves state-of-the-art performance. We will explore the following future work:
(i) In addition to the facts in a knowledge graph, there are also a large number of logical and hierarchical correlations between different facts. How to translate this hierarchical and logical information into a low-dimensional vector space is an attractive and valuable problem.
(ii) In the real world, extracting relations and entities from large-scale text is an important yet open problem. Combining the latent features of knowledge graphs and text sets is a feasible way to connect structured and unstructured data, and it is expected to enhance the accuracy and efficiency of entity (relation) extraction.
Data Availability
All the datasets used in this paper are fully available without restriction upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was partially supported by NSFC under Grants nos. 71690233 and 71331008.
Copyright
Copyright © 2018 Zhen Tan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.