Abstract

Knowledge graph (KG) entity typing aims to predict the potential types of an entity, that is, to complete (entity, entity type=?). Recently, several embedding models have been proposed for KG entity type prediction based on the existing typing information of the (entity, entity type) tuples in KGs. However, most of them unreasonably assume that all existing entity typing instances in KGs are completely correct, which ignores the nonnegligible entity type noises and may lead to potential errors in downstream tasks. To address this problem, we propose ConfE, a novel confidence-aware embedding approach for modeling the (entity, entity type) tuples, which takes tuple confidence into consideration for learning better embeddings. Specifically, we learn the embeddings of entities and entity types in a separate entity space and entity type space, since they are different objects in KGs. We utilize an asymmetric matrix to specify the interaction of their embeddings and incorporate the tuple confidence as well. To make the tuple confidence more universal, we consider only the internal structural information in existing KGs. We evaluate our model on two tasks, entity type noise detection and entity type prediction. Extensive experimental results on two public benchmark datasets (i.e., FB15kET and YAGO43kET) demonstrate that our proposed model outperforms all baselines on all tasks, which verifies the effectiveness of ConfE in learning better embeddings on noisy KGs. The source code and data of this work can be obtained from https://github.com/swufenlp/ConfE.

1. Introduction

Knowledge graphs (KGs) consist of a huge number of triples, each of which is formally denoted as (head entity, relation, tail entity). KGs are effective, well-structured relational databases for knowledge acquisition. Besides the triples, KGs usually contain a great number of entity type instances in the form of (entity, entity type) [1], each of which indicates that an entity is of a certain entity type. For example, the entity “Tom Hanks” is an instance of the type “actor.” As an essential part of KGs, entity type instances play an important role and have been widely used in NLP tasks such as entity linking [2], relation extraction [3], and question answering (QA) [4]. For instance, a KG-driven QA system could utilize the entity type information in a query: “Is Tom Hanks an actor?”

In recent years, many KGs have been built from semistructured data or free text, such as Freebase [5], YAGO [6], Knowledge Vault [7], and Google Knowledge Graph [7]. However, the automatic construction of large-scale knowledge graphs inevitably introduces noises into KGs due to limited human supervision. For example, the open fine-grained entity typing system in [8] achieves only 58.8% accuracy, which confirms the existence of entity type noises in KG construction. Thus, the nonnegligible entity type noise problem severely impedes the efficient use of KGs [9].

In this work, we focus on dealing with entity type noises in existing KGs by learning entity type embeddings, which encode all entities and entity types into a latent vector space. Since the learning approach depends on the reliability of the existing (entity, entity type) tuples, it is crucial to account for entity type noises when learning embeddings. Several models have been proposed for KG entity type embedding learning [10–12]. However, most of them unreasonably assume that all existing entity type instances in KGs are true, which may lead to potential errors in downstream entity-type-sensitive tasks. To address this issue, we propose ConfE, a novel confidence-aware embedding framework for entity type learning that takes entity type noises into consideration. Figure 1 shows a simple illustration of our ConfE model, which learns entity type embeddings with tuple confidence on noisy KGs. Such entity type noises are expected to be detected by ConfE and ignored during entity type embedding learning.

Specifically, we build two different spaces, an entity space and an entity type space, for learning the embeddings of entities and entity types, since they are different objects in the tuple. We utilize a unique “rdf:type” relation matrix $\mathbf{M}$ to specify the interaction of their embeddings, that is, $f(e,t)=\mathbf{e}^\top\mathbf{M}\mathbf{t}$, and incorporate the tuple confidence as well. To make the tuple confidence more universal, we only utilize the internal structural information, which makes the problem more challenging. Accordingly, we propose two kinds of tuple confidences that consider the local tuple and global triple structural information in KGs, respectively. Extensive experimental results on two tasks, entity type noise detection and entity type prediction, show that our model achieves the best performance, which demonstrates the effectiveness of ConfE in learning better entity type embeddings in a noisy scenario.

The main contributions are summarized as follows:
(i) We propose ConfE, a novel confidence-aware embedding model for encoding the (entity, entity type) tuples to calculate the similarity of an entity and an entity type, which takes the tuple confidence into consideration.
(ii) We build two distinct tuple confidences according to the local and global structural information in existing KGs. Their overall confidence is utilized in the final energy function for learning better embeddings.
(iii) We conduct two experimental tasks, entity type noise detection and entity type prediction, and utilize two public benchmark datasets (i.e., FB15kET and YAGO43kET) to verify the effectiveness of our model and the confidence-aware framework.

2. Related Work

2.1. KG Noise Detection

In recent years, the research on KG noise detection, also known as KG refinement [13], has attracted wide attention. The noise issue can be roughly classified into two classes: false relationships between entities, that is, (head entity, relationship, tail entity), and false entity type instances, that is, (entity, entity type). Most of the existing research concentrates on dealing with noisy triple facts in KGs [9, 14–21]. For example, Jiang et al. [15] present a Markov logic-based system for cleaning an extracted knowledge base. Melo and Paulheim [17] propose an error detection method which relies on path and type features used by a classifier for every relation in the graph, exploiting local feature selection. Neil et al. [18] introduce a regularized attention mechanism to GCNNs that not only improves performance on clean datasets but also favorably accommodates noise in KGs. Liang et al. [19] propose a method for graph-based wrong IsA relation detection in a large-scale lexical taxonomy. Pujara et al. [9] propose to improve the quality of knowledge graphs by removing errors and adding missing facts. Xie et al. [20] propose a confidence-aware knowledge representation learning framework that detects possible noises in KGs while simultaneously learning knowledge representations with confidence. Zhao et al. [21] propose a trustiness-aware method for KG noise detection. Despite their success, these methods focus on detecting triple fact noises, so their goals differ from ours.

There are a few models for dealing with entity typing noises [22]. However, they mainly rely on association rule mining [23] or heuristic link-based type inference [24] and are therefore constrained in their generalization capability. Recently, Ren et al. [25] propose a heterogeneous partial-label embedding model for label noise detection. Templemeier et al. [26] propose an approach to predict the missing categories for particular entities obtained from noisy and sparse Web markup. Despite their success, their goals differ from KG entity type noise detection, and none of them consider the confidence of the entity type tuples. In this work, we concentrate on knowledge graph entity type noise detection and learn better entity type embeddings with confidence in a noisy scenario. An overview of KG noise detection models is included in Table 1.

2.2. KG Embedding

Recently, KG embedding has become a hot topic in AI and NLP research [27]. Most of the existing embedding models concentrate on learning the (head entity, relationship, tail entity) triples, such as SE [28], NTN [29], TransE [30], TransH [31], TransR [32], TransG [33], ComplEx [34], SSP [35], ProjE [36], ConvE [37], KBGAT [38], CapsE [39], and ConvKB [40], and pay less attention to embedding the (entity, entity type) tuples. Recently, Neelakantan and Chang [10] propose a method to infer missing entity type instances, where they score a tuple by the inner product of the entity and type embeddings, that is, $f(e,t)=\mathbf{e}^\top\mathbf{t}$. However, they also use external information from Wikipedia besides the information within the existing KG. Moon et al. [11] propose an embedding approach for entity type embedding (ETE), in which they build the energy function as $E(e,t)=\|\mathbf{e}-\mathbf{t}\|$. Despite their success, these models lack sufficient modeling capability due to their structural simplicity. In this work, we introduce an advanced embedding model with better expressive capability, which considers the structural information of both the (entity, entity type) tuples and the (head entity, relationship, tail entity) triples in KGs.

3. Methodology

To detect possible entity type noises in KGs and learn better entity type representations, we introduce a novel concept, tuple confidence, for each (entity, entity type) tuple. Tuple confidence describes the correctness and significance of a tuple, which can be measured according to local tuple and global triple information. The novelty of this work lies in modeling the confidence of entity type instances for typing noise detection and proposing an embedding method to model the (entity, entity type) tuples. In the following, we first present the confidence-aware embedding learning framework and then describe the embedding model and the methods for calculating the tuple confidences.

3.1. Confidence-Aware Embedding Learning Framework

We intend to detect entity type noises and learn better entity type embeddings by taking tuple confidence into consideration. Our ConfE model should concentrate more on those tuples with higher confidence. Similar to [20], we formally design the energy function over the tuples as follows:

$$E = \sum_{(e,t)\in S} f(e,t)\cdot C(e,t),$$

where $S$ denotes the set of all (entity, entity type) tuples in KGs. The energy function consists of two parts: (i) $f(e,t)$ denotes the model score of the tuple (more details are included in Section 3.3), for which we assign an asymmetric matrix that specifies the interaction of the latent representations of the entity and the entity type. A higher $f(e,t)$ indicates a better interaction between the latent embeddings of the entity and the entity type in the tuple. (ii) Different from conventional methods, we introduce tuple confidence into the framework. $C(e,t)$ stands for the overall confidence of the tuple $(e,t)$ (Section 3.4), whose value becomes higher when the current tuple is worth considering. Higher tuple confidence implies that the corresponding tuple is more credible and thus should receive more attention. Tuple confidence can be calculated both during and after KG construction from different aspects, including internal knowledge in the KG (such as topological information) and external information (such as textual data). To make our tuple confidence more universal and flexible, we only consider the KG structural information. Accordingly, we propose local and global tuple confidences that are learned iteratively during model training.
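To make the framework concrete, the following minimal Python sketch computes this confidence-weighted energy, assuming the bilinear score $f(e,t)=\mathbf{e}^\top\mathbf{M}\mathbf{t}$ detailed in Section 3.3; all variable names (ent_emb, typ_emb, M, conf) are illustrative, not taken from the released code.

```python
import numpy as np

def tuple_score(e_vec: np.ndarray, M: np.ndarray, t_vec: np.ndarray) -> float:
    """Model score f(e, t) = e^T M t of a single (entity, type) tuple."""
    return float(e_vec @ M @ t_vec)

def energy(tuples, ent_emb, typ_emb, M, conf):
    """Confidence-weighted energy summed over all tuples in S."""
    return sum(tuple_score(ent_emb[e], M, typ_emb[t]) * conf[(e, t)]
               for e, t in tuples)
```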

3.2. Model Optimization

Following [30], we utilize the margin-based ranking loss function to train our ConfE model. The main idea is that each tuple in the training set should receive a higher score than a corrupt tuple in which the entity or the entity type is replaced with a random one. The ranking loss function is defined as follows:

$$\mathcal{L} = \sum_{(e,t)\in S}\sum_{(e',t')\in S'} \max\big(0,\, \gamma - f(e,t) + f(e',t')\big)\cdot C(e,t),$$

where $f(e,t)$ and $f(e',t')$ represent the model scores of the positive tuple and the negative tuple, respectively. The tuple confidence $C(e,t)$ makes our algorithm learn more from those convincing tuples with higher confidence. $\gamma > 0$ is a margin hyperparameter for distinguishing positive instances from negative ones. $S'$ is the set of corrupt tuples, built in the following way:

$$S' = \{(e', t) \mid e' \in \mathcal{E}\} \cup \{(e, t') \mid t' \in \mathcal{T}\}.$$

Note that we do not replace both entity and entity type with a random one at the same time.
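As an illustration, the loss above can be written in a few lines of PyTorch. The function below is a sketch under our reading of the equation (batched positive/negative score pairs, a per-tuple confidence vector, and margin gamma), not the authors' implementation.

```python
import torch

def confe_loss(pos_scores: torch.Tensor,
               neg_scores: torch.Tensor,
               conf: torch.Tensor,
               gamma: float) -> torch.Tensor:
    """Confidence-weighted margin ranking loss over a batch of tuple pairs."""
    # max(0, gamma - f(e,t) + f(e',t')), down-weighted by tuple confidence
    hinge = torch.clamp(gamma - pos_scores + neg_scores, min=0.0)
    return (conf * hinge).sum()
```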

3.3. Embedding Model

We introduce the embedding model of a tuple in this section. Similar to [11], we treat the tuples as triple facts that only have a unique relationship “rdf:type”, for example, (Tom Hanks, rdf:type, actor). Accordingly, we assign to the “rdf:type” relationship an asymmetric matrix that specifies the interaction of the latent representations of the entity and the entity type, inspired by the previous embedding model RESCAL [41]. Formally, the model score of a given tuple $(e,t)$ is designed as follows:

$$f(e,t) = \mathbf{e}^\top \mathbf{M}\,\mathbf{t}, \quad e \in \mathcal{E},\ t \in \mathcal{T},$$

where $\mathcal{E}$ and $\mathcal{T}$ are the sets of entities and entity types, respectively. Different from the conventional methods that encode entities and types into a common space, we build two distinct latent vector spaces for them, that is, an entity space and an entity type space, since entities and entity types are different objects in KGs. $\mathbf{e} \in \mathbb{R}^{d_e}$ stands for the representation of an entity in the entity space and $\mathbf{t} \in \mathbb{R}^{d_t}$ is the representation of a type in the entity type space. $\mathbf{M} \in \mathbb{R}^{d_e \times d_t}$ denotes the asymmetric interaction matrix. Since the representations of entity types capture the common knowledge of all their entities, they usually need fewer parameters, that is, $d_t \le d_e$. The model score is expected to be higher for a positive tuple and lower for a negative one.
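The following PyTorch module is one plausible realization of this scorer: separate embedding tables for the two spaces and a rectangular interaction matrix for the unique “rdf:type” relation. Class, parameter names, and the initialization scheme are our own assumptions.

```python
import torch
import torch.nn as nn

class BilinearTyping(nn.Module):
    """Sketch of f(e, t) = e^T M t with separate entity and type spaces."""

    def __init__(self, n_ent: int, n_typ: int, d_e: int = 100, d_t: int = 50):
        super().__init__()
        self.ent = nn.Embedding(n_ent, d_e)   # entity space
        self.typ = nn.Embedding(n_typ, d_t)   # entity type space (d_t <= d_e)
        self.M = nn.Parameter(torch.randn(d_e, d_t) * 0.01)  # asymmetric "rdf:type" matrix

    def forward(self, e_idx: torch.Tensor, t_idx: torch.Tensor) -> torch.Tensor:
        e = self.ent(e_idx)                   # (batch, d_e)
        t = self.typ(t_idx)                   # (batch, d_t)
        return torch.einsum('bi,ij,bj->b', e, self.M, t)  # one score per tuple
```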

3.4. Tuple Confidence

In this section, we introduce the detailed methods for calculating the tuple confidence, which consists of two parts: (i) local tuple confidence, which only considers the internal structural information of a tuple, and (ii) global triple confidence, which considers the global triple information in KGs.

3.4.1. Local Tuple Confidence

We first come up with local tuple confidence, which only concentrates on the inside of a tuple. We assume that the better a tuple fits the interaction assumption, the more convincing this tuple should be considered. The basic idea behind it is that the model score of a positive tuple should be higher than that of its negative counterpart. We believe that the larger the value of the margin-based objective, the more convincing the tuple should be considered in training. To measure the local tuple confidence during training, we first judge the current conformity of each tuple with the interaction assumption. Inspired by the margin-based training strategy, we directly utilize it to represent the local tuple quality as follows:

$$Q(e,t) = f(e,t) - f(e',t'),$$

where $(e',t')$ is the corrupt counterpart of the tuple $(e,t)$.

A higher $Q(e,t)$ usually indicates a better tuple as judged by the interaction assumption. Hence, the local tuple confidence $LC(e,t)$ changes with its corresponding tuple quality $Q(e,t)$, which is formally built as follows:

$$LC(e,t) = \begin{cases} \alpha \cdot LC(e,t), & \text{if } Q(e,t) \le 0, \\ \min\big(LC(e,t) + \beta,\ 1\big), & \text{otherwise.} \end{cases}$$

We assume all given tuples are true and set $LC(e,t) = 1$ at the beginning; the confidence is then continuously updated during training. $\alpha$ and $\beta$ are hyperparameters that control the speed of $LC(e,t)$ when updated descendingly and ascendingly, respectively. If $Q(e,t) \le 0$, it indicates that the interaction between the entity and the entity type performs poorly, and thus the local tuple confidence should decrease; otherwise, it should increase. The local tuple confidence decreases at a geometric rate and increases by a constant addition. This punishes violations of the interaction rule for those tuples that are more likely to be noises, so that they obtain smaller confidences.
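A compact Python sketch of this update schedule is given below; the geometric decrease and constant increase follow the text, while the cap of the confidence at 1 is our assumption.

```python
def update_local_confidence(lc: float, quality: float,
                            alpha: float = 0.98, beta: float = 0.002) -> float:
    """One iterative update of LC(e, t) given the current tuple quality Q(e, t)."""
    if quality <= 0:            # tuple violates the interaction assumption
        return alpha * lc       # decrease geometrically
    return min(lc + beta, 1.0)  # otherwise increase by a constant step
```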

3.4.2. Global Tuple Confidence

Despite the success of $LC$, it only concentrates on the inside of tuples, ignoring the valuable triple facts in KGs. We observe that the relational triple information is also helpful for judging tuple qualities. Inspired by the work in [1], we first build an entity type triple (head type, relationship, tail type) by replacing both the head entity and the tail entity of a triple $(h, r, t)$ with their corresponding entity types, that is, $(t_h, r, t_t)$, using the two entity type tuples $(h, t_h)$ and $(t, t_t)$. The main idea behind this is that a significant premise of a triple holding is that the corresponding entity types should obey its relationship. Accordingly, we utilize the translating assumption [30] to model the entity type triples, that is, $\mathbf{t}_h + \mathbf{r} \approx \mathbf{t}_t$. We believe that the better an entity type triple fits the translation assumption, the more convincing the corresponding entity type tuple should be considered. Therefore, we calculate the global triple quality of an entity type tuple as follows:

$$Q_g(e,t) = \sum_{(h,r,t)\in \mathcal{G}} \big(\gamma_g + \|\mathbf{t}_h + \mathbf{r} - \mathbf{t}'\| - \|\mathbf{t}_h + \mathbf{r} - \mathbf{t}_t\|\big),$$

where $\mathcal{G}$ denotes the set of positive triple facts in KGs in which the tuple participates, $\mathbf{t}'$ is the embedding of a random negative entity type, and $\gamma_g$ is a hyperparameter. Hence, the global tuple confidence can be learned during training as follows:

$$GC(e,t) = \begin{cases} \alpha' \cdot GC(e,t), & \text{if } Q_g(e,t) \le 0, \\ \min\big(GC(e,t) + \beta',\ 1\big), & \text{otherwise.} \end{cases}$$

Here, the iterative learning process of $GC(e,t)$ is similar to that of $LC(e,t)$. We assume all tuples are true and set $GC(e,t) = 1$ at the beginning; the values are continuously updated during training. $\alpha'$ and $\beta'$ are hyperparameters that control the speed of updating.
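For illustration, a sketch of our reconstruction of the global triple quality is given below; the exact margin form and the triple-collection convention are assumptions, since only the translation assumption and the use of a random negative entity type are stated in the text. The confidence update itself can reuse the same schedule as the local one.

```python
import numpy as np

def global_quality(type_triples, T, R, gamma_g: float = 1.0) -> float:
    """Global triple quality of one tuple.

    type_triples: list of (head_type_id, rel_id, tail_type_id, neg_type_id)
    T, R: arrays of entity type and relation embeddings.
    """
    q = 0.0
    for th, r, tt, tneg in type_triples:
        pos = np.linalg.norm(T[th] + R[r] - T[tt])    # fit of the true type triple
        neg = np.linalg.norm(T[th] + R[r] - T[tneg])  # fit of a corrupted one
        q += gamma_g + neg - pos                      # higher when the true triple fits better
    return q
```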

3.4.3. Overall Tuple Confidence

In the end, we build the overall tuple confidence for the confidence-aware energy function. The overall tuple confidence consists of the following two parts: (i) the local tuple confidence $LC(e,t)$ and (ii) the global tuple confidence $GC(e,t)$, which are formally combined as follows:

$$C(e,t) = \lambda \cdot LC(e,t) + (1-\lambda) \cdot GC(e,t),$$

where $\lambda \in [0,1]$ is a trade-off parameter.
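In code, the combination is a one-line convex mixture (a sketch under our reading of the equation; `lam` stands for the trade-off weight $\lambda$):

```python
def overall_confidence(lc: float, gc: float, lam: float = 0.7) -> float:
    """Overall tuple confidence C(e, t) as a convex combination of LC and GC."""
    return lam * lc + (1.0 - lam) * gc
```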

4. Experiments

In this section, we evaluate the effectiveness of ConfE on entity type noise detection and entity type prediction.

4.1. Datasets

The two public benchmark datasets for the experiments are directly taken from [42]; their basic statistics are shown in Table 2. Specifically, FB15k [30] and YAGO43k [11] are extracted from Freebase [5] and YAGO [6], respectively. We utilize the entity type datasets FB15kET and YAGO43kET built in [11], which are composed of (entity, entity type) tuples in which the entity types are mapped to entities from FB15k and YAGO43k, respectively. Three noisy datasets, FB15kET-N1, FB15kET-N2, and FB15kET-N3, are built based on the training set of FB15kET, in which the noise rates are 10%, 20%, and 40%, respectively. Similarly, the noisy datasets YAGO43kET-N1, YAGO43kET-N2, and YAGO43kET-N3 are built based on YAGO43kET.
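For concreteness, a hypothetical sketch of such a noise-injection procedure is shown below; the actual noisy datasets are constructed in [42], so this only illustrates the idea of corrupting a fixed fraction of tuples.

```python
import random

def inject_type_noise(tuples, all_types, noise_rate=0.1, seed=0):
    """Replace the type of a noise_rate fraction of (entity, type) tuples."""
    rng = random.Random(seed)
    noisy = []
    for e, t in tuples:
        if rng.random() < noise_rate:
            t = rng.choice([x for x in all_types if x != t])  # corrupt the type
        noisy.append((e, t))
    return noisy
```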

4.2. Baselines and Configurations

The hyperparameters of our model are tuned over the following ranges: the margins $\gamma, \gamma_g \in \{1, 3, 5, 7, 10\}$; the learning rate $\eta \in \{0.1, 0.01, 0.001, 0.0001\}$; the embedding dimensions of entity and entity type $(d_e, d_t) \in \{(50, 30), (100, 50), (150, 100), (200, 150), (200, 200)\}$; the descend controllers $\alpha, \alpha' \in \{0.96, 0.97, 0.98, 0.99\}$; the ascend controllers $\beta, \beta' \in \{0.001, 0.002, 0.005\}$; and the combination weight $\lambda \in \{0.1, 0.2, 0.3, 0.7, 0.8, 0.9\}$.
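Written as a plain configuration for a grid search (the dictionary structure and key names are ours; the candidate values are those listed above):

```python
# Hyperparameter search space for ConfE, as listed in Section 4.2.
SEARCH_SPACE = {
    "margin_gamma":    [1, 3, 5, 7, 10],
    "learning_rate":   [0.1, 0.01, 0.001, 0.0001],
    "dims_de_dt":      [(50, 30), (100, 50), (150, 100), (200, 150), (200, 200)],
    "alpha_descend":   [0.96, 0.97, 0.98, 0.99],
    "beta_ascend":     [0.001, 0.002, 0.005],
    "lambda_tradeoff": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
}
```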

To determine the best parameter values, we tune ConfE on the validation dataset and select the optimal combination of parameter settings for FB15kET and YAGO43kET separately. Moreover, we utilize the TransE [30] model to initialize the embeddings of entities and relationships.

We compare our model with the recent baselines: ETE [11], TransE-ET [11], TrustE(LT) [42], and TrustE(LT + GT) [42]. The results of all baselines are directly taken from Zhao et al. [42].

4.3. Entity Type Noise Detection

In this experiment, we conduct entity type noise detection, that is, detecting possible noisy entity types according to their tuple scores.

Evaluation protocol: We consider the model score $f(e,t)$ for each entity type tuple. Similar to [29], we rank all (entity, entity type) tuples in the noisy training set by their scores in descending order; therefore, the tuples ranked lower would more likely be noisy ones. We utilize precision/recall curves to demonstrate the effectiveness of our model.

Experimental results: Figures 2 and 3 show the performance of all models on entity type noise detection, from which we find the following: (i) On FB15kET and YAGO43kET, our ConfE model achieves the best performance under different noise rates, which confirms that ConfE can effectively detect entity type noises in KGs. As the recall increases, the improvement of ConfE over the baselines becomes less significant, which reaffirms that the noises greatly impede entity type noise detection. (ii) Compared to YAGO43kET, ConfE performs more significantly on FB15kET. Considering that there are 37 relations in YAGO43kET but 1345 in FB15kET, the sparseness of relationships harms the effectiveness of the type-relation-type training set: such sparseness causes a relation to be connected to too many entity types, so that the relation embedding may not accurately describe its internal connections with different entity types. The results also verify the effectiveness and robustness of our model in both scenarios.
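A minimal sketch of this ranking-based detection protocol, with `score_fn` standing for the trained model score $f(e,t)$ (names are illustrative):

```python
def rank_tuples_by_score(tuples, score_fn):
    """Rank (entity, type) tuples by model score in descending order."""
    return sorted(tuples, key=lambda et: score_fn(*et), reverse=True)

def flag_noise(tuples, score_fn, k: int):
    """Return the k lowest-scoring tuples as noise candidates."""
    return rank_tuples_by_score(tuples, score_fn)[-k:]
```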

4.4. Entity Type Prediction

This task aims to verify the effectiveness of the ConfE model on entity type prediction, that is, completing the missing entity type in a tuple (entity, entity type=?).

Evaluation protocol: For each test tuple, we first remove its entity type and fill the resulting vacancy with every entity type in turn to form candidate tuples. Secondly, we compute the score of each candidate tuple with the function $f(e,t)$ and rank them in descending order; we can then obtain the rank of the original tuple. Finally, we use (i) the mean reciprocal rank (MRR) and (ii) the proportion of correct entity types ranked in the top 10 (HITS@10(%)) as evaluation metrics. Following [30], we report results under the “Raw” and “Filter” settings. The MRR is computed as follows:

$$\mathrm{MRR} = \frac{1}{|S_{\text{test}}|}\sum_{i=1}^{|S_{\text{test}}|}\frac{1}{\mathrm{rank}_i},$$

where $S_{\text{test}}$ is the collection of all testing (entity, entity type) tuples and $\mathrm{rank}_i$ is the rank position of the true candidate tuple for the $i$-th pair.

Experimental results: Tables 3 and 4 show the results of all models on entity type prediction, from which we observe the following: (i) ConfE consistently and significantly outperforms the baselines on the FB15kET noisy testing datasets under all evaluation metrics. This reaffirms the quality of the knowledge embeddings in our ConfE model, which helps both KG entity type prediction and entity type noise detection. (ii) Our ConfE model outperforms the baselines on MRR on the YAGO43kET noisy testing datasets in the “Raw” setting. Compared with HITS@10, MRR places more importance on the average ranking of the original tuple; hence, although ConfE may not always surpass the baselines on HITS@10, it has considerable advantages in improving the average prediction accuracy. (iii) In the “Filter” setting, ConfE performs better on HITS@10 and achieves comparable performance on MRR, which confirms its capability for entity type prediction. Moreover, our model shows stronger adaptability for large-scale data modeling than other state-of-the-art models.
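For reference, a minimal sketch of the two metrics used above, given the 1-based rank of the gold entity type for each test tuple:

```python
def mrr(ranks):
    """Mean reciprocal rank over a list of 1-based gold ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k: int = 10):
    """Percentage of gold entity types ranked within the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)
```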

5. Conclusion and Future Work

We propose ConfE, a novel confidence-aware embedding framework for KG entity typing on noisy knowledge graphs, which takes the (entity, entity type) tuple confidence into consideration. Specifically, we build a bilinear embedding model to model the (entity, entity type) tuples. Moreover, we calculate the tuple confidence by considering only the internal structural information in KGs. We evaluate our model on two tasks, entity type noise detection and entity type prediction. Empirical results on FB15kET and YAGO43kET demonstrate the effectiveness of the proposed ConfE model in entity type noise detection. An interesting direction for future work is to detect noises in entity type instances and entity type triples simultaneously.

Data Availability

The data are available upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Yu Zhao and Jiayue Hou contributed equally.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant no. 61906159, the Sichuan Science and Technology Program under Grant no. 2018JY0607, the Fundamental Research Funds for the Central Universities under Grant no. JBK2003008, the Fintech Innovation Center, and the Financial Intelligence and Financial Engineering Key Laboratory of Sichuan Province.