Abstract

Computing formal concepts is a central task in formal concept analysis (FCA), yet computing all of them remains the main challenge because the number of concepts can grow exponentially and the calculation process is hard to visualize. Building on the basic idea of Depth First Search, this paper presents a visualization algorithm based on the attribute topology of a formal context. The topology is first degenerated into one with fixed start and end points; then, subject to a set of constraints and calculation rules, a global search over this topology yields all formal concepts without repetition or omission. The method makes the calculation of formal concepts precise and easy to operate, keeps the topology intact as a single whole, and is therefore suitable for visualization analysis.

1. Introduction

FCA, a branch of lattice theory introduced by Ganter and Wille [1] in 1982, is a discipline concerned with mathematical concepts and conceptual hierarchies. As a powerful tool for data analysis and knowledge processing, FCA has been widely applied in data mining [2, 3], web search [4, 5], software engineering [6, 7], ontology analysis [8, 9], and other fields, and it still has great potential in applications.

Computing all formal concepts and the concept lattice is the most basic issue in FCA and has been investigated by many researchers from different angles. Algorithms for concept computation and concept lattice generation can generally be divided into three categories: batch processing algorithms [10–14], progressive algorithms [15, 16], and parallel algorithms [17, 18]. A batch processing algorithm first generates all concepts, forms the edges according to the predecessor-successor relationship among them, and then constructs the concept lattice. A progressive algorithm initializes the concept lattice to empty and then updates it according to the results of intersecting each new concept that is to join the lattice with all the concepts of the current lattice. A parallel algorithm splits the formal context into subcontexts, constructs a sublattice for each, and then merges the sublattices. In addition, progressive algorithms [19, 20] and parallel algorithms [21, 22] work very effectively in the calculation of concepts.

However, the structure of a concept lattice is relatively complex, and the relationships among the attributes of a formal context cannot be represented visually. Attribute topology [23–25], presented by Zhang et al., is a new method for representing a formal context. Unlike the traditional representation of a formal context, attribute topology is a weighted graph with the attributes as vertices and the inclusion relations between attributes as weights, so it shows both the coupling between attributes and the coupling strength intuitively. Attribute topology also has a one-to-one relationship with the formal context [23].

Based on this representation, Zhang et al. presented an algorithm that calculates formal concepts from the attribute topology [24]. The method first constructs subtopologies one after another, each with a top-attribute as its core, taking the top-attributes in descending order of their number of connections. All extensions in each subtopology are then obtained by arranging and operating on the available object sets according to the association and correlation intensity, and the concepts of each subtopology are generated with the corresponding intensions; together these are all the concepts of the formal context. This method not only provides a new idea for the calculation of formal concepts but also makes the calculation simple and easy to operate. However, it has two drawbacks. First, it is not suitable for visualization analysis, because the subtopologies are obtained by splitting the whole topology. Second, because of this poor visibility, it is not practical for calculating concepts in formal contexts built from large-scale data.

This paper proposes global formal concepts searching of attribute topology, built on the basic idea of Depth First Search. The method first adds two vertices, the global start and end points, together with their edges to the original attribute topology, degenerating it into a topology with fixed start and end points. Then, subject to the constraints and calculation rules, it repeatedly explores and backtracks over the previously sorted vertices until all paths have been traversed. All formal concepts are obtained by traversing the paths between the global start and end points. The method treats the topology as a complete whole, which avoids decomposing it and reflects the integrity of the algorithm. The calculation process is shown intuitively as the paths are traversed, which gives good visibility.

2. Basic Notions

2.1. The Basis of Formal Context

A formal context, which serves as both the research object and the data representation, is a basic notion of FCA. Here are a few notions about formal context [1].

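Definition 1. A formal context can be expressed as $K = (G, M, I)$, which is composed of two sets $G$ and $M$ and the relation $I$ between them, where $G$ is a set of objects, $M$ is a set of attributes, and $I \subseteq G \times M$ is an object-attribute relation.

Definition 2. Let $K = (G, M, I)$ be a formal context, $A \subseteq G$, $B \subseteq M$; let
$$A' = \{m \in M \mid (g, m) \in I \text{ for all } g \in A\}, \qquad B' = \{g \in G \mid (g, m) \in I \text{ for all } m \in B\}. \tag{1}$$
If $A' = B$ and $B' = A$, then the pair $(A, B)$ is a formal concept. The sets $A$ and $B$ are the extension and intension of $(A, B)$, respectively. The set of all concepts in $K$ is denoted by $\mathfrak{B}(G, M, I)$ or $\mathfrak{B}(K)$.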
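The two operators in Definition 2 can be sketched in a few lines of Python. Table 1 is not reproduced in this text, so the context below is an assumption: it is reconstructed from the attribute extents implied by the concept list in Section 3.2.4 (objects 0 to 8, attributes a to g). The function names are hypothetical.

```python
# Assumed context, reconstructed from the concept list in Section 3.2.4:
# each object is mapped to the attributes it possesses.
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
OBJECTS = frozenset(CONTEXT)
ATTRIBUTES = frozenset("abcdefg")


def common_attributes(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    return frozenset(m for m in ATTRIBUTES if all(m in CONTEXT[g] for g in objs))


def common_objects(attrs):
    """Objects possessing every attribute in attrs (all objects if attrs is empty)."""
    return frozenset(g for g in OBJECTS if all(m in CONTEXT[g] for m in attrs))


def is_concept(extent, intent):
    """True when each side of the pair is exactly the derivation of the other."""
    return (common_attributes(extent) == frozenset(intent)
            and common_objects(intent) == frozenset(extent))
```

For instance, `is_concept({0, 1, 3, 4, 7, 8}, "fg")` holds, whereas the pair with extension {0, 4, 5, 6, 8} and intension "d" is rejected because every object possessing d also possesses g.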

Definition 3. A global concept is either the concept whose intension is the whole attribute set, with the corresponding object set as its extension (the global concept of whole attribute), or the concept whose extension is the whole object set, with the corresponding attribute set as its intension (the global concept of whole object). Every formal context has exactly these two global concepts.
The concept lattice, the core data structure of FCA, is a complete lattice formed by all concepts and the generalization-specialization relationships between them. The Hasse diagram, which draws the partial order of the concept lattice simply and effectively, is the common way to represent the concept lattice and expresses the relationships between all concepts intuitively and completely.
Every vertex in the Hasse diagram is a concept, and the vertices together represent all the concepts of the formal context. The entire Hasse diagram is constructed from all the concepts together with the order (inclusion) relation between formal concepts, and the layers are arranged in descending order of extension (ascending order of intension).
Formal contexts mentioned in this paper are all simplified contexts [23–25].

2.2. The Basis of Attribute Topology

From the perspective of graph theory, the attribute topological representation is a weighted graph that describes the relationships among attributes. It can therefore use a standard graph storage method, namely the adjacency matrix. In [23, 24], attribute topology is represented by an adjacency matrix built from the inclusion relations between attributes.

Suppose , is the attribute topological representation for formal context . is the set of vertices in attribute topology. Edge is the set of weights on edges in attribute topology.
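A weighted adjacency matrix of this kind might be built as follows. This is only a sketch under two stated assumptions: the context is the one reconstructed from the concept list in Section 3.2.4 (Table 1 is not reproduced here), and the edge rule (diagonal entries hold an attribute's own objects, off-diagonal entries hold the shared objects, and no edge leaves an attribute whose objects are all possessed by the target) is inferred from the later remark that edges at top-attributes are bidirectional or point outward. The exact matrix of [23, 24] may differ.

```python
# Assumed context (reconstructed, not copied from Table 1).
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
ATTRS = "abcdefg"
# Objects possessing each attribute.
EXTENT = {m: frozenset(g for g, ms in CONTEXT.items() if m in ms) for m in ATTRS}


def edge(mi, mj):
    """Weight on the edge from mi to mj under the assumed rule."""
    if mi == mj:
        return EXTENT[mi]
    shared = EXTENT[mi] & EXTENT[mj]
    # Assumption: an attribute contained in the target is only pointed at,
    # so the edge in this direction carries no weight.
    if shared == EXTENT[mi]:
        return frozenset()
    return shared


# Nested dict used as an adjacency matrix: MATRIX[mi][mj] is a set of objects.
MATRIX = {mi: {mj: edge(mi, mj) for mj in ATTRS} for mi in ATTRS}
```

With this rule, `MATRIX["g"]["d"]` is {0, 4, 5, 6, 8} while `MATRIX["d"]["g"]` is empty, giving a unidirectional edge from g to d; attributes a and b share {2, 3, 5, 8} in both directions.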

In this paper, adjacency matrix, induced in [23, 24], is streamlined, and the adjacency matrix modified is

Here are a few notions about attribute topology [22–25].

Definition 4. A global attribute is an attribute that possesses all objects. In the formal context , if and , then attribute is a global attribute. Correspondingly, a global object is an object that possesses all attributes.

Definition 5. Empty attribute is the attribute that does not possess any object in formal context. In , if and , then attribute is an empty attribute. Correspondingly, empty object is the object that does not possess any attribute.
According to lattice theory, global objects and global attributes appear only at the top and bottom of the concept lattice and do not influence its structure, so they can be removed from the concept lattice [24]. Empty objects and empty attributes, which have no association with any other attributes or objects, exist independently and have no influence on the process of calculating concepts.

Definition 6. In attribute topology, if any satisfying , then attribute is called top-attribute [21].
Edges connected to the top-attributes in attribute topology are bidirectional or unidirectional pointing to the outside.

Definition 7. In , , let be a nonempty set, for satisfying ; then attribute is the child-attribute of attribute and attribute is the father-attribute of attribute [25].

Theorem 8. In , if and for , satisfying , that is, attribute is the child-attribute of some attributes; attribute may possess several father-attributes.
is induced by all father-attributes of child-attribute.

Theorem 9. A top-attribute is definitely not the child-attribute.
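The split into top-attributes and child-attributes can be illustrated with a short sketch. It assumes that an attribute is a child-attribute of another exactly when its object set is a proper subset of the other's, which agrees with the father-attributes reported for the example in Section 3.2.1; the context is the one reconstructed from the concept list in Section 3.2.4, and `classify_attributes` is a hypothetical name.

```python
# Assumed context (reconstructed, not copied from Table 1).
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
EXTENT = {m: frozenset(g for g, ms in CONTEXT.items() if m in ms) for m in "abcdefg"}


def classify_attributes(extent):
    """Return (top-attributes, {child-attribute: its father-attributes})."""
    fathers = {
        # Proper-subset test: m is a child of p when every object of m has p
        # and p has at least one further object.
        m: sorted(p for p in extent if extent[m] < extent[p])
        for m in extent
    }
    tops = sorted(m for m, ps in fathers.items() if not ps)
    children = {m: ps for m, ps in fathers.items() if ps}
    return tops, children


TOPS, CHILDREN = classify_attributes(EXTENT)
```

On the assumed context this yields the top-attributes a, b, c, f, g and the child-attributes d (father g) and e (father c). A child-attribute may have several fathers, as Theorem 8 allows, although none does in this example.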

3. Global Formal Concepts Searching of Attribute Topology

Global formal concepts searching of attribute topology is introduced in this paper in order to construct the relationships between concepts while calculating all the concepts. Starting from the topology degenerated with fixed start and end points, and subject to the constraints and calculation rules, the search forms a series of paths by exploring and backtracking over the previously sorted attribute vertices. All formal concepts are computed along these paths, and the relevant edges of the topology are removed as the search proceeds.

3.1. Degeneration of Attribute Topology

According to [25], all attributes in formal context are divided into the top-attributes and child-attributes. Meanwhile, [24] puts the top-attributes as the core to decompose the topology into some subtopologies.

This paper first establishes the global attribute and the empty attribute from a global point of view. Then, without changing the basic structure of the topology, it constructs a topology model that possesses global start and end points. This model, called the topological degeneration model, represents the degeneration of the topology.

Topological degeneration model mentioned above is described as follows: vertices and () are added to the original topology, with as the global start point and as the global end point. Suppose that sets , for , generate the unidirectional edges and let ; for , let , where End is the terminal symbol and is represented by unidirectional edge when drawing in order to describe the model uniformly.

The establishment of the topological degeneration model can be divided into the following two cases.

(1) No Existence of Child-Attributes. Then let and ; that is, for , let , .

(2) Existence of Child-Attributes. Let be the set of top-attributes and the set of child-attributes; that is, and .

The topological degeneration model is established by the following example: Table 1 shows a formal context that contains child-attributes. The attribute topology of the formal context listed in Table 1 is presented in Figure 1.

In the formal context (see Table 1), is the set of top-attributes and is the set of child-attributes. The degeneration of the topology (see Figure 1) is shown as follows: vertices and are added to the original topology and let , , then construct the unidirectional edges from to every element in set and unidirectional edges from every element in set to . The topology degenerated is presented in Figure 2.

The degeneration of the topology (see Figure 2) is presented by the newly added vertices , , and the dashed lines with an arrow, according to the comparison of Figures 1 and 2. A complete whole of topology is achieved after the degeneration, by constructing the unidirectional edges from global start point to related vertices and unidirectional edges from related vertices to global end point.

The degeneration of the topology (see Figure 2) that contains child-attributes is actually achieved by constructing the unidirectional edges from global start point to all top-attributes and unidirectional edges from all child-attributes to the global end point, while, in terms of the topology without child-attributes, its degeneration is realized by constructing the unidirectional edges from global start point to all attributes and unidirectional edges from all attributes to the global end point.
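The two cases of the degeneration can be sketched as a small graph transformation. The vertex names `"start"` and `"end"`, the function `degenerate`, the edge rule, and the context (reconstructed from the concept list in Section 3.2.4) are all assumptions made for illustration; only the wiring of the start and end points follows the description above.

```python
# Assumed context (reconstructed, not copied from Table 1).
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
ATTRS = "abcdefg"
EXTENT = {m: frozenset(g for g, ms in CONTEXT.items() if m in ms) for m in ATTRS}

# Assumed edge rule: mi -> mj when the object sets overlap and mi's objects
# are not all possessed by mj, so edges run from fathers to children.
EDGES = {
    mi: {mj for mj in ATTRS
         if mj != mi and EXTENT[mi] & EXTENT[mj] and not EXTENT[mi] <= EXTENT[mj]}
    for mi in ATTRS
}
CHILDREN = {m for m in ATTRS if any(EXTENT[m] < EXTENT[p] for p in ATTRS)}
TOPS = set(ATTRS) - CHILDREN


def degenerate(edges, tops, children, start="start", end="end"):
    """Copy the topology and attach the global start and end points."""
    graph = {v: set(targets) for v, targets in edges.items()}
    graph[start] = set(tops)                  # start -> every top-attribute
    sinks = children if children else tops    # no children: every attribute -> end
    for v in sinks:
        graph.setdefault(v, set()).add(end)
    graph[end] = set()
    return graph


GRAPH = degenerate(EDGES, TOPS, CHILDREN)
```

The original edges are copied unchanged, which mirrors the point that the added vertices and edges do not alter the associations already present in the topology.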

The above analysis shows that the two added vertices and their edges have no influence on the original association and correlation intensity between attributes. That is, the structure of the original topology is unchanged and still contains the association and correlation intensity between all attributes and objects. The information required for calculating the concepts is not damaged, so the subsequent calculation of concepts on the degenerated topology is not affected.

From the perspective of graph theory, the degeneration produces a complete diagram with fixed start and end points, suitable for the path searching that follows.

3.2. Algorithm Description

The traversal on path, the case of exploring and backtracking vertices, is fundamentally carried from the global start point and ends with traversing all paths between and . The certificates for exploring and backtracking vertices are provided by the constraints and calculation rules.

3.2.1. Preliminaries

(1) Sorted Vertices. Suppose that is the set of all attributes in attribute topology, the set of all top-attributes , and the set of all child-attributes .

For , let or . represents the number of elements in the collection.

Definition 10. Let be a nonempty set and . A mapping satisfies the following:(1);(2).

According to Definition 10, , an ordered set, is the result of reordering all the elements in set . is the result of reordering elements in by the same ruler.

Let ; that is, .

According to the above analysis, is an ordered set after reordering the set of all attributes added to the global start and end points. The subsequent exploring and backtracking of vertices are based on , in which the global start point is followed by a series of ordered top-attributes followed by ordered child-attributes and the global end point in the last.

In the formal context shown in Table 1, , the set of top-attributes , and set of child-attributes . According to the mapping , , , and .

(2) Improved Description of Child-Attribute. In order to calculate concepts, some related properties of child-attribute are introduced in this section.

Definition 11. (empty or nonempty) is a set that follows the following:(1) is a child-attribute;(2) or ;(3),
where holds the following conditions:(1) is a nonset of attributes;(2);(3)for , where is a nonempty set of objects corresponding to ;(4)for , then ;(5)if , for , satisfying and .

Theorem 12. If , for , must be a complete polygon.

Proof. For , , then For , , then Also combined with Formulae (4) and (5): Also Simultaneous Formulae (6) and (7) will get combined with Formulae (4) and (8): So is a complete polygon.

Theorem 13. If  , for , , satisfying .

Proof. According to Definition 11, for , , there exists , respectively, and : for ;similarly, for ;then for : that is, similarly, for ;that is, .

Theorem 14. For , satisfying .

Proof. Suppose ; that is, there exists element satisfying and: since ,so according to Definition 11, obviously, Formulae (12) and (13) are mutually contradictory, so the assumption is not valid.So .

In the formal context presented in Table 1, , are the father-attributes of child-attributes and , respectively.

As Definition 11 shows , while is provided with , , ; and .

3.2.2. Exploration of Vertex

(1) Representation of the Path. Suppose the set of all attributes in attribute topology is ; that is, .

Definition 15. is triple which satisfies the following properties:(1);(2);(3),where satisfies the following:(1), ,(2),(3),(4), ,(5), ,(6), .

As shown in Definition 15, is uniquely determined by its magnitude and direction if . Its magnitude is expressed by where and its direction is expressed by .

According to the above analysis, the path can be represented by : the attributes successively passed by current path, and , are recorded by . There exists unidirectional edge between each two adjacent vertices; that is, edges passed by the current path are successively. Let be the weights on the edge .

Suppose adding a new vertex to the existing path; then . Update on the path: a new vertex and a new edge are generated; that is, the attributes successively passed by the path are and the weights on the edge are .

Theorem 16. must be a complete polygon in attribute topology if .

Proof. Consider the following: Then for , combined with Formulae (15) and (16), for , So must be a complete polygon.

Lemma 17. not be the intension certainly if .

Proof. Consider the following: Then so To prove that is the intension of a concept, satisfying Formulae (1) is needed; that is, since in this formal context, combined with Formulae (22) and (23): Obviously, Formulae (21) and (24) are mutually contradictory; that is, is not the intension.

Lemma 18. must be the intension of a concept if and

Proof. Consider the following: and then since so for , According to Formulae (1), to prove the intension of a formal concept, it is only needed to satisfy . From the relationship of formal context, if ; then there must be where and , and then Also Simultaneous Formulae (30) and (31) will get Combined with Formulae (27) and (32): apparently the result is wrong.
So So is the intension of a formal concept.

(2) Process of Depth Exploration of Vertex. Depth exploration of vertices in topology is essentially the process of exploring the adjacent points through the arcs between vertices. Suppose the set of all attributes of formal context is , and the sets and are the sets of top-attributes and child-attributes, respectively. According to Section 3.2.1(1), the set of vertices sorted is , based on which to explore vertices.

This algorithm begins with the first element, that is, start point , and then explores the subsequent elements successively. For the sorted set , .

Suppose ; is the current path and ordered set is the set of attributes which passed by the path successively. Suppose that attribute is the current attribute explored, and the process of exploring and the constrains of exploring are plotted in Figure 3.

As seen in Figure 3, when exploring the attribute , constraints are shown as follows:(1), , or ;(2) is a child-attribute;(3);(4);(5)for , satisfying , or ;(6).

Figure 3 shows that the current path is updated, that is, and , when the attribute meets any one of the following conditions, that is, the constraints of exploring:(a);(b);(c).

It is necessary to traverse the attribute next to attribute , , when it meets the condition , that is, when the constraints of exploring are not satisfied.

Then we analyze the exploring process through an example. For the formal context shown in Table 1, .

This algorithm begins with the start point, that is, , ; current attribute explored . Attribute satisfies the following conditions: is not the child-attribute; ; that is, attribute meets the above condition (a). The current path is updated, ; the weights on the newly added edge are ; and the direction of the path updated is , that is, from to .

The process above can be demonstrated by Figure 4.

Figure 4 displays that the path : is presented as the arrow shown; , as the weights, is marked on the newly generated edge.

Additionally, suppose ; is the current path, and current attribute explored . Attribute satisfies the following conditions:(1), ;(2) is a child-attribute;(3);(4), ;(5), .

That is, satisfies the above condition ; then it explores the attribute next to attribute : when the current attribute explored is updated from to , the current path does not update.

(3) Updated Data in Process of Depth Exploration of Vertex. As analyzed in the previous section, the path is updated when the explored attribute meets the constraints of exploring and is added to the current path. The data to be updated fall mainly into the following two aspects.

(1) Updated Set of Concepts . Suppose the set of all attributes of formal context is , , . Let the set of attribute category be and the category of attribute be , that is, . For , is initialized to 0. The set of concepts is initialized to .

Suppose the current path is , , , and the current set of concepts is , and for , , that is, .

Suppose the current attribute explored satisfies the constrains of exploring shown in previous section and the path is updated, that is, , , .

Simultaneously, the set is updated.

If , let the newly generated two-tuples . That is, the set is updated to or , ; .

If , , that is, .

(2) Updated Attribute Topology. If , then let ; that is, the edge (unidirectional or bidirectional), between and , is removed from the attribute topology.

If , the attribute remains unchanged.

Conversely, neither of the two updates described above takes place if the current attribute explored does not meet the constraints of exploring given in the previous section.

Then we analyze the updated data through an example.

For the formal context shown in Table 1, suppose the current path is and , , which is presented in Figure 4. Suppose the current attribute explores . Attribute meets the constrains of exploring shown in previous section and path is updated: , , and . The updated path is demonstrated by Figure 5, and for convenience of description, the newly generated edge is presented by the route without an arrow.

The updated data is shown as follows.

Updated Set . and then adds the newly generated two-tuples to the set , that is, , . The update on the set of concepts , along with the update on path, is shown in Table 2.

Updated Attribute Topology. Suppose . Figure 2 presents that which satisfies the condition , so let ; that is, the bidirectional edge, between and , is removed from the attribute topology.

Additionally, Suppose , listed in Figure 6, is the current path and , .

Suppose the current attribute explored . Attribute satisfies the constrains of exploring and then the path is updated: , . The updated path is presented in Figure 7.

The updated data is shown as the following.

Updated Set . Suppose . Then the element in set is replaced with the newly generated two-tuples ; that is, the number of elements in set remains unchanged: .

Table 3 lists each update on set , along with each update on path, from path to the current path .

Updated Attribute Topology. Suppose . Figure 2 presents that which satisfies the condition , so the attribute topology remains unchanged.

3.2.3. Backtracking of Vertex

Suppose the set of all attributes of formal context is , , and the current path is , , .

Backtracking the vertex occurs if the current attribute explored satisfies either of the following two conditions, that is, the constraints of backtracking:(1);(2).

Suppose the current path is , and . When the current attribute explored satisfies the constrains of backtracking, backtracking the vertex: , that is, , , , , and set remains unchanged.

Combined with the description above, the process of backtracking vertices is listed in Figure 8. represents the last element of the order set.

The process of backtracking vertices is finished when it meets the condition , according to Figure 8.

Then, the process of backtracking vertices is illustrated by the example.

Suppose the current path is shown in Figure 7. According to Section 3.2.2, , .

Suppose the current attribute explored , which satisfies the constrains of backtracking, then it backtracks the vertex, that is, , , .

Then a judge on is achieved, presented in Figure 9. Due to the fact that satisfies the constrains of backtracking, backtracking of vertex is achieved again, that is, , , and .

Repeating the judge on , and , which leads to the completion of the backtracking vertices. Then the current attribute explored is updated to and the set remains unchanged. The updated path is shown in Figure 9.

3.2.4. Discussion

The flowchart of the complete algorithm is shown in Figure 10.

As seen in Figure 10, exploring resumes once a backtracking process is finished, and during depth exploration backtracking is triggered whenever the constraints of backtracking are satisfied. That is, all paths between the two fixed points that meet the related constraints are obtained by repeatedly exploring and backtracking vertices. The updates of the set of concepts and of the attribute topology are made while the paths are traversed.

Figure 10 presents that the algorithm ends when it meets the condition .

The update on set and attribute topology, after the end of the algorithm, are demonstrated, respectively, by the example presented above:(1)updated topology is shown in Figure 11;(2)updated set .

Each two-tuples is updated to due to the fact that the start point is not of practical significance. Then every element in set is a concept; that is, in formal context .
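As an independent cross-check of the resulting concept set, a depth-first enumeration can be run on the same data. The sketch below is not the path search of this paper: it swaps in the standard Close-by-One scheme, which also explores attributes in a fixed order and prunes repeated branches with a canonicity test. The context is an assumption, reconstructed from the concept list later in this section, and the function names are hypothetical.

```python
# Assumed context (reconstructed, not copied from Table 1).
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
ATTRS = "abcdefg"  # fixed attribute order used by the canonicity test
EXTENT = {m: frozenset(g for g, ms in CONTEXT.items() if m in ms) for m in ATTRS}


def intent_of(objs):
    """Attributes possessed by every object in objs."""
    return frozenset(m for m in ATTRS if objs <= EXTENT[m])


def all_concepts():
    """Enumerate every (extension, intension) pair exactly once, depth first."""
    found = []

    def explore(extent, intent, first):
        found.append((extent, intent))
        for j in range(first, len(ATTRS)):
            m = ATTRS[j]
            if m in intent:
                continue
            new_extent = extent & EXTENT[m]
            new_intent = intent_of(new_extent)
            # Canonicity: skip the branch if closing pulled in an attribute
            # that precedes m and was not already in the current intension.
            if all(n in intent for n in new_intent if ATTRS.index(n) < j):
                explore(new_extent, new_intent, j + 1)

    everything = frozenset(CONTEXT)
    explore(everything, intent_of(everything), 0)
    return found


CONCEPTS = all_concepts()
```

On the assumed context the enumeration returns 48 pairs with pairwise distinct extensions, the same number as the concepts c(0)–c(47) reported for the example.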

The traversal process of all paths between the start and end points is illustrated by the tree structure shown in Figure 12.

Figure 12 displays that each path begins with and ends with a leaf node. Each path is generated from top to bottom and all paths are formed from left to right.

Any node in Figure 12 can be seen as a concept, with the set of all attributes passed by the current path in which current node located, from to node , as the intension and the weights on the edge, whose lower end is connected to the current node , as the extension. The node presents that attribute is also passed by the current path where node located.
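The reading of a tree node as a pair (weight on the incoming edge, attributes on the path) suggests a simple stack representation of the current path. The sketch below assumes that the weight of a newly explored edge is the previous weight intersected with the objects of the new attribute, and that the start point carries every object; neither rule is stated in this form in the text. The class name `Path`, the vertex name `"start"`, and the context (reconstructed from the concept list in this section) are hypothetical, and the constraints of exploring and backtracking are deliberately left out, so a node is a formal concept only when those constraints would have admitted the attribute.

```python
# Assumed context (reconstructed, not copied from Table 1).
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
EXTENT = {m: frozenset(g for g, ms in CONTEXT.items() if m in ms) for m in "abcdefg"}


class Path:
    """Current path; weights[i] labels the edge entering vertices[i]."""

    def __init__(self):
        self.vertices = ["start"]             # hypothetical name of the start point
        self.weights = [frozenset(CONTEXT)]   # assumption: start carries all objects

    def explore(self, attr):
        """Append attr and weight the new edge with the objects still shared."""
        self.vertices.append(attr)
        self.weights.append(self.weights[-1] & EXTENT[attr])

    def backtrack(self):
        """Drop the last vertex together with the edge that entered it."""
        self.weights.pop()
        return self.vertices.pop()

    def node(self):
        """(weight on the incoming edge, attributes passed by the path)."""
        return self.weights[-1], frozenset(self.vertices[1:])
```

Exploring g, then f, then d gives the nodes ({0,1,3,4,5,6,7,8}, g), ({0,1,3,4,7,8}, gf) and ({0,4,8}, gfd); one backtracking step returns to the gf node.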

As seen in Figure 13, the concepts represented by nodes in Figure 12 are indicated on the corresponding nodes and generated a concept tree. Figure 13 displays all concepts except the global concept .

Figure 14 shows the Hasse diagram of the formal context listed in Table 1; each node represents a formal concept.
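The edges of such a Hasse diagram are the covering pairs of the inclusion order on extensions, and they can be computed directly. The sketch below assumes the context reconstructed from the concept list in this section and identifies each concept with its extension; `all_extents` and `hasse_edges` are hypothetical names.

```python
# Assumed context (reconstructed, not copied from Table 1).
CONTEXT = {
    0: "acdefg", 1: "fg", 2: "abcef", 3: "abcfg", 4: "bdfg",
    5: "abcdg", 6: "bcdeg", 7: "afg", 8: "abdfg",
}
EXTENT = {m: frozenset(g for g, ms in CONTEXT.items() if m in ms) for m in "abcdefg"}


def all_extents():
    """Close the attribute object sets under intersection; one set per concept."""
    extents = {frozenset(CONTEXT)}
    for ext in EXTENT.values():
        extents |= {e & ext for e in extents}
    return extents


def hasse_edges(extents):
    """Pairs (lower, upper) with lower strictly inside upper and nothing between."""
    edges = set()
    for lower in extents:
        uppers = [u for u in extents if lower < u]
        for upper in uppers:
            if not any(lower < mid < upper for mid in uppers):
                edges.add((lower, upper))
    return edges


EXTENTS = all_extents()
EDGES = hasse_edges(EXTENTS)
TOP = frozenset(CONTEXT)
below_top = {lower for lower, upper in EDGES if upper == TOP}
above_bottom = {upper for lower, upper in EDGES if lower == frozenset()}
```

On the assumed context, the concept with all objects has five lower neighbours (the object sets of a, b, c, f, and g), and the concept with no objects has six upper neighbours (the single-object extensions {0}, {2}, {3}, {5}, {6}, {8}).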

The concepts successively represented by c(0)–c(47) are (012345678,ø), (01345678,g), (0123478,f), (013478,gf), (04568,gd), (048,gfd), (02356,c), (0356,cg), (026,ce), (056,cgd), (06,cgde), (234568,b), (34568,bg), (2348,bf), (348,bgf), (4568,bgd), (48,bdfg), (2356,bc), (356,bcg), (26,bce), (56,bcgd), (6,bcgde), (023578,a), (03578,ag), (02378,af), (0378,agf), (058,agd), (08,agfd), (0235,ac), (035,acg), (023,acf), (03,acgf), (02,acfe), (05,acgd), (0,acgfde), (2358,ab), (358,abg), (238,abf), (38,abgf), (58,abgd), (8,abgfd), (235,abc), (35,abcg), (23,abcf), (3,abcgf), (2,abcfe), (5,abcgd), and (ø,abcdefg), that is, all concepts in the formal context.

Under the constraints and calculation rules, all formal concepts are computed by traversing the vertices successively. Each path is formed by the traversed vertices, and all concepts are produced in the process of traversing all paths. As shown in Figures 12 and 13, this traversal is represented by the concept tree, which demonstrates the calculation of all the concepts intuitively.

As seen in Figure 13, global formal concepts searching of attribute topology obtains all concepts completely and presents the calculation process intuitively. Comparing Figures 13 and 14 shows that the concept tree, as a part of the Hasse diagram, not only demonstrates the hierarchical structure among concepts clearly but is also much simpler than the Hasse diagram.

4. Conclusion

Global formal concepts searching of attribute topology is proposed in this paper, building on the earlier calculation of concepts with subtopologies. Following the basic idea of Depth First Search, the algorithm begins at the global start point and uses the constraints and calculation rules to explore and backtrack over the attributes of the degenerated topology repeatedly until all paths have been traversed. The set of concepts is updated throughout the process, and all concepts are obtained in the end. The method avoids decomposing the whole topology, which reflects the integrity of the algorithm, and the concept tree enhances the visualization of the calculation process. The method makes the whole process more logical and feasible, easy to implement, and suitable for large-scale data sets. Attribute topology provides a new approach to representing the formal context. The next step is to refine and optimize the method and put it into application.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (nos. 61273019 and 81373767) and National Social Science Foundation of China (no. 12BYY121). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.