Input: |
(i) The candidate topic discriminative terms of a given document. |
Output: |
(ii) The set of topical clusters with topical describing information. |
Procedure: |
(1) Compute the re-occurrence topic span of each candidate topic discriminative term (TDT). |
(2) Sort the candidate topic discriminative terms by the length of their re-occurrence topic spans. |
(3) repeat |
(4) Process the TDTs iteratively, from the one with the longest re-occurrence topic span to the one with the shortest. |
(5) Judge the overlap relationship between the topic spans of any two TDTs [, ]. |
(6) if (the topic span intervals of the two TDTs intersect) |
(7) Directly connect the two vertices that correspond to these topic span intervals. |
(8) else |
(9) The two vertices belong to different topic span intervals, that is, to different subgraphs. |
(10) until all independent subgraphs are constructed; |
(11) Labeling each subgraph for , where . |
(12) Connect all independent subgraphs through the vertices that are directly proximal in the corresponding TDTs' topic spans, so that a whole topical graph is constructed. |
(13) repeat |
(14) if (there exists a TDT that is monosemous) |
(15) Select the one with the higher weight as the initial vertex. |
(16) Iteratively calculate the similarity of topic chains between this vertex and the other vertices with its neighbors using formula (3). |
(17) else |
(18) Iteratively calculate the similarity of topic chains, starting from the vertex with the maximal weight of the two vertices, using formula (4). |
(19) until each candidate TDT has exactly one topic chain; |
(20) Update the edge weights using formula (5) and prune completely irrelevant edges from the topical graph. |
(21) Readjust the topical conflict interval between the two vertices of the corresponding TDTs that were connected by a pruned edge. |
(22) S <- the vertices of maximum degree in the topical graph. |
(23) Integrate the other vertices into topical clusters according to their adjacency and similarity relationships. |
(24) Ignore isolated vertices and topical clusters that are too small. |
(25) Generate topical describing information to perform topic identification. |
(26) Return the set of topical clusters with topical describing information. |
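
The graph-construction part of the procedure (steps 1 to 10) and the seed selection of step 22 can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that the listing does not state: a topic span is taken to be the closed interval between the first and last sentence in which a term occurs, terms are matched as lowercase whitespace-separated tokens, and the function names (`topic_spans`, `build_span_graph`, `subgraphs`, `max_degree_vertices`) are hypothetical. The similarity and weight updates of formulas (3) to (5) are not reproduced here, so steps 13 to 21 are omitted.

```python
def topic_spans(sentences, terms):
    """Step (1), under an assumed definition: the span of a term is the
    (first, last) sentence index in which the term occurs."""
    spans = {}
    for idx, sentence in enumerate(sentences):
        tokens = set(sentence.lower().split())
        for term in terms:
            if term in tokens:
                first, _ = spans.get(term, (idx, idx))
                spans[term] = (first, idx)
    return spans


def build_span_graph(spans):
    """Steps (2)-(9): visit terms from longest to shortest span and
    connect two vertices whenever their span intervals intersect."""
    # Sort by descending span length; ties are broken alphabetically.
    ordered = sorted(spans, key=lambda t: (spans[t][0] - spans[t][1], t))
    graph = {term: set() for term in ordered}
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            (a_start, a_end), (b_start, b_end) = spans[a], spans[b]
            if a_start <= b_end and b_start <= a_end:  # closed intervals overlap
                graph[a].add(b)
                graph[b].add(a)
    return graph


def subgraphs(graph):
    """Step (10): independent subgraphs are the connected components."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            vertex = stack.pop()
            if vertex in component:
                continue
            component.add(vertex)
            stack.extend(graph[vertex] - component)
        seen |= component
        components.append(component)
    return components


def max_degree_vertices(graph):
    """Step (22): the set S of vertices with maximum degree."""
    if not graph:
        return set()
    top = max(len(neighbors) for neighbors in graph.values())
    return {v for v, neighbors in graph.items() if len(neighbors) == top}


# Toy document (invented for illustration only).
sentences = [
    "solar panels convert light",
    "solar cells need silicon",
    "silicon prices rose",
    "football season started",
    "football fans cheered",
]
terms = ["solar", "silicon", "football", "light"]

spans = topic_spans(sentences, terms)
graph = build_span_graph(spans)
components = subgraphs(graph)
seeds = max_degree_vertices(graph)
```

On this toy input, "solar", "silicon" and "light" fall into one subgraph because their spans chain together through "solar", while "football" forms its own subgraph. A quadratic pairwise comparison is used for clarity; a sweep over sorted interval endpoints would scale better for long documents.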