| Input: the document collection F = {f1, f2, …, fn}, a semantic dictionary D generated by applying “Word2Vec” to F. |
| Output: the index tree T. |
(1) | for each fi ∈ F do: |
(2) | Construct a leaf node u for fi, with u.ID = GenID (), u.Pl = u.Pr = NULL, u.FID = GenFID (fi,), and generate and according to the method M1; |
(3) | Insert u into CurrentNodeSet; |
(4) | end for |
(5) | while |CurrentNodeSet| > 1 do: |
(6) | if |CurrentNodeSet| is even, i.e. 2h then: |
(7) | for each pair of nodes u′ and u″ in CurrentNodeSet do: |
(8) | Create a parent node u for u′ and u″, with u.ID = GenID (), u.Pl = u′, u.Pr = u″, u.FID = NULL, and set and according to the method M2; |
(9) | Insert u into TempNodeSet; |
(10) | end for |
(11) | else \\Suppose that |CurrentNodeSet| = 2h + 1 |
(12) | for each pair of nodes u′ and u″ among the first 2h − 2 nodes in CurrentNodeSet do: |
(13) | Create a parent node u for u′ and u″; |
(14) | Insert u into TempNodeSet; |
(15) | end for |
(16) | Create a parent node u1 for the (2h − 1)-th and (2h)-th nodes, and then generate a parent node u for the (2h + 1)-th node and u1; |
(17) | Insert u into TempNodeSet; |
(18) | end if |
(19) | Set CurrentNodeSet = TempNodeSet and clear TempNodeSet; |
(20) | end while |
(21) | return CurrentNodeSet; |
(22) | \\Note that CurrentNodeSet now contains only one node, which is the root of the index tree T. |
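
The bottom-up pairing in the algorithm above can be illustrated with a minimal Python sketch. It covers only the tree shape: the vectors produced by methods M1 and M2 are omitted, because they are not defined in this listing. The names `Node`, `build_index_tree`, and `leaf_fids` are hypothetical, a simple counter stands in for GenID, the file identifier is used directly in place of GenFID, and placing u1 as the left child in the odd case is an assumption.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional


@dataclass
class Node:
    id: int                          # stands in for u.ID = GenID()
    left: Optional["Node"] = None    # u.Pl
    right: Optional["Node"] = None   # u.Pr
    fid: Optional[str] = None        # u.FID (None for internal nodes)


def build_index_tree(file_ids: List[str]) -> Node:
    """Build the tree bottom-up by pairing nodes level by level."""
    if not file_ids:
        raise ValueError("the document collection must not be empty")
    gen_id = count(1)  # assumed ID generator, not the paper's GenID
    # Steps (1)-(4): one leaf per document.
    current = [Node(id=next(gen_id), fid=f) for f in file_ids]
    # Steps (5)-(20): merge until a single root remains.
    while len(current) > 1:
        temp: List[Node] = []
        odd = len(current) % 2 == 1
        # Even case pairs every node; odd case pairs only the first 2h - 2.
        paired = len(current) - 3 if odd else len(current)
        for i in range(0, paired, 2):
            temp.append(Node(next(gen_id), current[i], current[i + 1]))
        if odd:
            # Step (16): join the last three nodes under two new parents.
            u1 = Node(next(gen_id), current[-3], current[-2])
            temp.append(Node(next(gen_id), u1, current[-1]))
        current = temp
    return current[0]


def leaf_fids(node: Node) -> List[str]:
    """Collect leaf file identifiers from left to right."""
    if node.left is None and node.right is None:
        return [node.fid]
    return leaf_fids(node.left) + leaf_fids(node.right)


root = build_index_tree(["f1", "f2", "f3", "f4", "f5"])
print(leaf_fids(root))
```

With five documents, the first round takes the odd branch (h = 2): f1 and f2 are paired, and f3, f4, f5 are joined under two new parents, leaving two nodes that the second round merges into the root.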