Research Article
Gene Sequence Clustering Based on the Profile Hidden Markov Model with Differential Identifiability
Algorithm 2
Gene sequence clustering algorithm based on DI-PHMM (DI-GSCA).
Input: number of clusters, training sequence data, DI parameter, and number of iteration rounds (optional)
Output: index of the cluster to which each sequence belongs
(1) …
(2) for … in …
(3) for … in …
(4) //Calculate the score of the sequence for each PHMM
(5) …
(6) //Assign the sequence to the cluster with the highest score
(7) …
(8) for … in …
(9) if (…):
(10) …
(11) else
(12) //The privacy parameter is assigned according to whether the number of iteration rounds is fixed
(13) //Construct a new cluster center sub-model
(14) …
(15) //Degree of change of the model since the last iteration (divergence distance)
(16) …
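The loop structure of Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' DI-PHMM implementation, and several parts are assumptions made only for illustration: each cluster center is a noisy symbol-frequency model rather than a profile HMM, Laplace noise stands in for the differential identifiability mechanism (the noise scale is not a calibrated privacy guarantee), the budget is split evenly when the number of rounds is fixed and halved each round otherwise, and the divergence distance is taken to be a symmetric Kullback-Leibler divergence. The names `cluster`, `build_model`, `score`, and `divergence` are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"  # assumption: sequences contain only these symbols


def laplace(scale, rng):
    # The difference of two Exp(1) draws follows Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def build_model(seqs, eps, rng):
    """Noisy symbol-frequency model: a simplified stand-in for a PHMM center."""
    counts = {a: 1.0 for a in ALPHABET}  # add-one smoothing
    for s in seqs:
        for ch in s:
            counts[ch] += 1.0
    # Illustrative noise only; not a calibrated DI or DP mechanism.
    noisy = {a: max(c + laplace(1.0 / eps, rng), 1e-6) for a, c in counts.items()}
    total = sum(noisy.values())
    return {a: v / total for a, v in noisy.items()}


def score(seq, model):
    """Log-likelihood of a sequence under a frequency model."""
    return sum(math.log(model[ch]) for ch in seq)


def divergence(p, q):
    """Symmetric KL divergence between two frequency models."""
    return sum((p[a] - q[a]) * math.log(p[a] / q[a]) for a in ALPHABET)


def cluster(seqs, k, eps_total, rounds=None, tol=1e-3, max_rounds=20, seed=0):
    rng = random.Random(seed)
    # Data-independent random initial models (assumption).
    models = []
    for _ in range(k):
        w = [rng.random() + 0.1 for _ in ALPHABET]
        models.append({a: x / sum(w) for a, x in zip(ALPHABET, w)})
    t = 0
    while True:
        t += 1
        # Assumed budget rule: even split if rounds is fixed, else halve each round.
        eps_t = eps_total / rounds if rounds else eps_total / 2 ** t
        # Assign each sequence to the model that scores it highest.
        labels = [max(range(k), key=lambda j: score(s, models[j])) for s in seqs]
        change = 0.0
        for j in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == j]
            new_model = build_model(members, eps_t, rng)
            change += divergence(new_model, models[j])
            models[j] = new_model
        if rounds:
            if t >= rounds:
                return labels
        elif change < tol or t >= max_rounds:
            return labels


if __name__ == "__main__":
    data = ["AAAAAC", "AAACAA", "GGGGTG", "GTGGGG", "AACAAA", "GGGTGG"]
    print(cluster(data, k=2, eps_total=5.0, rounds=4))
```

Passing `rounds` fixes the number of iterations; omitting it lets the loop stop once the summed divergence between consecutive models falls below `tol` or `max_rounds` is reached.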