Research Article

A Dirichlet Process Mixture Based Name Origin Clustering and Alignment Model for Transliteration

Algorithm 1

The Blocked Gibbs Sampling Algorithm.
Input:
 Random initial corpus segmentation
Output:
  Unsupervised co-segmentation of the corpus according to the model
for each sampling iteration do
 for each bilingual word pair in the corpus do
  for each co-segmentation of the word pair do
   Compute the probability of the co-segmentation under the distribution estimated from all previously generated samples, excluding those from the current word pair;
  end for
  Sample a co-segmentation from the resulting distribution;
  Update counts;
 end for
end for
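
The loop structure of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single Dirichlet-process unigram model over co-segmentation units (no origin clusters), a simple length-penalizing base measure that is not normalized, and brute-force enumeration of co-segmentations, which is only feasible for short words. The names `cosegmentations`, `base_prob`, `score`, `blocked_gibbs`, and the parameters `alpha` and `decay` are hypothetical.

```python
import random
from collections import Counter


def cosegmentations(src, tgt):
    """Yield every monotone co-segmentation of (src, tgt) as a tuple of
    (src_chunk, tgt_chunk) units; both chunks of a unit are non-empty."""
    if not src and not tgt:
        yield ()
        return
    for i in range(1, len(src) + 1):
        for j in range(1, len(tgt) + 1):
            for rest in cosegmentations(src[i:], tgt[j:]):
                yield ((src[:i], tgt[:j]),) + rest


def base_prob(unit, decay=0.1):
    """Assumed base measure: longer units are exponentially less likely."""
    src, tgt = unit
    return decay ** (len(src) + len(tgt))


def score(seg, counts, total, alpha):
    """Chinese-restaurant-process predictive probability of a whole
    co-segmentation, given counts that exclude the current word pair."""
    p = 1.0
    seen = Counter()  # units already scored earlier in this same segmentation
    for k, unit in enumerate(seg):
        p *= (counts[unit] + seen[unit] + alpha * base_prob(unit)) / (total + k + alpha)
        seen[unit] += 1
    return p


def blocked_gibbs(corpus, iterations=20, alpha=1.0, seed=0):
    rng = random.Random(seed)
    # Input: random initial corpus segmentation.
    state = [rng.choice(list(cosegmentations(s, t))) for s, t in corpus]
    counts = Counter(u for seg in state for u in seg)
    total = sum(counts.values())
    for _ in range(iterations):
        for idx, (s, t) in enumerate(corpus):
            # Exclude the current word pair's units from the counts.
            for u in state[idx]:
                counts[u] -= 1
            total -= len(state[idx])
            # Score every co-segmentation of the pair, then sample one (the "block").
            candidates = list(cosegmentations(s, t))
            weights = [score(c, counts, total, alpha) for c in candidates]
            state[idx] = rng.choices(candidates, weights=weights, k=1)[0]
            # Update counts with the newly sampled co-segmentation.
            for u in state[idx]:
                counts[u] += 1
            total += len(state[idx])
    return state, counts
```

Calling `blocked_gibbs([("smith", "sumisu"), ("smit", "sumito")])` returns one sampled co-segmentation per word pair together with the unit counts. For realistic word lengths, the exhaustive enumeration would need to be replaced by a dynamic-programming sampler over the segmentation lattice.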