Research Article

Predicting Component Failures Using Latent Dirichlet Allocation

Figure 1

Process for predicting component failures using LDA. (a) Component failures and component source code are extracted from a bug database and the source code repository. (b) Failure density is defined as the ratio of the number of failures to the number of files. Com1, Com2, and Com3 denote three components. Failure density is mapped to topics using the estimated topic distribution of each component, which yields the topic failure density (TFD); the figure shows three topics. (c) A similarity matrix is calculated to capture the similarity between topics of the previous version and topics of the next version. Finally, the TFD of the next version and the component failures are predicted from the TFD of the previous version and the similarity matrix.
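
The three steps in the caption can be illustrated with a minimal sketch. The caption does not give the exact formulas, so the weighted sums below are assumptions: TFD is taken as the topic-weighted sum of component failure densities, the next-version TFD as the previous TFD multiplied by the similarity matrix, and the component score as the topic-weighted sum of the predicted TFD. All function names and the toy numbers are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def failure_density(failures: List[int], files: List[int]) -> List[float]:
    """Step (b): failures per file for each component."""
    return [f / n for f, n in zip(failures, files)]


def topic_failure_density(theta: Matrix, density: List[float]) -> List[float]:
    """Map component failure density onto topics (assumed weighted sum).

    theta[c][k] is the estimated share of topic k in component c.
    """
    n_topics = len(theta[0])
    return [
        sum(theta[c][k] * density[c] for c in range(len(theta)))
        for k in range(n_topics)
    ]


def predict_next_tfd(tfd_prev: List[float], similarity: Matrix) -> List[float]:
    """Step (c): assumed product of the previous TFD and the similarity matrix.

    similarity[i][j] is the similarity of previous-version topic i
    to next-version topic j.
    """
    n_next = len(similarity[0])
    return [
        sum(tfd_prev[i] * similarity[i][j] for i in range(len(tfd_prev)))
        for j in range(n_next)
    ]


def component_scores(theta_next: Matrix, tfd_next: List[float]) -> List[float]:
    """Assumed mapping from predicted TFD back to per-component scores."""
    return [sum(w * t for w, t in zip(row, tfd_next)) for row in theta_next]


# Toy data for three components (Com1, Com2, Com3) and three topics.
failures = [6, 2, 0]
files = [3, 4, 5]
theta = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.25, 0.25, 0.5],
]
# Identity similarity: every topic is assumed to carry over unchanged.
similarity = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

density = failure_density(failures, files)
tfd_prev = topic_failure_density(theta, density)
tfd_next = predict_next_tfd(tfd_prev, similarity)
scores = component_scores(theta, tfd_next)
```

With an identity similarity matrix the predicted TFD equals the previous TFD. A real similarity matrix would redistribute failure density across topics that drift, split, or merge between versions.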