Research Article
Gene Prioritization of Resistant Rice Gene against Xanthomonas oryzae pv. oryzae by Using Text Mining Technologies
Algorithm 1
Gene prioritization algorithm.
Step 1. Collect NCBI literature in the rice research field and organize it into text databases labeled by the keywords “rice”, “Event”, “Binding”, “Catabolism”, “Expression”, “Localization”, “phosphorylation”, “regulation”, “transcript”, and “Xoo”.
Step 2. Build a phrase dictionary of candidate terms.
Step 3. Evaluate the relevance between each term and each text database by computing TF*IDF, with the total text data set as the reference corpus.
Step 4. Rank the important terms.
Step 5. Retrieve proteins from NCBI whose annotations include the important terms.
Step 6. Rank the candidate proteins using the built-in sequence-based classifier [17].
Step 7. Use the Conserved Domain Database (CDD) and Gene Ontology (GO) to verify the prioritization results.
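The term-scoring and term-ranking steps (Steps 3 and 4) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exact TF*IDF variant is not specified in the algorithm, so the sketch assumes a length-normalized term frequency and a logarithmic inverse document frequency, treats each keyword-labeled text database as one document, and treats the set of all databases as the corpus. The function names `tf_idf_scores` and `rank_terms`, the toy token lists, and the restriction to single-token terms are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(databases, terms):
    """Return {label: {term: tf * idf}} for keyword-labeled text databases.

    databases: dict mapping a label (e.g. "Xoo") to a list of tokenized texts.
    terms: iterable of dictionary terms (single tokens, for simplicity).
    """
    # Assumption: each labeled database acts as one "document" for TF,
    # and the collection of all databases is the corpus for IDF.
    counts = {
        label: Counter(token for text in texts for token in text)
        for label, texts in databases.items()
    }
    n_databases = len(counts)
    scores = {}
    for label, counter in counts.items():
        total_tokens = sum(counter.values())
        scores[label] = {}
        for term in terms:
            # Number of databases in which the term occurs at least once.
            df = sum(1 for other in counts.values() if other[term] > 0)
            if df == 0 or total_tokens == 0:
                scores[label][term] = 0.0
                continue
            tf = counter[term] / total_tokens
            idf = math.log(n_databases / df)
            scores[label][term] = tf * idf
    return scores


def rank_terms(scores, label):
    """Sort terms for one database by descending score, ties by name."""
    return sorted(scores[label], key=lambda t: (-scores[label][t], t))


# Hypothetical toy data, not taken from the NCBI literature.
toy_databases = {
    "Xoo": [["xa21", "kinase", "resistance"], ["xa21", "blight", "resistance"]],
    "Expression": [["promoter", "kinase", "resistance"]],
    "Binding": [["promoter", "resistance", "ligand"]],
}
toy_terms = ["xa21", "kinase", "resistance", "promoter"]

scores = tf_idf_scores(toy_databases, toy_terms)
ranked = rank_terms(scores, "Xoo")
print(ranked)
```

In this toy setting, a term that appears in every database (here "resistance") receives a score of zero, while a term concentrated in the “Xoo” database ranks first. This is the behavior Step 4 relies on when selecting important terms.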