Computational and Mathematical Methods in Medicine
Volume 2016 (2016), Article ID 1782732, 8 pages
http://dx.doi.org/10.1155/2016/1782732
Research Article

Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides

Wojciech Wieczorek1 and Olgierd Unold2

1Faculty of Computer Science and Materials Science, University of Silesia, Ulica Zytnia 12, 41-200 Sosnowiec, Poland
2Department of Computer Engineering, Faculty of Electronics, Wroclaw University of Science and Technology, Wybrzeże Wyspianskiego 27, 50-370 Wroclaw, Poland

Received 22 October 2015; Accepted 17 February 2016

Academic Editor: Humberto González-Díaz

Copyright © 2016 Wojciech Wieczorek and Olgierd Unold. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. C. de la Higuera, Grammatical Inference: Learning Automata and Grammars, Cambridge University Press, 2010.
  2. J. Tian, N. Wu, J. Guo, and Y. Fan, “Prediction of amyloid fibril-forming segments based on a support vector machine,” BMC Bioinformatics, vol. 10, supplement 1, article S45, 2009.
  3. S. Maurer-Stroh, M. Debulpaep, N. Kuemmerer et al., “Exploring the sequence determinants of amyloid structure using position-specific scoring matrices,” Nature Methods, vol. 7, no. 3, pp. 237–242, 2010.
  4. D. Angluin, An application of the theory of computational complexity to the study of inductive inference [Ph.D. thesis], University of California, Berkeley, Calif, USA, 1976.
  5. W. Wieczorek and O. Unold, “Induction of directed acyclic word graph in a bioinformatics task,” in Proceedings of the 12th International Conference of Grammatical Inference, vol. 34 of JMLR Workshop and Conference Proceedings, pp. 207–217, Kyoto, Japan, September 2014.
  6. H. Rulot and E. Vidal, “Modelling (sub)string length based constraints through a grammatical inference method,” in Pattern Recognition Theory and Applications, P. A. Devijver and J. Kittler, Eds., vol. 30 of NATO ASI Series, pp. 451–459, Springer, 1987.
  7. D. Angluin, “Inference of reversible languages,” Journal of the ACM, vol. 29, no. 3, pp. 741–765, 1982.
  8. P. Garcia, E. Vidal, and J. Oncina, Learning Locally Testable Languages in the Strict Sense, ALT, 1990.
  9. F. Coste and G. Kerbellec, “Learning automata on protein sequences,” in 7th Journées Ouvertes Biologie Informatique Mathématiques (JOBIM '06), pp. 199–210, Bordeaux, France, July 2006.
  10. Y. Sakakibara, “Grammatical inference in bioinformatics,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 7, pp. 1051–1062, 2005.
  11. D. B. Searls, “The language of genes,” Nature, vol. 420, no. 6912, pp. 211–217, 2002.
  12. E. Alpaydin, Introduction to Machine Learning, MIT Press, Cambridge, Mass, USA, 2nd edition, 2010.
  13. L. Polkowski and A. Skowron, Rough Sets in Knowledge Discovery 2: Applications, Case Studies and Software Systems, Physica, 1998.
  14. C. P. Jaroniec, C. E. MacPhee, V. S. Bajaj, M. T. McMahon, C. M. Dobson, and R. G. Griffin, “High-resolution molecular structure of a peptide in an amyloid fibril determined by magic angle spinning NMR spectroscopy,” Proceedings of the National Academy of Sciences of the United States of America, vol. 101, no. 3, pp. 711–716, 2004.
  15. V. N. Uversky and A. L. Fink, “Conformational constraints for amyloid fibrillation: the importance of being unfolded,” Biochimica et Biophysica Acta (BBA)—Proteins and Proteomics, vol. 1698, no. 2, pp. 131–153, 2004.
  16. M. J. Thompson, S. A. Sievers, J. Karanicolas, M. I. Ivanova, D. Baker, and D. Eisenberg, “The 3D profile method for identifying fibril-forming segments of proteins,” Proceedings of the National Academy of Sciences of the United States of America, vol. 103, no. 11, pp. 4074–4078, 2006.
  17. S. J. Hamodrakas, “Protein aggregation and amyloid fibril formation prediction software from primary sequence: towards controlling the formation of bacterial inclusion bodies,” The FEBS Journal, vol. 278, no. 14, pp. 2428–2435, 2011.
  18. J. Stanislawski, M. Kotulska, and O. Unold, “Machine learning methods can replace 3D profile method in classification of amyloidogenic hexapeptides,” BMC Bioinformatics, vol. 14, no. 1, article 21, 2013.
  19. O. Unold, “Fuzzy grammar-based prediction of amyloidogenic regions,” JMLR: Workshop and Conference Proceedings, vol. 21, pp. 210–219, 2012.
  20. O. Unold, “How to support prediction of amyloidogenic regions—the use of a GA-based wrapper feature selections,” in Proceedings of the 2nd International Conference on Advances in Information Mining and Management (IMMM '12), Venice, Italy, October 2012.
  21. B. Liu, W. Zhang, L. Jia, J. Wang, X. Zhao, and M. Yin, “Prediction of ‘aggregation-prone’ peptides with hybrid classification approach,” Mathematical Problems in Engineering, vol. 2015, Article ID 857325, 9 pages, 2015.
  22. K. J. Lang, “Random DFA's can be approximately learned from sparse uniform examples,” in Proceedings of the 5th Annual Workshop on Computational Learning Theory (COLT '92), pp. 45–52, ACM, Pittsburgh, Pa, USA, July 1992.
  23. K. J. Lang, B. A. Pearlmutter, and R. A. Price, “Results of the Abbadingo One DFA learning competition and a new evidence-driven state merging algorithm,” in Proceedings of the 4th International Colloquium on Grammatical Inference (ICGI '98), pp. 1–12, Springer, Ames, Iowa, USA, July 1998.
  24. K. J. Lang, “Merge Order count,” Tech. Rep., NECI, Montpelier, Vt, USA, 1997.
  25. Z. Solan, D. Horn, E. Ruppin, and S. Edelman, “Unsupervised learning of natural languages,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 33, pp. 11629–11634, 2005.
  26. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
  27. J. Beerten, J. Van Durme, R. Gallardo et al., “WALTZDB: a benchmark database of amyloidogenic hexapeptides,” Bioinformatics, vol. 31, no. 10, pp. 1698–1700, 2015.
  28. E. Tomita, A. Tanaka, and H. Takahashi, “The worst-case time complexity for generating all maximal cliques and computational experiments,” Theoretical Computer Science, vol. 363, no. 1, pp. 28–42, 2006.
  29. B. Trakhtenbrot and Y. Barzdin, Finite Automata: Behavior and Synthesis, North-Holland Publishing, 1973.
  30. F. Pedregosa, G. Varoquaux, A. Gramfort et al., “Scikit-learn: machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
  31. T. G. Dietterich, “Approximate statistical tests for comparing supervised classification learning algorithms,” Neural Computation, vol. 10, no. 7, pp. 1895–1923, 1998.
  32. S. Kotsiantis and P. Pintelas, “Combining bagging and boosting,” International Journal of Computational Intelligence, vol. 1, no. 4, pp. 324–333, 2004.
  33. R. Kohavi, “A study of cross-validation and bootstrap for accuracy estimation and model selection,” in Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI '95), vol. 2, pp. 1137–1143, Montreal, Canada, August 1995.
  34. U. M. Braga-Neto and E. R. Dougherty, “Is cross-validation valid for small-sample microarray classification?” Bioinformatics, vol. 20, no. 3, pp. 374–380, 2004.
  35. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, vol. 2, no. 1, Springer, Berlin, Germany, 2009.
  36. S. Arlot and A. Celisse, “A survey of cross-validation procedures for model selection,” Statistics Surveys, vol. 4, pp. 40–79, 2010.
  37. D. Krstajic, L. J. Buturovic, D. E. Leahy, and S. Thomas, “Cross-validation pitfalls when selecting and assessing regression and classification models,” Journal of Cheminformatics, vol. 6, no. 1, article 10, 2014.
  38. C. Nadeau and Y. Bengio, “Inference for the generalization error,” Machine Learning, vol. 52, no. 3, pp. 239–281, 2003.
  39. R. R. Bouckaert and E. Frank, “Evaluating the replicability of significance tests for comparing learning algorithms,” in Advances in Knowledge Discovery and Data Mining, H. Dai, R. Srikant, and C. Zhang, Eds., vol. 3056 of Lecture Notes in Computer Science, pp. 3–12, Springer, 2004.
  40. J. Demšar, “Statistical comparisons of classifiers over multiple data sets,” The Journal of Machine Learning Research, vol. 7, pp. 1–30, 2006.
  41. N. Japkowicz and M. Shah, Evaluating Learning Algorithms: A Classification Perspective, Cambridge University Press, Cambridge, UK, 2011.
  42. J. P. Romano, A. M. Shaikh, and M. Wolf, “Control of the false discovery rate under dependence using the bootstrap and subsampling,” TEST, vol. 17, no. 3, pp. 417–442, 2008.
  43. S. Holm, “A simple sequentially rejective multiple test procedure,” Scandinavian Journal of Statistics, vol. 6, no. 2, pp. 65–70, 1979.
  44. B. W. Matthews, “Comparison of the predicted and observed secondary structure of T4 phage lysozyme,” Biochimica et Biophysica Acta (BBA)—Protein Structure, vol. 405, no. 2, pp. 442–451, 1975.
  45. M. Emily, A. Talvas, and C. Delamarche, “MetAmyl: a METa-predictor for AMYLoid proteins,” PLoS ONE, vol. 8, no. 11, Article ID e79722, 2013.