Journal of Sensors
Volume 2015, Article ID 834217, 11 pages
http://dx.doi.org/10.1155/2015/834217
Research Article

Architecture and Implementation of a Scalable Sensor Data Storage and Analysis System Using Cloud Computing and Big Data Technologies

Computer Engineering Department, Firat University, 23100 Elazig, Turkey

Received 6 February 2015; Accepted 20 February 2015

Academic Editor: Sergiu Dan Stan

Copyright © 2015 Galip Aydin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. Comparison of Relational Database Systems, http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_system.
  2. G. Press, Internet of Things by the Numbers: Market Estimates and Forecasts, http://www.forbes.com/.
  3. K. Ashton, That “Internet of Things” Thing, 2015, http://www.rfidjournal.com/articles/view?4986.
  4. I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless sensor networks: a survey,” Computer Networks, vol. 38, no. 4, pp. 393–422, 2002.
  5. R. Cattell, “Scalable SQL and NoSQL data stores,” ACM SIGMOD Record, vol. 39, no. 4, pp. 12–27, 2010.
  6. M. Miler, D. Medak, and D. Odobašić, “Two-tier architecture for web mapping with NoSQL database couch DB,” Geospatial Crossroads GI Forum, vol. 11, pp. 62–71, 2011.
  7. J. S. van der Veen, B. van der Waaij, and R. J. Meijer, “Sensor data storage performance: SQL or NoSQL, physical or virtual,” in Proceedings of the IEEE 5th International Conference on Cloud Computing (CLOUD '12), pp. 431–438, IEEE, June 2012.
  8. J. Gantz and D. Reinsel, Extracting Value from Chaos State of the Universe, IDC (International Data Corporation), 2011.
  9. IEEE XPLORE, “Year in Review: Top Search Terms in IEEE Xplore,” http://ieeexplore.ieee.org/Xplore/.
  10. A. Katal, M. Wazid, and R. H. Goudar, “Big data: issues, challenges, tools and good practices,” in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, Noida, India, August 2013.
  11. J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
  12. Official Hadoop Web Site, 2015, http://hadoop.apache.org/.
  13. C. Sweeney, L. Liu, S. Arietta, and J. Lawrence, HIPI: a Hadoop Image Processing Interface for Image-Based MapReduce Tasks [B.S. thesis], University of Virginia, Charlottesville, Va, USA, 2011.
  14. T. White, Hadoop: The Definitive Guide, O'Reilly Media, 2009.
  15. D. Borthakur, HDFS Architecture Guide, Hadoop Apache Project, 2008.
  16. C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, “Pig latin: a not-so-foreign language for data processing,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 1099–1110, ACM, June 2008.
  17. L. George, HBase: The Definitive Guide, O'Reilly Media, 2011.
  18. A. Thusoo, J. S. Sarma, N. Jain et al., “Hive: a warehousing solution over a map-reduce framework,” Proceedings of the VLDB Endowment, vol. 2, no. 2, pp. 1626–1629, 2009.
  19. S. Owen, R. Anil, T. Dunning, and E. Friedman, Mahout in Action, Manning Publications, 2011.
  20. M. Nemschoff, Big Data: 5 Major Advantages of Hadoop, http://www.itproportal.com/.
  21. M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: cluster computing with working sets,” in Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, 2010.
  22. Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst, “HaLoop: efficient iterative data processing on large clusters,” Proceedings of the VLDB Endowment, vol. 3, no. 1-2, pp. 285–296, 2010.
  23. J. Ekanayake, H. Li, B. Zhang et al., “Twister: a runtime for iterative mapreduce,” in Proceedings of the ACM International Symposium on High Performance Distributed Computing (HPDC '10), pp. 810–818, ACM, June 2010.
  24. S. Madden, “From databases to big data,” IEEE Internet Computing, vol. 16, no. 3, pp. 4–6, 2012.
  25. D. Kourtesis, J. M. Alvarez-Rodríguez, and I. Paraskakis, “Semantic-based QoS management in cloud systems: current status and future challenges,” Future Generation Computer Systems, vol. 32, no. 1, pp. 307–323, 2014.
  26. S. Ghemawat, H. Gobioff, and S. T. Leung, “The google file system,” in Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP '03), pp. 29–43, October 2003.
  27. A. Bialecki, M. Cafarella, D. Cutting, and O. O'Malley, Hadoop: A Framework for Running Applications on Large Clusters Built of Commodity Hardware, Wiki, 2005, http://lucene.apache.org/hadoop.
  28. OpenStack, 2015, http://www.openstack.org.
  29. OpenNebula Web page, 2015, http://www.opennebula.org.
  30. Eucalyptus, 2015, https://www.eucalyptus.com/eucalyptus-cloud/iaas.
  31. T. Gunarathne, T.-L. Wu, J. Qiu, and G. Fox, “MapReduce in the clouds for science,” in Proceedings of the 2nd IEEE International Conference on Cloud Computing Technology and Science (CloudCom '10), pp. 565–572, IEEE, December 2010.
  32. Amazon Web Services, 2015, http://aws.amazon.com.
  33. RapidMiner Predictive Analysis, 2015, https://rapidminer.com/.
  34. G. Holmes, A. Donkin, and I. H. Witten, “Weka: a machine learning workbench,” in Proceedings of the 2nd Australian and New Zealand Conference on Intelligent Information Systems, pp. 357–361, Brisbane, Australia, December 1994.
  35. Apache Mahout, “Scalable machine-learning and data-mining library,” http://mahout.apache.org/.
  36. K. Ericson and S. Pallickara, “On the performance of high dimensional data clustering and classification algorithms,” Future Generation Computer Systems, vol. 29, no. 4, pp. 1024–1034, 2013.
  37. R. M. Esteves, R. Pais, and C. Rong, “K-means clustering in the cloud—a Mahout test,” in Proceedings of the IEEE Workshops of International Conference on Advanced Information Networking and Applications (WAINA '11), pp. 514–519, IEEE, 2011.
  38. Spark MLLib scalable machine learning library, https://spark.apache.org/mllib/.
  39. M. Zaharia, M. Chowdhury, T. Das et al., “Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI '12), USENIX Association, 2012.
  40. S. Shahrivari, “Beyond batch processing: towards real-time and streaming big data,” Computers, vol. 3, no. 4, pp. 117–129, 2014.
  41. H. Wang, B. Wu, S. Yang, and B. Wang, “Research of decision tree on YARN using MapReduce and spark,” in Proceedings of the 2014 World Congress in Computer Science, Computer Engineering, and Applied Computing, Las Vegas, Nev, USA, 2014.
  42. D. Lawson, Alternating Direction Method of Multipliers Implementation Using Apache Spark, 2014.
  43. C.-Y. Lin, C.-H. Tsai, C.-P. Lee, and C.-J. Lin, “Large-scale logistic regression and linear support vector machines using spark,” in Proceedings of the IEEE International Conference on Big Data, pp. 519–528, Washington, DC, USA, October 2014.
  44. F. Liang, C. Feng, X. Lu, and Z. Xu, “Performance benefits of DataMPI: a case study with BigDataBench,” in Big Data Benchmarks, Performance Optimization, and Emerging Hardware, vol. 8807 of Lecture Notes in Computer Science, pp. 111–123, Springer International Publishing, Cham, Switzerland, 2014.
  45. Wikipedia, “Global Positioning System,” http://en.wikipedia.org/wiki/Global_Positioning_System.
  46. Yonca CBS, “Naviskop Vehicle Tracking Systems,” 2015, http://www.naviskop.com/.
  47. J. Han, K. Koperski, and N. Stefanovic, “GeoMiner: a system prototype for spatial data mining,” ACM SIGMOD Record, vol. 26, no. 2, pp. 553–556, 1997.
  48. C. J. Moran and E. N. Bui, “Spatial data mining for enhanced soil map modelling,” International Journal of Geographical Information Science, vol. 16, no. 6, pp. 533–549, 2002.
  49. R. T. Ng and J. Han, “Clarans: a method for clustering objects for spatial data mining,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 5, pp. 1003–1016, 2002.
  50. S. Shekhar, P. Zhang, and Y. Huang, “Spatial data mining,” in Data Mining and Knowledge Discovery Handbook, pp. 833–851, Springer, 2005.
  51. Quick Server, February 2015, http://www.quickserver.org/.
  52. S. K. Divakar Mysore and S. Jain, Big Data Architecture and Patterns, Part 1: Introduction to Big Data Classification and Architecture, IBM Big Data and Analytics, Technical Library, 2013.
  53. P. Membrey, E. Plugge, and D. Hawkins, The Definitive Guide to MongoDB: the noSQL Database for Cloud and Desktop Computing, Apress, 2010.
  54. A. Boicea, F. Radulescu, and L. I. Agapin, “MongoDB vs oracle-database comparison,” in Proceedings of the 3rd International Conference on Emerging Intelligent Data and Web Technologies (EIDWT '12), pp. 330–335, September 2012.
  55. E. Dede, M. Govindaraju, D. Gunter, R. S. Canon, and L. Ramakrishnan, “Performance evaluation of a MongoDB and Hadoop platform for scientific data analysis,” in Proceedings of the 4th ACM Workshop on Scientific Cloud Computing (ScienceCloud '13), pp. 13–20, ACM, June 2013.
  56. Y. Liu, Y. Wang, and Y. Jin, “Research on the improvement of MongoDB Auto-Sharding in cloud environment,” in Proceedings of the 7th International Conference on Computer Science & Education (ICCSE '12), pp. 851–854, IEEE, Melbourne, Australia, July 2012.
  57. Z. Parker, S. Poe, and S. V. Vrbsky, “Comparing nosql mongodb to an sql db,” in Proceedings of the 51st ACM Southeast Conference, ACM, April 2013.
  58. Z. Wei-Ping, L. Ming-Xin, and C. Huan, “Using MongoDB to implement textbook management system instead of MySQL,” in Proceedings of the IEEE 3rd International Conference on Communication Software and Networks (ICCSN '11), pp. 303–305, IEEE, May 2011.
  59. K. Jackson, OpenStack Cloud Computing Cookbook, Packt Publishing Ltd., 2012.
  60. O. Sefraoui, M. Aissaoui, and M. Eleuldj, “OpenStack: toward an open-source solution for cloud computing,” International Journal of Computer Applications, vol. 55, no. 3, pp. 38–42, 2012.
  61. C. P. Chen and C.-Y. Zhang, “Data-intensive applications, challenges, techniques and technologies: a survey on Big Data,” Information Sciences, vol. 275, pp. 314–347, 2014.
  62. S. Gao, L. Li, W. Li, K. Janowicz, and Y. Zhang, “Constructing gazetteers from volunteered Big Geo-Data based on Hadoop,” Computers, Environment and Urban Systems, 2014.
  63. S. Brooker, S. Clarke, J. K. Njagi et al., “Spatial clustering of malaria and associated risk factors during an epidemic in a highland area of western Kenya,” Tropical Medicine and International Health, vol. 9, no. 7, pp. 757–766, 2004.
  64. T. Cheng, J. Haworth, B. Anbaroglu, G. Tanaksaranond, and J. Wang, “Spatiotemporal data mining,” in Handbook of Regional Science, pp. 1173–1193, Springer, Berlin, Germany, 2014.
  65. S. Wang and H. Yuan, “Spatial data mining: a perspective of big data,” International Journal of Data Warehousing and Mining, vol. 10, no. 4, pp. 50–70, 2014.
  66. Y. J. Akhila, A. Naik, B. Hegde, P. Shetty, and A. J. K. Mohan, “SD miner—a spatial data mining system,” International Journal of Research, vol. 1, no. 5, pp. 563–567, 2014.
  67. R. Sharma, M. A. Alam, and A. Rani, “K-means clustering in spatial data mining using weka interface,” International Journal of Computer Applications, pp. 26–30, 2012, Proceedings of the International Conference on Advances in Communication and Computing Technologies (ICACACT '12).
  68. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: an update,” ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.