Mobile Information Systems
Volume 2016 (2016), Article ID 1784101, 10 pages
Research Article

An Analysis of Audio Features to Develop a Human Activity Recognition Model Using Genetic Algorithms, Random Forests, and Neural Networks

1Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147 Centro, 98000 Zacatecas, ZAC, Mexico
2Instituto Tecnológico Superior Zacatecas Sur, Av. Tecnológico 100, Las Moritas, 99700 Tlaltenango, ZAC, Mexico
3Unidad Académica de Medicina Humana y Ciencias de la Salud, Universidad Autónoma de Zacatecas, Jardín Juarez 147 Centro, 98000 Zacatecas, ZAC, Mexico

Received 1 June 2016; Revised 31 August 2016; Accepted 29 September 2016

Academic Editor: Daniele Riboni

Copyright © 2016 Carlos E. Galván-Tejada et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


This work presents a human activity recognition (HAR) model based on audio features. Using sound as an information source for HAR models is challenging because sound wave analysis generates very large amounts of data. However, feature selection techniques can reduce the amount of data required to represent an audio signal sample. The audio features analyzed include Mel-frequency cepstral coefficients (MFCC). Although MFCC are commonly used in voice and instrument recognition, their utility within HAR models had yet to be confirmed; this work validates their usefulness. Additionally, statistical features were extracted from the audio samples to generate the proposed HAR model. The amount of information needed to build a HAR model directly affects the accuracy of the model. This problem was also tackled in the present work; our results indicate that the proposed HAR model can recognize a human activity with an accuracy of 85%. This implies a low computational cost, allowing portable devices to identify human activities using audio as an information source.
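To make the idea of statistical audio features concrete, the following is a minimal sketch of how a clip of raw amplitudes could be reduced to a handful of summary values. It is an illustration only: the function name `statistical_features`, the particular statistics chosen, and the synthetic sine-wave clip are assumptions, not the feature set or data used in this work, and MFCC extraction is not shown.

```python
import math
import statistics


def statistical_features(samples):
    """Summarize one audio clip (a sequence of amplitudes) as a few statistics.

    Hypothetical helper: the chosen statistics are illustrative assumptions.
    Expects at least two samples.
    """
    n = len(samples)
    # Root-mean-square energy of the clip.
    rms = math.sqrt(sum(x * x for x in samples) / n)
    # Number of adjacent sample pairs whose sign differs.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return {
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
        "rms": rms,
        "zero_crossing_rate": crossings / (n - 1),
        "max": max(samples),
        "min": min(samples),
    }


# Synthetic stand-in for a recording: one second of a 5 Hz sine at 1 kHz.
sample_rate = 1000
clip = [math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]
features = statistical_features(clip)
```

Each clip of 1000 samples is reduced here to six numbers, which is the kind of data reduction that makes a classifier inexpensive to run; a feature selection step would then keep only the subset of such values that contributes to recognition accuracy.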