Wireless Communications and Mobile Computing
Volume 2017 (2017), Article ID 9474806, 16 pages
Research Article

Vision-Based Fall Detection with Convolutional Neural Networks

1DeustoTech, University of Deusto, Avenida de las Universidades, No. 24, 48007 Bilbao, Spain
2Department of Computer Science and Artificial Intelligence, Basque Country University, P. Manuel Lardizabal 1, 20018 San Sebastian, Spain
3Ikerbasque, Basque Foundation for Science, Maria Diaz de Haro 3, 48013 Bilbao, Spain
4Donostia International Physics Center (DIPC), P. Manuel Lardizabal 4, 20018 San Sebastian, Spain

Correspondence should be addressed to Adrián Núñez-Marcos

Received 14 July 2017; Revised 26 September 2017; Accepted 9 November 2017; Published 6 December 2017

Academic Editor: Wiebren Zijlstra

Copyright © 2017 Adrián Núñez-Marcos et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


One of the biggest challenges in modern societies is to promote healthy aging and to support older persons in their daily activities. In particular, given its social and economic impact, the automatic detection of falls has attracted considerable attention in the computer vision and pattern recognition communities. Although approaches based on wearable sensors achieve high detection rates, some potential users are reluctant to wear them, so their adoption is not yet widespread. As a consequence, alternative approaches such as vision-based methods have emerged. We believe that the emergence of the Smart Environment and Internet of Things paradigms, together with the growing number of cameras in everyday environments, creates an ideal context for vision-based systems. Consequently, we propose a vision-based solution using Convolutional Neural Networks to decide whether a sequence of frames contains a person falling. To model video motion and make the system independent of the scenario, we use optical flow images as input to the networks, followed by a novel three-step training phase. Our method is evaluated on three public datasets and achieves state-of-the-art results on all of them.
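The abstract's idea of feeding optical flow to a CNN is commonly realized by stacking the horizontal and vertical flow components of several consecutive frame pairs into a single multichannel input. The following is a minimal sketch of that stacking step, assuming the flow fields themselves have already been computed by some dense optical-flow algorithm; the function name, the stack size L = 10, and the 224×224 resolution are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def stack_flow(flows_x, flows_y):
    """Stack L horizontal and L vertical flow images into a
    2L-channel array suitable as CNN input (channels-first)."""
    channels = []
    for fx, fy in zip(flows_x, flows_y):
        channels.append(fx)  # horizontal component of one frame pair
        channels.append(fy)  # vertical component of the same pair
    return np.stack(channels, axis=0)  # shape: (2L, H, W)

# Toy example: L = 10 dummy flow fields at 224x224 resolution.
L, H, W = 10, 224, 224
fx = [np.zeros((H, W), dtype=np.float32) for _ in range(L)]
fy = [np.zeros((H, W), dtype=np.float32) for _ in range(L)]
x = stack_flow(fx, fy)
print(x.shape)  # (20, 224, 224)
```

Stacking both components of consecutive flow fields lets the first convolutional layer see short-term motion patterns across the whole window, rather than a single displacement snapshot.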