Advances in Statistics
Volume 2014 (2014), Article ID 502678, 19 pages
http://dx.doi.org/10.1155/2014/502678
Review Article

Entering the Era of Data Science: Targeted Learning and the Integration of Statistics and Computational Data Analysis

Mark J. van der Laan (1) and Richard J. C. M. Starmans (2)

(1) University of California, Berkeley, 108 Haviland Hall, Berkeley, CA 94720-7360, USA
(2) Department of Computer Science, Utrecht University, The Netherlands

Received 16 February 2014; Revised 9 July 2014; Accepted 10 July 2014; Published 10 September 2014

Academic Editor: Chin-Shang Li

Copyright © 2014 Mark J. van der Laan and Richard J. C. M. Starmans. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This outlook paper reviews the research of van der Laan’s group on Targeted Learning, a subfield of statistics concerned with constructing data-adaptive estimators of user-supplied target parameters of the probability distribution of the data, together with corresponding confidence intervals, while relying only on realistic statistical assumptions. Targeted Learning makes full use of state-of-the-art machine learning tools while preserving the identity of statistics as a field concerned with both accurate estimation of the true target parameter value and assessment of uncertainty, so that sound statistical conclusions can be drawn. We also provide a philosophical and historical perspective on Targeted Learning and relate it to new developments in Big Data. We conclude with remarks on the immediate relevance of Targeted Learning to the current Big Data movement.