Computational Intelligence and Neuroscience

Volume 2016, Article ID 6156513, 9 pages

http://dx.doi.org/10.1155/2016/6156513

## Neural Networks Technique for Filling Gaps in Satellite Measurements: Application to Ocean Color Observations

NOAA Center for Weather and Climate Prediction, 5830 University Research Court, College Park, MD 20740, USA

Received 6 August 2015; Revised 23 October 2015; Accepted 26 October 2015

Academic Editor: José Alfredo Hernandez

Copyright © 2016 Vladimir Krasnopolsky et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A neural network (NN) technique to fill gaps in satellite data is introduced, linking satellite-derived fields of interest with other satellite and *in situ* physical observations. Satellite-derived “ocean color” (OC) data are used in this study because OC variability is primarily driven by biological processes that are related to, and correlated with, the physical processes of the upper ocean through complex, nonlinear relationships. Specifically, ocean color chlorophyll-a fields from NOAA’s operational Visible Infrared Imaging Radiometer Suite (VIIRS) are used, together with NOAA and NASA ocean surface and upper-ocean observations, which serve as signatures of upper-ocean dynamics. An NN transfer function is trained using global data for two years (2012 and 2013) and tested on independent data for 2014. To reduce the impact of noise in the data and to calculate a stable NN Jacobian for sensitivity studies, an ensemble of NNs with different weights is constructed and compared with a single NN. The impact of the NN training period on the NN’s generalization ability is evaluated. The NN technique provides an accurate and computationally cheap method for filling in gaps in satellite ocean color observation fields and time series.

#### 1. Introduction

The number of successful neural network (NN) applications in satellite remote sensing, meteorology, and oceanography has steadily increased over the last two decades [1, 2]. In these fields, NNs have been applied to address such problems as classification, feature extraction and tracking, pattern recognition, change detection, forward and inverse problems, and so forth. NNs also have been used to fill data gaps in measurement time series [3–5], while Peres et al. [6] used NNs to extend observation records.

NNs have also been applied in satellite remote sensing of ocean color (OC) (see references below). The “color” of the ocean is determined by the interactions of incident light with substances or particles present in the water because suspended particles will increase the scattering of light. In coastal areas, material in river runoff, resuspension of bottom material (sand, silt) by tides, waves, and storms, as well as biologically active components in the water column can change the color of near-shore waters. These components contain substances that absorb certain wavelengths of light, altering the optical signature. For example, microscopic marine algae (phytoplankton) have the capacity to absorb light in the blue and red regions of the spectrum due to the chlorophyll which enables photosynthesis. The underlying principle for remote sensing of ocean color is that water with higher concentrations of phytoplankton (chlorophyll) is greener, while water with lower concentrations of phytoplankton is bluer [7]. Ocean color data are a vital resource for operational forecasting, oceanographic research, and earth sciences, along with a wide variety of related applications [8].

Satellite-derived OC fields are essential for numerical prediction applications, enabling numerical models to address a biophysical feedback process that is particularly important to coupled ocean-atmosphere modeling. As a new capability, integrating/assimilating near-real-time OC data into numerical ocean modeling improves forecast accuracy; consequently, a robust method needs to be developed for filling data gaps and providing the projected OC values needed to run the model into the future for predictions. The assimilation of OC data also drives/constrains the modeling of physical-biogeochemical processes that are the foundation for ecological forecasting; however, such models require continuous (gap-free) satellite ocean color fields for development, initialization, and data assimilation [8]. Thus, this work contributes a critical foundation for physical and biogeochemical modeling by providing continuous ocean color data and projected values. For the past three years, NOAA has been developing the capability to use near-real-time Visible Infrared Imaging Radiometer Suite (VIIRS) and other OC fields for its operational ocean [9, 10] and coupled seasonal forecast systems [11].

Multiple NN applications have been developed to solve forward and inverse problems in satellite ocean color remote sensing [12, and references therein]. NNs also have been applied to merging OC information from multiple satellite missions [13]. In this work, we developed a new NN approach, which allows gaps (spatial and temporal) in satellite-derived OC fields to be filled using physically related, but independently derived, satellite and *in situ* observations which provide physical information about the state of the upper layer of the ocean.

#### 2. Methodology

##### 2.1. Formulation of the Problem

Chlorophyll-a (Chl-a) concentration, a biological proxy for the intensity of photosynthesis derivable from ocean color observations, is affected by processes in the upper layers of the ocean at various spatial and temporal scales. Physical parameters characterizing the state of the ocean surface and upper mixed layer (temperature, salinity, and density) define the active physical background for associated biological processes; thus, variability of the physical background is responsible for a significant portion of the variability of entrained biological parameters. Accordingly, we can consider the OC, $Y$ (in this case, the single parameter Chl-a), as a function or mapping of a vector of the ocean surface and upper mixed-layer state variables, $X$. This mapping can be symbolically written as

$$Y = M(X), \quad X \in \Re^{n}, \; Y \in \Re^{m}, \tag{1}$$

where $M$ denotes the mapping, $n$ is the dimensionality of the input space, and $m$ is the dimensionality of the output space.

This function/mapping is expected to be a complex nonlinear function because the variability of the physical parameters is transferred into the OC variability through a complex hierarchy of physical, chemical, and biological processes. Also, both the OC and ocean state data have finite spatial and temporal resolutions (provided on a grid with limited spatial resolution and averaged to daily temporal resolution); consequently, the physical and biological variability on scales finer than these resolutions appears as stochastic contributions to the OC, $Y$. Thus, the mapping between the OC, $Y$, and physical ocean variables, $X$, is a complex, nonlinear stochastic mapping,

$$Y = M(X, \varepsilon). \tag{2}$$

The stochastic variable $\varepsilon$ represents an uncertainty introduced into the OC, $Y$, due to unaccounted high-frequency, small-scale (subgrid) variability of physical, chemical, and biological processes. Also, all or some of the variables constituting vectors $X$ and $Y$ are observations, which have different levels of noise. This noise also contributes to the stochastic variable $\varepsilon$. Assuming that the stochastic part of the mapping is additive, representation (2) can be simplified,

$$Y = M(X) + \varepsilon. \tag{3}$$

It is noteworthy that the uncertainty $\varepsilon$ is an inherent informative part of the stochastic mapping, containing important statistical information about the mapping. *Actually, the stochastic mapping is a family of mappings distributed with a distribution function. The range and shape of the distribution function are determined by the uncertainty vector* $\varepsilon$.

##### 2.2. NN Emulation for the OC Mapping

Neural networks are very generic, accurate, and convenient mathematical models that emulate complicated nonlinear input/output relationships through statistical learning algorithms [14]. NNs can be applied to any problem that can be formulated as a mapping (input vector versus output vector dependence). The multilayer perceptron (MLP) with one hidden layer is a generic tool for approximating such mappings [2]. The simplest MLP NN analytical approximations use a family of functions like

$$y_q = a_{q0} + \sum_{j=1}^{k} a_{qj} \tanh\Bigl(b_{j0} + \sum_{i=1}^{n} b_{ji} x_i\Bigr), \quad q = 1, \ldots, m, \tag{4}$$

where $x_i$ and $y_q$ are components of the input and output vectors $X$ and $Y$, respectively, $a$ and $b$ are NN weights, and $k$ is the number of hidden neurons. Here, the hyperbolic tangent is used as an activation function. Equation (4) is also a mapping, which can approximate any continuous or almost continuous (with finite discontinuities) mapping [2, 15]. Symbolically, it can be represented as $Y = NN(X)$.
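As a minimal sketch of the one-hidden-layer MLP described above (an output bias plus a weighted sum of hyperbolic tangents of biased, weighted input sums), the following plain-Python function evaluates such a network. The function name `mlp_forward`, the row layout of the weight lists, and the numerical weights are illustrative assumptions, not values from this study.

```python
import math

def mlp_forward(x, a, b):
    """Evaluate a one-hidden-layer MLP with tanh activation.

    x: list of n inputs.
    b: k rows [b_j0, b_j1, ..., b_jn] (hidden bias followed by input weights).
    a: m rows [a_q0, a_q1, ..., a_qk] (output bias followed by hidden weights).
    Returns a list of m outputs.
    """
    hidden = [math.tanh(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
              for row in b]
    return [row[0] + sum(w * h for w, h in zip(row[1:], hidden)) for row in a]

# Hypothetical weights: n = 2 inputs, k = 2 hidden neurons, m = 1 output.
b = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
a = [[0.5, 2.0, -1.0]]

y_origin = mlp_forward([0.0, 0.0], a, b)  # both hidden neurons output tanh(0) = 0
y_shifted = mlp_forward([1.0, 0.0], a, b)
```

At the origin both hidden neurons are inactive, so the output reduces to the output bias; moving along the first input activates only the first hidden neuron.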

To train the NN that is emulating the mapping (1), an error function, $E$, is created,

$$E(a, b) = \frac{1}{N} \sum_{t=1}^{N} \bigl[Y_t - NN(X_t)\bigr]^{2}, \tag{5}$$

where the sum runs over the $N$ records of the training set, and minimized to find an optimal set of coefficients $a$ and $b$. However, for stochastic mapping (3), the training criterion should be modified as

$$E(a, b) \le \|\varepsilon\|^{2}. \tag{6}$$

A single NN does not provide an adequate emulation/approximation of the stochastic mapping (3); therefore, an ensemble of NNs should be trained using criterion (6) [2]. If each NN member of the ensemble satisfies condition (6), this ensemble provides an adequate approximation for the stochastic mapping (3). Thus, to effectively account for subgrid scale effects and to reduce the impact of noise in NN simulated data (e.g., the Chl-a concentration), an ensemble of NNs was trained using criterion (6) and the average of this ensemble was used as the simulated OC, $\bar{Y}$. To produce the ensemble, a slightly different initialization of the NN weights, $a$ and $b$, was chosen at the start of training for each NN ensemble member; thus, different NN ensemble members correspond to different local minima of the error function (5), all satisfying condition (6). This simplest approach was selected because the data have a significant level of uncertainty/noise. The magnitude of the uncertainty estimated in Section 4.1.2 (Table 2) shows that the basic ensemble approach allows us to obtain an approximation error of magnitude close to the magnitude of uncertainty in the data. In our opinion, this result shows that, consistent with the parsimony principle, the use of more sophisticated approaches is not justified in this case.
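The ensemble procedure described above can be sketched on synthetic data. The example below is an illustration only: the sine-plus-noise data set, the batch gradient descent optimizer, the learning rate, the ensemble size, and the stopping rule (stop once the mean squared error reaches an assumed noise variance) are all assumptions made for this sketch, not the training setup used in this study. Each member starts from a different random initialization, and the ensemble output is the plain average of the member outputs.

```python
import math
import random

def predict(x, w):
    """One-input, one-output MLP with k tanh hidden neurons."""
    a0, a, b0, b = w
    return a0 + sum(aj * math.tanh(b0j + bj * x) for aj, b0j, bj in zip(a, b0, b))

def mse(data, w):
    """Mean squared error of network w over (x, y) pairs."""
    return sum((y - predict(x, w)) ** 2 for x, y in data) / len(data)

def train_member(data, k, seed, target_mse, lr=0.05, max_epochs=1000):
    """Batch gradient descent from a seed-dependent random initialization.

    Training stops as soon as the error reaches target_mse (assumed noise level)
    or after max_epochs, whichever comes first.
    """
    rng = random.Random(seed)
    a0 = 0.0
    a = [rng.uniform(-0.5, 0.5) for _ in range(k)]
    b0 = [rng.uniform(-0.5, 0.5) for _ in range(k)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(k)]
    n = len(data)
    for _ in range(max_epochs):
        if mse(data, (a0, a, b0, b)) <= target_mse:
            break
        g_a0, g_a, g_b0, g_b = 0.0, [0.0] * k, [0.0] * k, [0.0] * k
        for x, y in data:
            h = [math.tanh(b0[j] + b[j] * x) for j in range(k)]
            err = a0 + sum(a[j] * h[j] for j in range(k)) - y
            g_a0 += 2.0 * err / n
            for j in range(k):
                dh = 1.0 - h[j] ** 2  # derivative of tanh
                g_a[j] += 2.0 * err * h[j] / n
                g_b0[j] += 2.0 * err * a[j] * dh / n
                g_b[j] += 2.0 * err * a[j] * dh * x / n
        a0 -= lr * g_a0
        for j in range(k):
            a[j] -= lr * g_a[j]
            b0[j] -= lr * g_b0[j]
            b[j] -= lr * g_b[j]
    return (a0, a, b0, b)

def ensemble_predict(x, members):
    """Ensemble output: average of the member outputs."""
    return sum(predict(x, w) for w in members) / len(members)

# Synthetic stand-in data (assumption): a smooth signal plus Gaussian noise.
SIGMA = 0.1
noise = random.Random(0)
data = [(x, math.sin(x) + noise.gauss(0.0, SIGMA))
        for x in (-2.0 + 4.0 * i / 39 for i in range(40))]

# Five members that differ only in their initial weights.
members = [train_member(data, k=3, seed=s, target_mse=SIGMA ** 2) for s in range(5)]
simulated = ensemble_predict(0.5, members)
```

Because the members differ only in their starting weights, they generally end in different weight configurations; averaging their outputs damps the member-to-member scatter.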

The ensemble was also used to improve the stability of the NN Jacobian [16], which is used below for a sensitivity study. The Jacobian of each $e$th NN ensemble member (an $m \times n$ matrix of the first derivatives of the NN outputs over the inputs),

$$J^{(e)} = \left\{ \frac{\partial y_q}{\partial x_p} \right\}^{(e)}, \quad q = 1, \ldots, m, \; p = 1, \ldots, n, \tag{7}$$

was calculated, and the member Jacobians were then averaged to obtain the mean Jacobian used for the sensitivity study below. Formally speaking, the Jacobian of the MLP NN (4) can be easily calculated using direct differentiation,

$$\frac{\partial y_q}{\partial x_p} = \sum_{j=1}^{k} a_{qj}\, b_{jp} \left[ 1 - \tanh^{2}\Bigl(b_{j0} + \sum_{i=1}^{n} b_{ji} x_i\Bigr) \right]. \tag{8}$$

However, calculating the derivative of any statistical model (including an NN) is an ill-posed problem [2], which should be regularized. As shown by Krasnopolsky [16], the problem can be solved using an NN ensemble and calculating the Jacobian as an average of the Jacobians of the NN ensemble members. This approach is used in this effort.
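The Jacobian averaging described above can be illustrated with a short sketch. It differentiates a one-hidden-layer tanh MLP analytically (each output derivative is a sum over hidden neurons of output weight times the tanh derivative times input weight) and averages the resulting matrices over ensemble members. The function names, the weight layout, and the two hypothetical members are assumptions for illustration only.

```python
import math

def mlp_jacobian(x, a, b):
    """Analytic Jacobian (m x n) of a one-hidden-layer tanh MLP at input x.

    b: k rows [b_j0, b_j1, ..., b_jn]; a: m rows [a_q0, a_q1, ..., a_qk].
    J[q][p] = sum_j a_qj * (1 - tanh(s_j)^2) * b_jp, with s_j the hidden pre-activation.
    """
    s = [row[0] + sum(w * xi for w, xi in zip(row[1:], x)) for row in b]
    d = [1.0 - math.tanh(sj) ** 2 for sj in s]
    n, k = len(x), len(b)
    return [[sum(row[j + 1] * d[j] * b[j][p + 1] for j in range(k))
             for p in range(n)] for row in a]

def mean_jacobian(x, members):
    """Element-wise average of the member Jacobians at input x."""
    jacs = [mlp_jacobian(x, a, b) for a, b in members]
    m, n = len(jacs[0]), len(jacs[0][0])
    return [[sum(J[q][p] for J in jacs) / len(jacs) for p in range(n)]
            for q in range(m)]

# Two hypothetical ensemble members sharing hidden weights but not output weights.
b = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
members = [([[0.5, 2.0, -1.0]], b),
           ([[0.0, 4.0, -3.0]], b)]

j_first = mlp_jacobian([0.0, 0.0], *members[0])
j_mean = mean_jacobian([0.0, 0.0], members)
```

In practice each member would come from a separately trained network; the averaging step itself is unchanged.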

##### 2.3. Selecting NN Inputs and Outputs

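Selecting the emulating NN architecture includes selecting NN inputs, NN outputs, and the number of hidden neurons, $k$. For this study, we selected one output, the chlorophyll-a concentration. The vector of inputs, $X$, was composed of two parts,

$$X = \{X_{\mathrm{phys}}, X_{\mathrm{meta}}\}, \tag{9}$$

where $X_{\mathrm{phys}}$ is a vector of physical parameters, which includes satellite sea-surface elevation (SSH), sea-surface salinity (SSS), and sea-surface temperature (SST) and *in situ* Argo salinity (sal) and temperature (temp) vertical profiles. It can be expressed as

$$X_{\mathrm{phys}} = \{\mathrm{SSH}, \mathrm{SSS}, \mathrm{SST}, \mathrm{sal}_1, \ldots, \mathrm{sal}_7, \mathrm{temp}_1, \ldots, \mathrm{temp}_7\}, \tag{10}$$

and vector $X_{\mathrm{meta}}$ is a vector of auxiliary or meta variables or tracers configured as

$$X_{\mathrm{meta}} = \{\mathrm{yr}, \cos\tau, \sin\tau, \cos(\mathrm{lon}), \sin(\mathrm{lon}), \mathrm{lat}\}, \tag{11}$$

where yr is the year, $\tau = 2\pi \cdot \mathrm{day}/365$ ($\mathrm{day}$ equals the day of the year), and lon and lat are, respectively, longitude and latitude in radians.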

Metadata are included in the input vector to permit training a single NN (or single ensemble of NNs) that, given the input for a particular location on the globe (lat and lon) at a particular moment in time (yr and the day of the year), provides output (Chl-a concentration) for the same location and time. This NN is trained using records collected over several years at locations representing the entire globe. The trained NN (or NN ensemble), using the same weights, is then used for the entire globe for a long period subsequent to the training interval. Hence, each trained NN (a single NN or NN ensemble member) takes information from one grid point and produces a simulated value of OC (Chl-a) for that grid point at the corresponding time. The same single NN or NN ensemble then moves to the next grid point of the global grid, producing, in this way, a global field of OC (Chl-a). The results presented below are obtained with the input vector comprising the vector of metadata (11), including all metadata variables, and the vector of physical parameters (10), including three surface variables plus variables representing seven upper layers of sal and temp from Argo profiles. Thus, in this study, the NNs emulating the OC mapping each have 23 inputs and 1 output. Table 1 lists these variables and their units, as well as the output parameter.
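The point-by-point application described above can be sketched as follows. The text fixes the input count at 23 (three surface variables, seven layers each of salinity and temperature, plus metadata), which leaves six metadata components; the particular encoding used here (year, cosine and sine of the day-of-year phase with a 365-day period, cosine and sine of longitude, and latitude) is an assumption made for illustration, as are the function names, the argument order, and the treatment of ensemble members as plain callables.

```python
import math

N_LAYERS = 7  # upper-ocean layers taken from each Argo profile

def build_input(yr, day, lon_deg, lat_deg, ssh, sss, sst, sal, temp):
    """Assemble one 23-component input record for a single grid point and day.

    The periodic encoding of day and longitude is an assumed, illustrative choice.
    """
    if len(sal) != N_LAYERS or len(temp) != N_LAYERS:
        raise ValueError("expected %d salinity and temperature layers" % N_LAYERS)
    tau = 2.0 * math.pi * day / 365.0
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    meta = [yr, math.cos(tau), math.sin(tau), math.cos(lon), math.sin(lon), lat]
    phys = [ssh, sss, sst] + list(sal) + list(temp)
    return meta + phys

def fill_field(inputs, members):
    """Apply the same ensemble to every grid-point record; average member outputs.

    members: callables mapping a 23-component input list to a Chl-a value.
    """
    return [sum(nn(x) for nn in members) / len(members) for x in inputs]

# Hypothetical record and stand-in "networks" (constants instead of trained NNs).
record = build_input(2014, 100, -30.0, 45.0, 0.1, 35.0, 18.0,
                     [35.0] * N_LAYERS, [18.0] * N_LAYERS)
dummy_members = [lambda x: 1.0, lambda x: 3.0]
field = fill_field([record, record], dummy_members)
```

Because location and time enter only through the input record, the same weights serve every grid point, so filling a global field amounts to a loop over records.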