Abstract

Two strategies for estimating open boundary conditions (OBCs) with adjoint method are compared by carrying out semi-idealized numerical experiments. In the first strategy, the OBC is assumed to be partly space varying and generated by linearly interpolating the values at selected feature points. The advantage is that the values at feature points are taken as control variables so that the variations of the curves can be reproduced by the minimum number of points. In the second strategy, the OBC is assumed to be fully space varying and the values at every open boundary points are taken as control variables. A series of semi-idealized experiments are carried out to compare the effectiveness of two inversion strategies. The results demonstrate that the inversion effect is in inverse proportion to the number of feature points which characterize the spatial complexity of open boundary forcing. The effect of ill-posedness of inverse problem will be amplified if the observations contain noises. The parameter estimation problems with more control variables will be much more sensitive to data noises, and the negative effects of noises can be restricted by reducing the number of control variables. This work provides a concrete evidence that ill-posedness of inverse problem can generate wrong parameter inversion results and produce an unreal “good data fitting.”

1. Introduction

The tides and tidal currents are the basic motion forms of ocean water and play an important role in the research on other processes, such as the storm surge, the circulation and the estuarine dynamics [1, 2]. For tidal models, open boundary conditions (OBCs) are one of the most important parameters, which are determined by the physics of tides and tidal currents. Therefore, how to obtain reasonable and accurate OBCs for regional tidal models has been a subject of ongoing research. Data assimilation methods have been commonly used to optimize the open boundary conditions [37].

Data assimilation methods, especially the complex ones like four-dimensional variational (4DVAR), are developed on the base of rigorous mathematical theories, such as inverse problem theory and optimal control theory. The ultimate purpose of applying data assimilation method is to reduce the data misfit between model results and various observations, by either improving the models or dynamically interpolating the observations. Among all the data assimilation methods, the 4DVAR is one of the most effective and powerful approaches. It is based on the optimal control methods and perturbation theory [8, 9]. This technique allows us to retrieve an optimal data for a given model from heterogeneous observation fields [9]. It is an advanced data assimilation method which involves the adjoint method and has the advantage of directly assimilating various observations distributed in time and space into numerical models while maintaining dynamical and physical consistency with the model. The adjoint method is a powerful tool for parameter estimation. Navon [10] presented an important overview on the state of the art of parameter estimation in meteorology and oceanography in view of application of 4DVAR data assimilation techniques to inverse parameter estimation problems. Zhang and Lu [7] studied the parameter estimation problems with a three-dimensional tidal model with 4DVAR and also summarized relative works. More recently, Kazantsev [9] briefly revealed the history of data assimilation starting from Lorenz’s pioneering work and then deeply studied the sensitivity of a shallow-water model to parameters by applying adjoint based technique.

For parameter estimation problems, it is of great importance to reasonably reduce the number of spatially varying control variables because of the ill-posedness of inverse problem. As noted by Yeh in the work of ground water flow parameter estimation, the inverse or parameter estimation problem is often ill-posed and beset by instability and nonuniqueness, particularly if one seeks parameters distributed in space and time domain [11]. The same viewpoint has been put forward by references [1216]. Consequently, how to reduce the number of parameters to be estimated became an important aspect needing to draw attention to [1317]. In this work two strategies for inverting the open boundary conditions with adjoint method are compared by carrying out semi-idealized numerical experiments. In the first strategy, the OBC is assumed to be partly space varying and generated by linearly interpolating the values at selected feature points. The feature points are selected by calculating the second-order derivatives of discrete curves and the values at selected feature points are taken as control variables to be estimated. The advantage is that most of the variations of the curves can be reproduced by the minimum number of points. In the second strategy, the OBC is assumed to be fully space varying and the values at every open boundary points are taken as control variables.

This paper is organized as follows. The 2D tidal model with adjoint is briefly described in Section 2. The two inversion strategies are developed in Section 3. A series of semi-idealized numerical experiments are carried out and the results are analyzed and discussed in Section 4. Conclusions in Section 5 complete the paper.

2. The Adjoint Tidal Model

2.1. The 2D Tidal Model

The governing equations for the tides used in the present study are the vertically integrated equations of continuity and momentum: where is time; and are the east longitude and north latitude, respectively; is the sea surface elevation above the undisturbed sea level; and are the east and north components of fluid velocity, respectively, is the adjusted height of equilibrium tides; is the radius of the earth, ; , where represents the angular speed of earth rotation; is the acceleration due to gravity, is the undisturbed water depth and denotes the total water depth; is the coefficient of horizontal eddy viscosity; is the Laplace operator and ; and are east and north components of bottom friction terms, respectively, and their expressions are given in quadratic form:

2.2. The Adjoint

The general idea of the adjoint method is described as follows. First, a model is defined by an algorithm and its independent variables such as initial conditions, boundary conditions, and empirical parameters. The cost function which measures the data misfit between the modeling results and observations is then minimized through optimizing the control variables. In detail, the cost function decreases along the opposite direction of the gradients with respect to the control variables, and this gradient is calculated by what has become known as the adjoint model. In order to construct the adjoint equations, the cost function is defined as and the Lagrangian function is defined as where is the observations of surface elevation; stands for the whole integration area of time and space; , , and are the adjoint variables (namely, Lagrangian multipliers) of , , and , respectively. Based on the theory of Lagrangian multiplier method, we have the following first-order derivates of Lagrangian function with respect to all the model variables:Equations (5b) give the original governing (1) and the adjoint equations can be developed from (5a). In (5c), and are the Fourier coefficients along the open boundary and denotes the bottom friction coefficients. From (5c) we can obtain the optimization formulae of model parameters.

Based on (5a) the adjoint equations can be obtained as where is a matrix whose components denote the adjoint terms of bottom friction. The components of for the quadratic parameterizations are given as The numerical schemes for the forward model and the adjoint model in this section are both based on Lu and Zhang [17] and Zhang et al. [18].

3. Methodology

3.1. Feature Points of a Curve

If the values of OBCs are plotted versus the location or index of grid points along open boundaries, they will form a discretized curve. Without loss of generality, the curve can be presented by Figure 1. Assume there are general (or, computational) points along open boundaries with index of , . This type of curve can be approximately linearly expressed by a certain series of points which are defined as feature points in this paper. For the curve shown in Figure 1, one can easily obtain the feature points as indicated by symbol “+.” Assume the number of feature points is with index of , . Further assuming the feature point with index of is coincident with the general point with index of , we can obtain the following relation: , , , .

It is easy to conclude that any general point can be linearly expressed by two adjacent feature points. For example, as shown in Figure 1, an arbitrary general point locates between two adjacent feature points and , where . Through linear interpolation, we can obtain the value of as

For the whole curve (or the whole boundary), the relation between general points and feature points can be similarly expressed in matrix form as where and are both column vectors with dimensions of and , respectively, and is the weighting matrix of linear interpolation with dimensions of . The detailed forms of three matrixes are given aswhere the nonzero components are the linear interpolation coefficients. Specifically, without loss of generality,Using (9), any general points along open boundaries can be highly approximated through the linear interpolation of selected feature points. It indicates that the OBC identification problem can be transformed to seek the values of a few selected feature points, which reduces the number of control variables.

3.2. Selection of Feature Points for Periodic Tidal Open Boundary

Along a certain open boundary, we also assume that there are general grid points. The height of water level at the th time step is given by where stands for the general points of open boundaries and , is the frequency of constituent, and are the Fourier coefficients at , is the time step of computation.

For regional tidal models the values of and can be obtained from large scale numerical models. It should be noted and are space dependent, and therefore the variations of their values versus the grids along the open boundary will constitute two curves (curve_ and curve_) similar to the one shown in Figure 1. The feature points for this type of curve can be selected by computing the second-order differential of each general point. The detailed selection procedures are given as follows.(1) Suppose the absolute values of second-order differentials of general points are SD_ for curve_ and SD_ for curve_, respectively. For the general points locating in the middle of curve_a and curve_b, that is, , SD_ and SD_ can be computed as where is the size of computation grids and equals or according to the direction of open boundaries ( for west-east direction and for north-south direction). (2) Further define that the “maximum second-order differential” for point is . The value of is calculated as (3) Define a threshold value of , , to be . The points with larger values of than are selected as feature points. The value of is problem dependent and should be determined according to the specific requirement on the number of control variables.(4) It is easy to understand that the first and the last general points and are automatically selected as feature points indexed as and .

3.3. Inversion Strategies and Gradients

In this work two strategies for inverting the open boundary conditions with adjoint method are compared by carrying out semi-idealized numerical experiments. In the first strategy the open boundary curves are assumed to be partly space varying and are generated by linearly interpolating the values at feature points. The feature points are selected by calculating the second-order derivatives of discrete curves and the values at selected feature points are taken as control variables to be estimated. The advantage is that most of the variations of the curves can be reproduced by the minimum number of points. In the second strategy, the OBC is assumed to be fully space varying and the values at every open boundary point are taken as control variables.

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method, which is a quasi-Newton conjugate-gradient algorithm, has been widely used in the unconstrained inverse problems and is famous for its efficiency [19, 20]. The limited-memory BFGS (L-BFGS) algorithm is an adaptation of the BFGS method to large problem. Zou et al. [20] concluded that among the tested quasi-Newton methods, the L-BFGS method had the best performance. In this work L-BFGS method is employed to optimize the control variables, namely, the OBCs. In order to perform inversion with L-BFGS, the gradients of cost function with respect to the control variables in two strategies have to be calculated.

3.3.1. Gradients for Partly Space Varying Inversion Strategy

In the first inversion strategy (partly space varying OBC), feature points for open boundary curves are selected and the OBCs at general points can be linearly interpolated from feature points. Consequently, the gradients of cost function with respect to the Fourier coefficients at feature points and ( and for simplicity, ) have to be computed in order to optimize the OBCs with L-BFGS. The gradients are deduced from which yields where where and are the adjoint variables of west-east velocity component and north-south velocity component , respectively. The values of and are computed by running the adjoint model.

3.3.2. Gradients for Fully Space Varying Inversion Strategy

In the second strategy, the OBC is assumed to be fully space varying and the values at every open boundary points (i.e., general points) are taken as control variables. Consequently, the gradients of cost function with respect to the Fourier coefficients at general points and ( and for simplicity, ) have to be computed. The gradients are deduced from which yields where can also be computed by using (19).

4. Numerical Experiments and Results Analysis

4.1. Model Settings

The computing area in the present study is the Bohai Sea, the Yellow Sea, and the East China Sea (BYECS), typical marginal shelf seas. The spatial resolution for the model is . altimeter data and tidal gauge data are assimilated into the tidal model. The bathymetry map of the BYECS, the position of satellite tracks, tidal gauge stations, and the open boundaries are shown in Figure 2. Since the purpose of this paper is to discuss the inversion of OBCs, the bottom friction coefficients are fixed in all the experiments.

The numerical experiments in this work are semi-idealized. Specifically, the coastline, the number, and location of the observations are real. On the contrary, the values of open boundary conditions and observations are artificial. The prescribed open boundary curves are generated by different number of feature points. Apparently, the complexity of open boundary curves is in direct proportion to the number of feature points. For the semi-idealized experiments, only the location of real observations (satellite altimetry and tidal gauge stations) is used and the values of “observations” are obtained by running the dynamic forward model with prescribed open boundary conditions. The advantage of this kind of experiments is that we can obtain a thorough understanding of the “observations.” The “observations” generated by the model can be accurate and we can control the quality of the “observations” by adding artificial error. In addition, because the other factors are real, the conclusions based on these semi-idealized experiments can be more useful for referring.

The semi-idealized numerical experiments are run as follows. First a distribution of artificial Fourier coefficients is prescribed and taken as “true values” of open boundary conditions. Then the forward tidal model is run using the “true values” and the simulation results recorded at grid points of satellite tracks and tidal gauge stations are taken as the “observations.” Having obtained the “observations”, an initial value (taken as zero in this work) of Fourier coefficients is assigned to run the forward model. The differences between simulated values and “observations” will function as the external force to drive the adjoint model. The optimized Fourier coefficients can be obtained through the backward integration of the adjoint equations. The inverse integral time of the adjoint equations is equal to a period of tide. With the procedures repeated above, the parameters will be optimized continuously and the difference between simulated values and “observations” will be diminished. Meanwhile, the difference between the prescribed and the inverted parameters will also be decreased.

The iteration of optimization will terminate once the following criterion is achieved [21]: where is the norm of the gradients of cost function with respect to the control variables (i.e., the Fourier coefficients at feature points), is a positive variable that determines the accuracy with which the solution is to be found, and is the norm of control variables. Both the values of and vary along the iterations. For a correct adjoint model and a reasonable method, will gradually decrease versus the iteration steps and the inverted values of control variables must gradually approach the prescribed “true values”. When using L-BFGS, the number of corrections used in the BFGS update is taken as 5 (usually between 3 and 7, see Alekseev et al. [19]). In the minimization algorithm, the control variables should be scaled to similar magnitudes on the order of unity because within the optimization algorithm convergence, tolerances, and other criteria are based on an implicit definition of small and large [22]. Zou et al. [20] also proved that the efficiency could be greatly improved by a simple scaling. In twin experiments we use 10 to scale the Fourier coefficients [4].

4.2. Modeling Results
4.2.1. Effects of Complexity of Open Boundary Curves

In this section, the semi-idealized experiments (SE) are carried out to calibrate the inversion ability of adjoint model and compare the effectiveness of two strategies developed in Section 3. The prescribed distributions of artificial Fourier coefficients at 173 grid points along the eastern open boundary are inverted. The prescribed distributions (PDs) are designed to be characterized by different numbers of feature points. PDs 1–7 are characterized by 2, 6, 10, 14, 18, 22, and 26 feature points, respectively. The twin experiments are correspondingly indexed with SEa 1–7 for inversion strategy 1 and SEb 1–7 for inversion strategy 2.

The prescribed and inverted distributions of open boundary curves in SEa 1–4 and SEb 1–4 are shown in Figure 3. The prescribed and inverted distributions of open boundary curves in SEa 5-6 and SEb 5-6 are shown in Figure 4. The feature points for prescribed distributions have also been indicated in Figures 3 and 4. Table 1 gives the error statistics for the experiments in this section. The norm of the gradients of cost function with respect to the control variables versus the iteration steps for the experiments using inversion strategies 1 and 2 are presented in Figures 4(c) and 4(d), respectively. The decrease in data misfit (i.e., cost function) calculated from (3) versus the iteration steps is shown in Figure 5. Note that the values of data misfit and norm of gradients have been normalized by their values at the first iteration step.

For strategy 1, the values of data misfit can sharply decrease by about 4 orders for all the experiments in about 30 iteration steps. For strategy 2, the values of data misfit can sharply decrease by about 5 orders for SEb 1–5 and by 4 orders for SEb 6-7 in about 60 iteration steps. The decrease in data misfit provides another proof for the inversion ability of the adjoint model and strategies in this work. Correspondingly, the norms of gradients also decrease by at least 2 orders for inversion strategy 1 and by 3 orders for inversion strategy 2, which demonstrates that the gradients calculated in Section 3.3 can work well with L-BFGS method.

From the decrease in data misfit and gradient it seems as if the effect of inversion strategy 2 is better than that of strategy 1. However, the differences between prescribed and inverted distributions shown in Table 1 indicate that the inversion results of strategy 1 are much better than those of strategy 2. This inconsistency will be explained in Section 4.3. One can find that the adjoint model combined with inversion strategy 1 can reproduce the prescribed distributions of Fourier coefficients perfectly for SEa 1-2 or almost perfectly for SEa 3-4. For SEa 5-6 the inversion is acceptable but largely deviates from perfection. The major trend of the inversion is quite obvious that the effect of inversion is in inverse proportion to the number of feature points which characterizes the complexity of open boundary curves. The inverted open boundary curves shown in Figures 3 and 4 also prove that the inversion using strategy 1 is better than that using strategy 2.

4.2.2. Effects of Data Noises

As we know, the real observations either from satellite altimetry or from tidal gauge stations contain errors (or noises). In this section the effects of the noises are studied. To do this, we replace each “observation” by , where are uniform random numbers lying in and is a factor determining the maximum percentage error. The maximum percentage errors for each prescribed distribution (PDs 1–7) are assigned to 5%, 10%, 15%, and 20%. The corresponding inversion experiments are then indexed with SEx i.1, SEx i.2, SEx i.3, and SEx i.4, respectively, where and or . The error statistics for the experiments with values of 5%, 10%, 15%, and 20% are exhibited in Tables 2, 3, 4, and 5, respectively. The figures are omitted because they are similar to those in Section 4.2.1.

One can find the noises in artificial observations will significantly and negatively influence the inversion of open boundary conditions. It is clear that the inversion using strategy 2 is much more sensitive to the noise than that using strategy 1. For example, when the simplest distribution PD 1 is inverted, the difference between prescribed and inverted values will sharply increase from 0.0101 (Table 1) to 0.0238 (Table 2) for strategy 2 even with a small value of error 5%. When was increased to 20%, the value of this difference is also increased to 0.0562 (Table 5). However, for strategy 1 the values of this difference are just 0.0011, 0.0011, 0.0032 and 0.0043 under value of 5%, 10%, 15%, and 20%. Similar results can be found from the inversion results of other distributions. This phenomenon indicates that the effect of ill-posedness of inverse problem will be amplified in the conditions that observations contain noises. In addition, the parameter estimation problems with more control variables will be much more sensitive to data noise and the negative effect of noises can be restricted by reducing the number of control variables.

4.3. Discussions
4.3.1. Rationality of the Adjoint Method (Suggested by an Anonymous Reviewer)

The motivation of the present work is to take the open boundary condition as an example to investigate the performance of the adjoint method when applied to ocean modeling and the ill-posedness of relevant inverse problem. The inverse problems in ocean models are often quite complex. The ocean modeling is not just to solve the partial differential equations which might also be solved by some simple methods like the method of characteristics. A reasonable ocean model should also be related to the field observations (satellite altimetry and tidal gauges in this work). In order to realize a more accurate simulation of ocean dynamics, how to organically combine the numerical ocean model with available observations has already become a problem urgent to be solved. Data assimilation methods have been used widely to solve this problem. Among all data assimilation methods, the adjoint data assimilation method is one of the most effective and powerful approaches developed over the past three decades. It is an advanced data assimilation method and has the advantage of directly assimilating various observations distributed in time and space into the numerical model while maintaining dynamical and physical consistency with the model. The adjoint method might be complicated and expensive for some simple problems. However, the inverse problems in ocean modeling are often quite complex in contrast with those simple problems. As is known, one advantage of the numerical method over theoretical analysis lies in the disposal of nonlinear terms. The ocean numerical models are usually strongly nonlinear, increasing the complexity of the relevant inverse problem. Therefore, the increased complexity of the inverse problem makes the adjoint method effective. The adjoint method has been proved to be effective and powerful in ocean and atmosphere problems by many works (see the references listed in Section 1). It has been widely applied to meteorological and oceanographic data assimilation, sensitivity studies, and parameter estimation.

4.3.2. Analysis on Ill-Posedness

From the statistics shown in Tables 15, we can find an interesting phenomenon. Define the data misfits after assimilation to be for inversion strategy 1 and for inversion strategy 2. Further define the differences between prescribed and inverted control variables to be for inversion strategy 1 and for inversion strategy 2. The values of and for all the experiments are plotted in Figure 6. We can find are larger than or comparable with while are greatly smaller than . Consequently, for all the experiments except SEa 1 and SEa 2, without loss of generality, we can obtain It is easy to understand that small values of indicate more accurate control variables, and small values of mean small differences between simulated and observed results. In this work, the open boundary conditions are the only parameters for estimation and other parameters are fixed all the time. Instead of formula (23), we should have expected which means a better parameter estimation drives a more accurate simulation. In other words, what we want are small values of and what we need are small values of . Formulas (23) and (24) exactly indicate an inconsistency between the effects of parameter estimation and observation restricted data reproduction.

For PDs 1–7 the numbers of feature points are 2, 6, 10, 14, 18, 22, and 26, respectively. It should be noted that at each feature point the Fourier coefficients include and . Therefore the numbers of control variables for inversion are doubled, that is, 4, 12, 20, 28, 36, 44, and 52, respectively. There are a total of 35 semi-idealized experiments in this work. Among these experiments, only SEa 1 and SEa 2 can realize a perfect inversion of control variables. Here we define perfect inversion as follows: the data misfit between observed and simulated values can decrease to zero and the difference between prescribed and inverted control variables can also reach a value of zero. With more control variables and larger data noises, the inversion results will not be exactly equal to the prescribed distributions. In the work of Smedstad and O’Brien [12] where the spatially distributed phase speed in an equatorial Pacific Ocean model was estimated, they could not produce the exact values either, even in the condition that perfect observations were available at every grid of the model. Zhang and Lu [4] put forward the similar viewpoint and it also occurs in the parameter estimation of internal tidal model [2325]. With identical twin experiments, the “observations” are perfect in the sense that they are produced by the model and thus are consistent with the model physics. From the results of this paper and previous works, we can conclude that ill-posedness has happened in other 33 experiments and the effects of ill-posedness will be amplified by increasing the number of control variables and data noises. Formula (23) obtained in this work provides a concrete evidence that ill-posedness of inverse problem can generate poor parameter inversion results while producing an unreal “good data fitting”. For a specific problem, it is necessary and helpful to perform identical semi-idealized experiments in order to find the optimal choices for the number of control variables and inversion strategy.

5. Conclusions

In this work, two strategies for inverting the open boundary conditions with adjoint method are compared by carrying out semi-idealized numerical experiments. In the first strategy, the open boundary curves are assumed to be partly space varying and are generated by linearly interpolating the values at feature points. The feature points are selected by calculating the second-order derivatives of discrete curves and the values at selected feature points are taken as control variables to be estimated. The advantage is that most of the variations of the curves can be reproduced by the minimum number of points. In the second strategy, the OBC is assumed to be fully space varying and the values at every open boundary points are taken as control variables.

A series of semi-idealized experiments are carried out to calibrate the inversion ability of adjoint model and compare the effectiveness of two inversion strategies. The results demonstrate that the effect of inversion is in inverse proportion to the number of feature points which characterize the complexity of open boundary curves. The effect of ill-posedness of inverse problem will be amplified in the conditions that observations contain noises. The parameter estimation problems with more control variables will be much more sensitive to data noises and the negative effects of noises can be restricted by reducing the number of control variables. This work provides a concrete evidence that ill-posedness of inverse problem can generate wrong parameter inversion results while producing an unreal “good data fitting”. For a specific problem, it is necessary and helpful to perform identical semi-idealized experiments in order to find the optimal choices for the number of control variables and inversion strategy.

Acknowledgments

The authors thank Professor Jorge Nocedal at Northwestern University for sharing the source codes of L-BFGS. Partial support for this research was provided by the National Natural Science Foundation of China through Grants 41206001 and 41076006, the Major State Basic Research Development Program of China through Grant 2013CB956500, the Natural Science Foundation of Jiangsu Province through Grant BK2012315, the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Fundamental Research Funds for the Central Universities 201261006.