Research Article | Open Access
Nicholas Ampazis, Stavros J. Perantonis, "An Efficient Constrained Learning Algorithm for Stable 2D IIR Filter Factorization", Advances in Artificial Neural Systems, vol. 2013, Article ID 292567, 7 pages, 2013. https://doi.org/10.1155/2013/292567
An Efficient Constrained Learning Algorithm for Stable 2D IIR Filter Factorization
A constrained neural network optimization algorithm is presented for factorizing simultaneously the numerator and denominator polynomials of the transfer functions of 2-D IIR filters. The method minimizes a cost function based on the frequency response of the filters, along with simultaneous satisfaction of appropriate constraints, so that factorization is facilitated and the stability of the resulting filter is respected.
Factorization of 2D polynomials is an important problem in the design of IIR filters in two-dimensional signal processing because a factorized filter transfer function can be efficiently implemented in cascade form. However, the fundamental theorem of Algebra, concerning polynomial factorization, cannot be extended to the two-dimensional case; that is, a bivariate polynomial cannot, in general, be factored as a product of lower-order polynomials. The 2D IIR cascade structure may be considered as an attractive circuit design alternative, because of its relative insensitivity to coefficient quantization, the requirement of smaller number of arithmetic operations for a given filter size, and finally, because issues related to the stability of the filters are easier to deal with for filters with a smaller number of denominator coefficients . Hence, an efficient method for the factorization of 2D IIR filters would be most beneficial.
Many methods have been proposed for dealing with the two-dimensional polynomial factorization problem [2–6]. Most of those methods, however, adopt a conventional numerical method of roots finding (e.g., Laguerre’s, Newton-Raphson’s, Jenkins-Traub’s methods, etc.), in which successive approximations to the roots are obtained. In addition, almost all numerical methods can only find the roots one by one, that is, by deflation method, where the next root is obtained by the deflated polynomial after the former root is found. This means that numerical root-finding methods are inherently sequential.
In an earlier work we have proposed a neural network-based Constrained Learning Algorithm (CLA) for factorizing 2D polynomials using constrained optimization techniques . Using that approach we were able to obtain exact solutions for factorable polynomials and excellent approximate solutions for nonfactorable polynomials. The technique offers the advantage of incorporating a priori information about the relations between the coefficients of the original polynomial and the coefficients of the desired factor polynomials. By incorporating additional stability constraints into the formalism, the stability (in the sense of Huang ) of a transfer function with the resulting factorized polynomial in the denominator (of up to second order) is also made possible. Our CLA method has been explored further in  by utilizing root moments, in  with an adaptive neural network architecture, and in  for handling complex root cases.
Despite its success in 2D polynomial factorization, CLA suffers from disadvantages which can limit its effectiveness for its application to IIR filter factorization. CLA’s cost function is not directly related to a proximity measure between the frequency response characteristics of the filters, but is based on the Mean Square Error (MSE) criterion between the output of the original polynomial and the output of the resulting factorable polynomial for various real values of and . Thus, CLA requires separate factorizations for the numerator and denominator polynomials of the IIR filter, while better solutions, in terms of the proximity of filter characteristics, could be achieved by simultaneous variation of all available parameters (numerator and denominator coefficients). In this paper we apply constrained optimization techniques to the more general problem of implementing a stable cascade IIR filter, given the specifications of an original 2D filter. This is achieved by suitably modifying our previous formalism as it is explained in the following section.
2. Derivation of the Algorithm
Consider the polynomial
with even and . For the above polynomial, we seek to achieve an exact or approximate factorization of the form where with .
To this end, we can utilize a feedforward neural network architecture in order to learn the appropriate weights as we proposed in . For example, Figure 1 shows the architecture of such a network for the case where . The units in the hidden layer have a logarithmic activation function , and the output unit is linear. This architecture has been preferred to the - architecture employed in , because it produces smoother cost function landscapes avoiding deep valleys and thus facilitates learning. As we showed in , this neural network can be trained using a Constrained Learning Algorithm (CLA) that achieves minimization of the MSE criterion along with simultaneous satisfaction of multiple equality and inequality constraints between the polynomial coefficients. Using that method, we are able to obtain very good approximate solutions for nonfactorable polynomials.
Now consider the 2D IIR filter transfer function where with , even and .
For the above polynomials, we seek to achieve an exact or approximate factorization of the form where where with , and so that the transfer function of (4) can be approximated by
Unfortunately, if it were to apply CLA in this problem we would have to derive two separate factorizations for the numerator and denominator polynomials. Such an optimization scheme, however, is not directly related to a proximity measure between the original and the factorized filter frequency response characteristics. In terms of the proximity of filter characteristics, obviously a better approximation can be achieved by a simultaneous variation of both the numerator and denominator coefficients.
In this case, we should ask: in what sense are and similar? The answer to this question is that we would at least expect the amplitude spectra of and to be roughly equal so that the specifications of the original and of the factorized filter match as close as possible. Hence, the main purpose of the factorization algorithm is to minimize an MSE cost function based on the magnitude of the frequency responses of the original and the resulting filter: The double summation is carried over a regular grid of points. Logarithmic functions are used also in this case in order to suppress deep valleys and thus facilitate convergence. However, is still a highly nonlinear function of the polynomial coefficients, and therefore gradient-based techniques can run into problems in the presence of valleys or ridges in the cost function landscape. For such cases it has proved beneficial to incorporate into the adaptation rule additional a priori problem-specific information. This often facilitates the convergence of the algorithm and leads to better solutions.
For the problem at hand, additional information is available in the form of constraints between the coefficients of the given numerator and denominator polynomials and the coefficients of the corresponding desired factor polynomials. More explicitly, if we assume that and are factorable, then these constraints can be expressed as follows: Thus, for the general case in which and are nonfactorable, the objective of the adaptation process is to reach a minimum of the cost function of (11) with respect to the variables and , which satisfies as best as possible the constraint , where with
The strategy which will be followed in order to solve this problem is similar to the method we proposed in , but with appropriate modifications in order to take care of the above constraints, as well as the new cost function that we introduced in (11).
At each iteration of our algorithm, the vector of unknown coefficients is changed by , so that where is a small constant. Thus, at each iteration, the search for an optimum new point in thespace is restricted to a small hypersphere centered at the point defined by the current vector . If is small enough, the changes to and toinduced by changes in the coefficients can be approximated by the first differentials and . Also, at each iteration, we seek to achieve the maximum possible change in , so that (14) is respected and the change in is equal to a predetermined vector quantity , designed to bringcloser to its target (zero).
Maximization of can be carried out analytically by introducing suitable Lagrange multipliers. Thus, a vectorof Lagrange multipliers is introduced for the constraints in (12) and a multiplier is introduced for (14).
Consider a function , whose differential is defined as follows: On evaluating the differentials involved in the right-hand side, we readily obtain
where(i) is a vector with elements ; (ii)is a matrix given by ; (iii) andare matrices with elements and , respectively. To minimize (maximize its magnitude) at each iteration, we demand that
Noting that we can multiply both sides of (18) by and solve for to obtain To evaluate , we substitute (18) and (19) into (14) and obtain Note that due to the block-diagonal form of matrixit is possible to avoid the tedious evaluation of by rewriting the quantities , , and as follows: (i), where and ,(ii), (iii).
We next discuss the choice of . It is natural to choose the elements of proportional to with a negative constant of proportionality , , so that the constraints move towards zero at an exponential rate. Then, the bound set on the value of by (20) forces us to choose adaptively. The simplest choice for will be used; namely, + , where . Thus, the final update rule has only two free parameters, namely, and .
3. Stability Constraints
For the filter defined in (4), it is reasonable to assume that for most practical purposes the numerator polynomial does not affect the system stability . In the important special case of a second-order denominator polynomial , the stability conditions according to Huang  for the factor polynomials can be written as , as shown in Table 1. These conditions can be incorporated in the formalism as follows: at a certain iteration of the algorithm, let a number of the stability constraints be violated, so that . We can augment the vector of the previous section by adding extra components . By following the formalism of Section 2 using the augmented vector, the quantities are driven towards the stability region. Note that the dimensionality of the vectoris adaptive, as only the violated constraints are included in each iteration. Iterative application of this scheme can lead to a final stable solution.
4. Experimental Results
We seek an approximate factorization of the second-order 2D IIR filter whose numerator and denominator polynomial coefficients are shown in Tables 2 and 3, respectively. Note that both polynomials are nonfactorable.
Figure 2(a) shows the amplitude spectrum of the original low-pass IIR filter, and Figure 3 shows the error curve of the unconstrained gradient descent method applied to (11) using a regular grid of points with an adaptation gain of . From the latest figure we can see that after an initial error reduction, an undesirable flat minimum is reached followed by unbounded oscillations caused by a transition into a region of the parameter space in which the denominator polynomial is unstable. The amplitude spectrum of the filter corresponding to the factorization result achieved at the flat minimum is shown in Figure 2(b) and is clearly unacceptable.
Using the constraints , , , , , and and all stability constraints of Section 3, we have applied both CLA, as described in , and the method proposed in the present paper using a regular grid of training points. Figure 2(c) shows the amplitude spectrum of the filter obtained from the original CLA approach (with and ) applied sequentially to the numerator and denominator polynomials. Obviously CLA outperforms gradient descent in all aspects. However, there are still small (but not negligible) differences between the amplitude spectra shown in Figures 2(a) and 2(c).
Figure 2(d) shows the amplitude spectrum of the filter obtained by the proposed method, with and . Clearly, there is very good agreement between the amplitude spectra of Figures 2(a) and 2(d). In order to visualize this better, in Figure 4, we present the differences in the amplitude spectra between the original filter and each of the filters obtained by gradient descent (Figure 4(a)), CLA (Figure 4(b)), and the proposed method (Figure 4(c)). From this figure it is clear that, of all the methods, the one proposed in the present paper achieves the maximum proximity to the original amplitude spectrum, leading to the design of a cascade filter that meets as closely as possible the specifications of the original nonfactorable filter.
The factorization results are summarized in Tables 2 and 3. Note that the separate factorization of the CLA approach produces coefficients that are closer to those of both the numerator and denominator polynomials of the original filter, while the proposed method achieves maximum similarity between the frequency responses in expense of a smaller coefficient similarity, distributed along both the numerator and denominator coefficients.
In this paper, an efficient algorithm was proposed for factorizing simultaneously the numerator and denominator polynomials of the transfer functions of 2D IIR filters. The method is useful for designing IIR filters in cascade form for more efficient hardware implementation. By introducing a cost function based on the difference of frequency responses between the original and cascade filters and using a flexible framework for incorporating additional information in the form of constraints, we obtained optimal stable solutions for nonfactorable filters.
- J. S. Lim, Two-Dimensional Signal and Image Processing, Prentice-Hall International, 1990.
- Z. Mou-Yan and R. Unbehauen, “On the approximate factorization of 2-D polynomials,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 35, no. 4, pp. 577–579, 1987.
- N. E. Mastorakis, N. J. Theodorou, and S. G. Tzafestas, “A general factorization method for multivariable polynomials,” Multidimensional Systems and Signal Processing, vol. 5, no. 2, pp. 151–178, 1994.
- P. Misra and R. V. Patel, “Simple factorizability of 2-D polynomials,” in Proceedings of the International Symposium on Circuits and Systems, pp. 1207–1210, New Orleans, La, USA, 1990.
- B. C. Langand Frenzel, “Polynomial root finding,” IEEE Signal Processing Letters, vol. 1, no. 10, pp. 141–143, 1994.
- L. Hoteit, “FFT-based fast polynomial rooting,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'00), vol. 6, pp. 3315–3318, Istanbul, Turkey, June 2000.
- S. Perantonis, N. Ampazis, S. Varoufakis, and G. Antoniou, “Constrained learning in Nneural networks: application to stable factorization of 2-D polynomials,” Neural Processing Letters, vol. 7, no. 1, pp. 5–14, 1998.
- T. S. Huang, “Stability of two-dimensional recursive filters,” IEEE Transactions on Audio and Electroacoustics, vol. 20, no. 2, pp. 158–163, 1972.
- D. S. Huang, H. H. S. Ip, and Z. Chi, “A neural root finder of polynomials based on root moments,” Neural Computation, vol. 16, no. 8, pp. 1721–1762, 2004.
- D. S. Huang, H. S. I. Horace, C. K. L. Ken, Z. Chi, and H. S. Wong, “A new partitioning neural network model for recursively finding arbitrary roots of higher order arbitrary polynomials,” Applied Mathematics and Computation, vol. 162, no. 3, pp. 1183–1200, 2005.
- D. S. Huang, “constructive approach for finding arbitrary roots of polynomials by neural networks,” AIEEE Transactions On Neural Networks, vol. 15, no. 2, pp. 477–491, 2004.
- R. Hormis, G. Antoniou, and S. Mentzelopoulou, “Separation of two- dimensional polynomials via a sigma-pi neural net,” in Proceedings of the International Conference on Modelling and Simulation, pp. 304–306, Pittsburg, Pa, USA, 1995.
Copyright © 2013 Nicholas Ampazis and Stavros J. Perantonis. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.