Mathematical Problems in Engineering

Volume 2017 (2017), Article ID 3818949, 11 pages

https://doi.org/10.1155/2017/3818949

## Least Square Support Tensor Regression Machine Based on Submatrix of the Tensor

College of Mathematics and Systems Science, Xinjiang University, Urumqi 830046, China

Correspondence should be addressed to Zhi-Xia Yang

Received 14 March 2017; Revised 10 October 2017; Accepted 15 October 2017; Published 9 November 2017

Academic Editor: Gisella Tomasini

Copyright © 2017 Tuo Shu and Zhi-Xia Yang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

For the tensor regression problem, a novel method, called the least square support tensor regression machine based on submatrix of a tensor (LS-STRM-SMT), is proposed. LS-STRM-SMT handles tensor regression problems more efficiently. First, we develop the least square support matrix regression machine (LS-SMRM) and propose a fixed point algorithm to solve it. Then LS-STRM-SMT for tensor data is proposed. Inspired by the relation between color photographs and their grayscale slices, we reformulate the tensor sample training set and form the new model (LS-STRM-SMT) for the tensor regression problem. With the introduction of projection matrices and another fixed point algorithm, we turn the LS-STRM-SMT model into several related LS-SMRM models, which are solved by the algorithm for LS-SMRM. Since the fixed point algorithm is used twice while solving the LS-STRM-SMT problem, we call the overall procedure the dual fixed point algorithm (DFPA). Our method (LS-STRM-SMT) has been compared with several typical support tensor regression machines (STRMs). From a theoretical point of view, our algorithm has fewer parameters and lower computational complexity, especially when the rank of the submatrix is small. The numerical experiments indicate that our algorithm performs better.

#### 1. Introduction

As is well known, data in the form of matrices or, more generally, multiway arrays (tensors) have found an increasing number of applications over the past decades. For example, all raster images are essentially digital readings of a grid of sensors, and matrix analysis is widely applied in image processing, for example, to photorealistic images of faces [1], palms [2], and medical images [3]. In web search, a large number of tensors that represent images [4] can be found easily. Therefore, tensor data analysis [5], particularly the regression problem [6, 7], has become one of the most important topics for face recognition [8], palmprint recognition [9], and so on.

Tensor types of data have greatly drawn people's attention. Recently, several tensor learning approaches for regression [10, 11] have appeared, but the majority of them work on vector spaces derived by stacking the original tensor elements in a more or less arbitrary order. This vectorization of data causes several problems. First, the structural information is destroyed. Second, vectorizing a tensor may produce an extremely high-dimensional vector, which may lead to high computational complexity, overfitting, and large memory requirements. The remaining methods mainly take advantage of the decomposition of a matrix [12] or tensor [6], which can reduce the high computational complexity as well as the high dimensionality at the expense of a slight decline in accuracy, but the structural information is destroyed entirely. So a more reasonable method is needed, one which preserves the underlying structural information while avoiding overfitting, high dimensionality, and high computational complexity.

Considering the fact that a color photograph can be expressed as a third-order tensor, each frontal slice of which is a gray image containing almost all the information of the photograph, we can take advantage of this by introducing the submatrix of a tensor and an abstract vector space when solving tensor learning problems for regression. That is, each tensor sample can be regarded as an abstract vector [13] whose elements are submatrix-valued features. Gathering together the same feature information from different tensor samples, we construct new submatrix training sets and the same number of related training models, from which we obtain an equal number of weight submatrices; the weight tensor is then assembled from them. Besides, we improve the fixed point algorithm [14] via projection matrices [15], including a series of left projection matrices and a right projection matrix. The improved algorithm is called the dual fixed point algorithm (DFPA). The projection matrices not only couple the training models but also reduce the computational complexity and memory requirements. That is to say, we turn the LS-STRM-SMT problem into a battery of least square support matrix regression machines (LS-SMRMs) by fixing the projection matrices, and then solve the LS-SMRM problems with the fixed point algorithm. The numerical experiments indicate that our method and algorithm perform better.

The paper is organized as follows: in Section 2, notations and preliminaries are introduced, such as definitions and notation related to tensors. In Section 3, we propose our LS-SMRM for matrix regression problems and the fixed point algorithm for it. In Section 4, we propose the LS-STRM-SMT model and develop the DFPA to solve it. Computational comparisons on both UCI data sets and artificial data are reported in Section 5, and conclusions are given in Section 6.

#### 2. Notations and Preliminaries

Here, we give a brief description of the notation used in later sections. More specifically, boldface capital letters, for example, $\mathbf{A}$, boldface lowercase letters, for example, $\mathbf{a}$, and lowercase letters, for example, $a$, are used to denote matrices, vectors, and scalars, respectively. Tensors, regarded as multidimensional arrays, are denoted by calligraphic letters, for example, $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_N}$, where $N$ denotes the order of the tensor. Just as the $i$th element of a vector $\mathbf{a}$ is denoted by $a_i$, the elements of an $N$-order tensor $\mathcal{A}$ are denoted by $a_{i_1 i_2 \cdots i_N}$.

For an $N$-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_N}$, the $n$-mode matricization, also known as unfolding or flattening, is denoted by

$$\operatorname{unfold}_n(\mathcal{A}) = \mathbf{A}_{(n)} \in \mathbb{R}^{n_n \times (n_1 \cdots n_{n-1} n_{n+1} \cdots n_N)}. \tag{1}$$

It is quite clear that we can reorder the elements of the tensor into a matrix in such a way. Conversely, define a mapping function

$$\operatorname{fold}_n(\mathbf{A}_{(n)}) = \mathcal{A} \tag{2}$$

to recover a tensor from its unfolding matrix. In particular, when $N = 3$, we have

$$\mathbf{A}_{(3)} \in \mathbb{R}^{n_3 \times n_1 n_2}. \tag{3}$$

The inner product of two tensors $\mathcal{A}, \mathcal{B}$ of the same size is defined as

$$\langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_N=1}^{n_N} a_{i_1 i_2 \cdots i_N} b_{i_1 i_2 \cdots i_N}. \tag{4}$$

The Frobenius norm of a tensor is thus defined as

$$\|\mathcal{A}\|_F = \sqrt{\langle \mathcal{A}, \mathcal{A} \rangle}. \tag{5}$$

And it can be shown that

$$\|\mathcal{A}\|_F = \|\mathbf{A}_{(n)}\|_F, \quad n = 1, \ldots, N. \tag{6}$$

The CANDECOMP/PARAFAC decomposition, referred to as CP, factorizes an $N$-order tensor into a linear combination of rank-one tensors, written as

$$\mathcal{A} = \sum_{r=1}^{R} \mathbf{a}_r^{(1)} \circ \mathbf{a}_r^{(2)} \circ \cdots \circ \mathbf{a}_r^{(N)}, \tag{7}$$

where the operator $\circ$ is the outer product of vectors and the factor matrix $\mathbf{A}^{(n)} = [\mathbf{a}_1^{(n)}, \ldots, \mathbf{a}_R^{(n)}]$ is of size $n_n \times R$. For convenience, all tensors mentioned in the following are third-order tensors unless otherwise stated.
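As a concrete illustration of the unfolding (1), the mapping function (2), and the norm identity (6), the following sketch (Python/NumPy; `unfold` and `fold` are our own helpers, using an implementation-specific column ordering chosen only so that `fold` is the exact inverse of `unfold`) matricizes a tensor, folds it back, and checks that the Frobenius norm is unchanged:

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n matricization (1): move axis `mode` first, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Mapping function (2): the exact inverse of `unfold` for the same ordering."""
    full_shape = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape(full_shape), 0, mode)

A = np.arange(24, dtype=float).reshape(2, 3, 4)   # a small third-order tensor
M = unfold(A, 1)
assert M.shape == (3, 8)
assert np.allclose(fold(M, 1, A.shape), A)        # folding recovers the tensor
assert np.isclose(np.linalg.norm(A), np.linalg.norm(M))   # property (6)
```

Any fixed element ordering works here, as long as folding uses the same ordering as unfolding.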

#### 3. Least Square Support Matrix Regression Machine (LS-SMRM)

In this section, we propose the least square support matrix regression machine, abbreviated as LS-SMRM, for the regression problem with matrix input.

Given a training set

$$T = \{(\mathbf{X}_1, y_1), (\mathbf{X}_2, y_2), \ldots, (\mathbf{X}_m, y_m)\}, \tag{8}$$

where $\mathbf{X}_i \in \mathbb{R}^{n_1 \times n_2}$ is the input and $y_i \in \mathbb{R}$ is the output, $i = 1, \ldots, m$, our task is to find a predictor

$$f(\mathbf{X}) = \langle \mathbf{W}, \mathbf{X} \rangle + b, \tag{9}$$

where $\mathbf{W} \in \mathbb{R}^{n_1 \times n_2}$ denotes the weight matrix and $b \in \mathbb{R}$ is the bias. For a new input matrix, we can predict its output through the above predictor.

In order to get predictor (9), we develop the following optimization problem:

$$\min_{\mathbf{W}, b, \boldsymbol{\xi}} \; \frac{1}{2}\|\mathbf{W}\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 \quad \text{s.t.} \quad y_i = \langle \mathbf{W}, \mathbf{X}_i \rangle + b + \xi_i, \; i = 1, \ldots, m, \tag{10}$$

where $C > 0$ is the penalty parameter.

According to the CP decomposition (7), matrices $\mathbf{U}$ and $\mathbf{V}$ can be found such that

$$\mathbf{W} = \mathbf{U}\mathbf{V}^{\top}, \tag{11}$$

where $\mathbf{U} \in \mathbb{R}^{n_1 \times r}$, $\mathbf{V} \in \mathbb{R}^{n_2 \times r}$, and $r \le \min(n_1, n_2)$. Then optimization problem (10) can be turned into the following:

$$\min_{\mathbf{U}, \mathbf{V}, b, \boldsymbol{\xi}} \; \frac{1}{2}\left\|\mathbf{U}\mathbf{V}^{\top}\right\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 \quad \text{s.t.} \quad y_i = \langle \mathbf{U}\mathbf{V}^{\top}, \mathbf{X}_i \rangle + b + \xi_i, \; i = 1, \ldots, m. \tag{12}$$

The fixed point algorithm is applied to solve optimization problem (12). When $\mathbf{V}$ is fixed, we need to compute $\mathbf{U}$ and $b$. Firstly, denote

$$\mathbf{x}_i = \operatorname{vec}(\mathbf{X}_i\mathbf{V}), \quad i = 1, \ldots, m, \tag{13}$$

so that $\langle \mathbf{U}\mathbf{V}^{\top}, \mathbf{X}_i \rangle = \langle \operatorname{vec}(\mathbf{U}), \mathbf{x}_i \rangle$, and optimization problem (12) is equivalent to

$$\min_{\mathbf{U}, b, \boldsymbol{\xi}} \; \frac{1}{2}\left\|\mathbf{U}\mathbf{V}^{\top}\right\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 \quad \text{s.t.} \quad y_i = \langle \operatorname{vec}(\mathbf{U}), \mathbf{x}_i \rangle + b + \xi_i. \tag{14}$$

And let

$$\mathbf{u} = \operatorname{vec}(\mathbf{U}); \tag{15}$$

noting that for normalized $\mathbf{V}$ (i.e., $\mathbf{V}^{\top}\mathbf{V} = \mathbf{I}$) we have $\|\mathbf{U}\mathbf{V}^{\top}\|_F^2 = \mathbf{u}^{\top}\mathbf{u}$, optimization problem (14) is reformulated as

$$\min_{\mathbf{u}, b, \boldsymbol{\xi}} \; \frac{1}{2}\mathbf{u}^{\top}\mathbf{u} + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 \quad \text{s.t.} \quad y_i = \mathbf{u}^{\top}\mathbf{x}_i + b + \xi_i, \; i = 1, \ldots, m. \tag{16}$$

The Lagrange function of optimization problem (16) can be expressed as

$$L(\mathbf{u}, b, \boldsymbol{\xi}; \boldsymbol{\alpha}) = \frac{1}{2}\mathbf{u}^{\top}\mathbf{u} + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 - \sum_{i=1}^{m}\alpha_i\left(\mathbf{u}^{\top}\mathbf{x}_i + b + \xi_i - y_i\right), \tag{17}$$

where $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_m)^{\top}$ is the Lagrangian multiplier vector. Then the KKT system of (17) is

$$\mathbf{u} = \sum_{i=1}^{m}\alpha_i\mathbf{x}_i, \tag{18}$$

$$\sum_{i=1}^{m}\alpha_i = 0, \tag{19}$$

$$\alpha_i = C\xi_i, \quad i = 1, \ldots, m. \tag{20}$$

Rewriting (18), (19), and (20) together with the equality constraints in matrix form, we get

$$\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & \mathbf{K} + C^{-1}\mathbf{I} \end{bmatrix}\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix}, \tag{21}$$

where $K_{ij} = \mathbf{x}_i^{\top}\mathbf{x}_j$ and $\mathbf{y} = (y_1, \ldots, y_m)^{\top}$. $b$ and $\boldsymbol{\alpha}$ can be obtained by solving linear system (21). Then $\mathbf{u}$ is obtained according to (18), and the left projection matrix $\mathbf{U}$ is recovered through relation (15). In summary, when we fix $\mathbf{V}$, the solution of optimization problem (12) can be computed by solving linear system (21) directly.
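The bordered linear system (21) and the KKT conditions (18)-(20) can be verified numerically. The sketch below (Python/NumPy; the helper name `solve_ls_system` and the toy dimensions are our own, not from the paper) solves the system and checks that the recovered multipliers satisfy the zero-sum condition (19) and the relation $\alpha_i = C\xi_i$ of (20):

```python
import numpy as np

def solve_ls_system(Phi, y, C):
    """Solve the bordered system (21): [[0, 1^T], [1, K + I/C]] [b; a] = [0; y],
    where K = Phi Phi^T and the rows of Phi are the features x_i of (13)."""
    m = len(y)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Phi @ Phi.T + np.eye(m) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]           # bias b, multipliers alpha

rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 3))   # 20 samples, 3-dimensional features
y = rng.standard_normal(20)
C = 10.0
b, alpha = solve_ls_system(Phi, y, C)
w = Phi.T @ alpha                    # stationarity (18): u = sum_i alpha_i x_i
resid = y - Phi @ w - b              # slacks xi_i from the equality constraints
assert abs(alpha.sum()) < 1e-8       # KKT condition (19)
assert np.allclose(alpha, C * resid) # KKT condition (20)
```

The check on `alpha` holds exactly by construction of (21), so it serves as a sanity test for the solver.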

Similarly, when $\mathbf{U}$ is fixed, we can also change the formulation in optimization problem (12) and derive the optimal $\mathbf{V}$ and $b$ by solving another linear system. That is to say, we need to compute $\mathbf{v}$ and $b$ for fixed $\mathbf{U}$. Firstly, denote

$$\tilde{\mathbf{X}}_i = \mathbf{X}_i^{\top}\mathbf{U}, \tag{22}$$

$$\tilde{\mathbf{x}}_i = \operatorname{vec}(\tilde{\mathbf{X}}_i), \quad i = 1, \ldots, m; \tag{23}$$

optimization problem (12) is equivalent to

$$\min_{\mathbf{V}, b, \boldsymbol{\xi}} \; \frac{1}{2}\left\|\mathbf{U}\mathbf{V}^{\top}\right\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 \quad \text{s.t.} \quad y_i = \langle \operatorname{vec}(\mathbf{V}), \tilde{\mathbf{x}}_i \rangle + b + \xi_i. \tag{24}$$

And let

$$\mathbf{v} = \operatorname{vec}(\mathbf{V}); \tag{25}$$

for normalized $\mathbf{U}$, optimization problem (24) is reformulated as

$$\min_{\mathbf{v}, b, \boldsymbol{\xi}} \; \frac{1}{2}\mathbf{v}^{\top}\mathbf{v} + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 \quad \text{s.t.} \quad y_i = \mathbf{v}^{\top}\tilde{\mathbf{x}}_i + b + \xi_i, \; i = 1, \ldots, m. \tag{26}$$

The Lagrange function of optimization problem (26) can be expressed as

$$L(\mathbf{v}, b, \boldsymbol{\xi}; \tilde{\boldsymbol{\alpha}}) = \frac{1}{2}\mathbf{v}^{\top}\mathbf{v} + \frac{C}{2}\sum_{i=1}^{m}\xi_i^2 - \sum_{i=1}^{m}\tilde{\alpha}_i\left(\mathbf{v}^{\top}\tilde{\mathbf{x}}_i + b + \xi_i - y_i\right), \tag{27}$$

where $\tilde{\boldsymbol{\alpha}}$ is the Lagrangian multiplier vector. Then the KKT system of (27) is

$$\mathbf{v} = \sum_{i=1}^{m}\tilde{\alpha}_i\tilde{\mathbf{x}}_i, \tag{28}$$

$$\sum_{i=1}^{m}\tilde{\alpha}_i = 0, \tag{29}$$

$$\tilde{\alpha}_i = C\xi_i, \quad i = 1, \ldots, m. \tag{30}$$

Rewriting (28), (29), and (30) together with the equality constraints in matrix form, we get

$$\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & \tilde{\mathbf{K}} + C^{-1}\mathbf{I} \end{bmatrix}\begin{bmatrix} b \\ \tilde{\boldsymbol{\alpha}} \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix}, \tag{31}$$

where

$$\tilde{K}_{ij} = \tilde{\mathbf{x}}_i^{\top}\tilde{\mathbf{x}}_j, \quad \mathbf{y} = (y_1, \ldots, y_m)^{\top}. \tag{32}$$

Repeating the iterative operation until convergence, the weight matrix $\mathbf{W}$ is obtained by (11), and the predictor is

$$f(\mathbf{X}) = \langle \mathbf{U}\mathbf{V}^{\top}, \mathbf{X} \rangle + b. \tag{33}$$

According to the above description, we can summarize the following algorithm.

*Algorithm 1 (LS-SMRM).*

*(1) Training Process*

*Input.* Training set (8).

*Output.* Left and right projection matrices $\mathbf{U}$, $\mathbf{V}$ and bias $b$.

(a) Initialize $\mathbf{V}$.
(b) Formulate $\mathbf{x}_i$, $i = 1, \ldots, m$, by (13).
(c) Alternately update $\mathbf{U}$ and $\mathbf{V}$ until convergence:
(1) Update $\mathbf{U}$ and $b$: (1.1) get $b$ and $\boldsymbol{\alpha}$ by solving linear system (21); (1.2) get $\mathbf{u}$ from (18); (1.3) get $\mathbf{U}$ from (15); (1.4) get $\tilde{\mathbf{x}}_i$ from (22)-(23).
(2) Update $\mathbf{V}$ and $b$: (2.1) get $b$ and $\tilde{\boldsymbol{\alpha}}$ by solving linear system (31); (2.2) get $\mathbf{v}$ from (28); (2.3) get $\mathbf{V}$ from (25); (2.4) get $\mathbf{x}_i$ from (13).

*(2) Testing Process*

*Input.* Testing point $\mathbf{X}$, left and right projection matrices $\mathbf{U}$, $\mathbf{V}$, and bias $b$.

*Output.* The predicted value

$$f(\mathbf{X}) = \langle \mathbf{U}\mathbf{V}^{\top}, \mathbf{X} \rangle + b. \tag{34}$$
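Algorithm 1 can be sketched as follows (a minimal Python/NumPy illustration, not the authors' code; the helper names, toy data, and parameter values are our own, and each half-step regularizes the vectorized factor directly, which matches (16) and (26) exactly when the fixed factor is orthonormal):

```python
import numpy as np

def solve_ls_system(Phi, y, C):
    """Bordered LS-SVM system: [[0, 1^T], [1, Phi Phi^T + I/C]] [b; a] = [0; y]."""
    m = len(y)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Phi @ Phi.T + np.eye(m) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return Phi.T @ sol[1:], sol[0]       # primal weights w = Phi^T alpha, bias b

def ls_smrm_fit(Xs, y, r, C=1e4, n_iter=50, seed=0):
    """Alternating (fixed point) fit of W = U V^T for matrix inputs Xs[i]."""
    rng = np.random.default_rng(seed)
    n1, n2 = Xs[0].shape
    V = rng.standard_normal((n2, r))
    for _ in range(n_iter):
        # V fixed: features vec(X_i V), unknown vec(U)   (step (1) of Algorithm 1)
        Phi = np.stack([(X @ V).ravel() for X in Xs])
        u, b = solve_ls_system(Phi, y, C)
        U = u.reshape(n1, r)
        # U fixed: features vec(X_i^T U), unknown vec(V) (step (2) of Algorithm 1)
        Phi = np.stack([(X.T @ U).ravel() for X in Xs])
        v, b = solve_ls_system(Phi, y, C)
        V = v.reshape(n2, r)
    return U @ V.T, b

# noise-free toy problem with a rank-1 true weight matrix
rng = np.random.default_rng(1)
W_true = rng.standard_normal((4, 1)) @ rng.standard_normal((1, 5))
Xs = [rng.standard_normal((4, 5)) for _ in range(60)]
y = np.array([np.sum(W_true * X) for X in Xs])
W, b = ls_smrm_fit(Xs, y, r=1)
pred = np.array([np.sum(W * X) + b for X in Xs])
assert np.mean((pred - y) ** 2) < 1e-2 * np.var(y)
```

Each half-step is a single solve of a system of the form (21) or (31), which is what keeps the per-iteration cost low when the rank $r$ is small.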

#### 4. LS-STRM Based on Submatrix of the Tensor (LS-STRM-SMT)

In this section, we propose the least square support tensor regression machine, abbreviated as LS-STRM, for the regression problem with tensor input. In fact, with the introduction of the submatrix of the tensor, the LS-STRM problem is turned into LS-SMRM problems which are independent. However, the right projection matrices should be made equal to fit the practical situation. That is to say, we need to solve the LS-SMRM problems with the same right projection matrix. To show the effectiveness of the proposed algorithm, we provide some deeper analysis in this section.

##### 4.1. Formulation

Given a training set

$$T = \{(\mathcal{X}_1, y_1), (\mathcal{X}_2, y_2), \ldots, (\mathcal{X}_m, y_m)\}, \tag{35}$$

where $\mathcal{X}_i \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is the input and $y_i \in \mathbb{R}$ is the output, $i = 1, \ldots, m$, our task is to find a predictor

$$f(\mathcal{X}) = \langle \mathcal{W}, \mathcal{X} \rangle + b, \tag{36}$$

where $\mathcal{W} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ denotes the weight tensor and $b \in \mathbb{R}$ is the bias. For a new input tensor, we can predict its output through predictor (36). For convenience, $n_3$ denotes the number of frontal slices of each input tensor.

*Definition 2. *For an $N$-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_N}$, the submatrices of the tensor are defined as

$$\mathbf{A}_{i_3 i_4 \cdots i_N} = \mathcal{A}(:, :, i_3, i_4, \ldots, i_N) \in \mathbb{R}^{n_1 \times n_2}. \tag{37}$$

In particular, when $N = 3$, the submatrices of a third-order tensor are indeed its frontal slices.

However, we do not construct the model for training set (35) directly. We transform training set (35) into $n_3$ training sets similar to training set (8) by introducing the submatrix of the tensor and then construct $n_3$ regression problems. That is, for training set (35), every input tensor $\mathcal{X}_i$, $i = 1, \ldots, m$, can be regarded as an abstract vector

$$\mathcal{X}_i = \left(\mathbf{X}_i^{(1)}, \mathbf{X}_i^{(2)}, \ldots, \mathbf{X}_i^{(n_3)}\right), \tag{38}$$

where $\mathbf{X}_i^{(k)}$, $k = 1, \ldots, n_3$, is the $k$th frontal slice of the tensor $\mathcal{X}_i$. Next, we take the $k$th frontal slice of each tensor $\mathcal{X}_i$, $i = 1, \ldots, m$, out and construct the following training set:

$$T_k = \left\{\left(\mathbf{X}_1^{(k)}, y_1^{(k)}\right), \ldots, \left(\mathbf{X}_m^{(k)}, y_m^{(k)}\right)\right\}, \quad k = 1, \ldots, n_3, \tag{39}$$

where $y_i^{(k)}$ denotes the $k$th element of $\mathbf{y}_i$, a vector obeying a normal distribution with mean $y_i/n_3$ and variance $\sigma^2$.
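Constructing the submatrix training sets (39) amounts to regrouping the frontal slices across samples. A minimal sketch (Python/NumPy, toy sizes of our own; for illustration each slice is paired with the original output $y_i$ rather than the sampled per-slice labels $y_i^{(k)}$ of (39)):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n1, n2, n3 = 5, 3, 4, 2                 # toy sizes: m third-order tensor samples
tensors = rng.standard_normal((m, n1, n2, n3))
y = rng.standard_normal(m)                 # original outputs y_i

# The k-th submatrix training set T_k pairs the k-th frontal slice of every
# sample with a label (here simply y_i instead of the sampled y_i^(k)).
subsets = [[(tensors[i][:, :, k], y[i]) for i in range(m)] for k in range(n3)]

assert len(subsets) == n3                  # one training set per slice
assert subsets[0][0][0].shape == (n1, n2)  # each input is an n1 x n2 matrix
```

Each `subsets[k]` then plays the role of the matrix training set (8) for the $k$th LS-SMRM subproblem.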

According to training sets (39), $n_3$ optimization problems are constructed as follows:

$$\min_{\mathbf{W}_k, b_k, \boldsymbol{\xi}^{(k)}} \; \frac{1}{2}\|\mathbf{W}_k\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\left(\xi_i^{(k)}\right)^2 \quad \text{s.t.} \quad y_i^{(k)} = \langle \mathbf{W}_k, \mathbf{X}_i^{(k)} \rangle + b_k + \xi_i^{(k)}, \tag{40}$$

where $\mathbf{W}_k = \mathbf{U}_k\mathbf{V}_k^{\top}$, $k = 1, \ldots, n_3$. So we can get $\mathbf{W}_k$, $k = 1, \ldots, n_3$, and then the weight tensor can be obtained by

$$\mathcal{W}(:, :, k) = \mathbf{W}_k, \quad k = 1, \ldots, n_3. \tag{41}$$

It is clear that the $n_3$ models are independent, but this is contrary to the truth. In order to reflect the relation among them, we set

$$\mathbf{V}_1 = \mathbf{V}_2 = \cdots = \mathbf{V}_{n_3} = \mathbf{V} \tag{42}$$

so that the models fit the practical situation better. The models can be expressed together as follows:

$$\min_{\{\mathbf{U}_k\}, \mathbf{V}, \{b_k\}, \boldsymbol{\xi}} \; \sum_{k=1}^{n_3}\left(\frac{1}{2}\left\|\mathbf{U}_k\mathbf{V}^{\top}\right\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\left(\xi_i^{(k)}\right)^2\right) \quad \text{s.t.} \quad y_i^{(k)} = \langle \mathbf{U}_k\mathbf{V}^{\top}, \mathbf{X}_i^{(k)} \rangle + b_k + \xi_i^{(k)}, \tag{43}$$

where $i = 1, \ldots, m$ and $k = 1, \ldots, n_3$. That means what we need to solve is optimization problem (43).

##### 4.2. Dual Fixed Point Algorithm (DFPA) for LS-STRM-SMT

Fixing $\mathbf{V}$ and optimizing $\{\mathbf{U}_k, b_k\}$, optimization problem (43) can be reformulated as follows:

$$\min_{\mathbf{U}_k, b_k, \boldsymbol{\xi}^{(k)}} \; \frac{1}{2}\left\|\mathbf{U}_k\mathbf{V}^{\top}\right\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\left(\xi_i^{(k)}\right)^2 \quad \text{s.t.} \quad y_i^{(k)} = \langle \mathbf{U}_k\mathbf{V}^{\top}, \mathbf{X}_i^{(k)} \rangle + b_k + \xi_i^{(k)}, \tag{44}$$

where $k = 1, \ldots, n_3$.

That is to say, what we actually solve is a series of $n_3$ problems of form (44), which are indeed LS-SMRMs, rather than optimization problem (43) itself.

Fixing $\{\mathbf{U}_k\}$ and optimizing $\mathbf{V}$: similarly, when $\mathbf{U}_k$, $k = 1, \ldots, n_3$, are fixed, optimization problem (43) can be reformulated as follows:

$$\min_{\mathbf{V}, \{b_k\}, \boldsymbol{\xi}} \; \sum_{k=1}^{n_3}\left(\frac{1}{2}\left\|\mathbf{U}_k\mathbf{V}^{\top}\right\|_F^2 + \frac{C}{2}\sum_{i=1}^{m}\left(\xi_i^{(k)}\right)^2\right) \quad \text{s.t.} \quad y_i^{(k)} = \langle \mathbf{V}, \tilde{\mathbf{X}}_i^{(k)} \rangle + b_k + \xi_i^{(k)}, \tag{45}$$

where

$$\tilde{\mathbf{X}}_i^{(k)} = \left(\mathbf{X}_i^{(k)}\right)^{\top}\mathbf{U}_k, \quad i = 1, \ldots, m, \; k = 1, \ldots, n_3. \tag{46}$$

It is clear that $\mathbf{V}$ can be obtained by solving optimization problems (45)-(46) with Algorithm 1. This leads to Algorithm 3.

*Algorithm 3 (LS-STRM-SMT).*

*(1) Training Process*

*Input.* Training set (35).

*Output.* $\mathbf{U}_k$, $k = 1, \ldots, n_3$, $\mathbf{V}$, and $b_k$, $k = 1, \ldots, n_3$.

(a) Construct the $n_3$ new submatrix training sets (39).
(b) Give a positive integer $r$.
(c) Initialize $\mathbf{V}$ and $b_k$, $k = 1, \ldots, n_3$.
(d) Alternately update $\{\mathbf{U}_k\}$ and $\mathbf{V}$ until convergence:
(1) update $\mathbf{U}_k$, $k = 1, \ldots, n_3$, by solving optimization problems (44) with Algorithm 1;
(2) update $\mathbf{V}$ by solving optimization problems (45)-(46) with Algorithm 1.

*(2) Testing Process*

*Input.* Testing point $\mathcal{X}$, $\{\mathbf{U}_k\}$, $\mathbf{V}$, and $\{b_k\}$.

*Output.* The predicted value of the testing point:

$$f(\mathcal{X}) = \sum_{k=1}^{n_3}\left(\langle \mathbf{U}_k\mathbf{V}^{\top}, \mathbf{X}^{(k)} \rangle + b_k\right). \tag{47}$$
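A simplified rendering of the DFPA idea is sketched below (Python/NumPy; our own hypothetical helpers and toy data, not the authors' code). It deviates from Algorithm 3 in two labeled ways: a single shared bias replaces the per-slice biases $b_k$, and each alternation step is one joint LS-SVM solve rather than $n_3$ separate calls to Algorithm 1; what it keeps is the key structural constraint of one right projection matrix $\mathbf{V}$ shared by all slice factors $\mathbf{U}_k$:

```python
import numpy as np

def solve_ls_system(Phi, y, C):
    """Bordered LS-SVM system: [[0, 1^T], [1, Phi Phi^T + I/C]] [b; a] = [0; y]."""
    m = len(y)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Phi @ Phi.T + np.eye(m) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return Phi.T @ sol[1:], sol[0]        # primal weights, bias

def dfpa_fit(Ts, y, r=1, C=1e4, n_iter=50, seed=0):
    """DFPA sketch: slice factors U_k with one shared right projection V."""
    m, n1, n2, n3 = Ts.shape
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((n2, r))
    for _ in range(n_iter):
        # V fixed: solve for all U_k at once via stacked per-slice features
        Phi = np.stack([np.concatenate([(Ts[i, :, :, k] @ V).ravel()
                                        for k in range(n3)]) for i in range(m)])
        u, b = solve_ls_system(Phi, y, C)
        Us = [u[k * n1 * r:(k + 1) * n1 * r].reshape(n1, r) for k in range(n3)]
        # U_k fixed: solve for the shared right projection V
        Phi = np.stack([sum(Ts[i, :, :, k].T @ Us[k] for k in range(n3)).ravel()
                        for i in range(m)])
        v, b = solve_ls_system(Phi, y, C)
        V = v.reshape(n2, r)
    W = np.stack([Us[k] @ V.T for k in range(n3)], axis=2)   # weight tensor (41)
    return W, b

# noise-free toy data whose true weight tensor shares one right factor V
rng = np.random.default_rng(1)
m, n1, n2, n3 = 80, 3, 4, 2
V_true = rng.standard_normal((n2, 1))
W_true = np.stack([rng.standard_normal((n1, 1)) @ V_true.T for _ in range(n3)], axis=2)
Ts = rng.standard_normal((m, n1, n2, n3))
y = np.array([np.sum(W_true * Ts[i]) for i in range(m)])
W, b = dfpa_fit(Ts, y)
pred = np.array([np.sum(W * Ts[i]) + b for i in range(m)])
assert np.mean((pred - y) ** 2) < 1e-2 * np.var(y)
```

The shared-$\mathbf{V}$ coupling is what distinguishes this from fitting $n_3$ independent LS-SMRMs: the $\mathbf{V}$-step pools information from every slice.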

##### 4.3. Extension

For a more general tensor, that is, an $N$-order tensor with $N > 3$, we can also take advantage of the submatrix of the tensor to turn the tensor learning problem for regression into LS-SMRM problems that can be solved by Algorithm 1. The details are as follows.

Given a training set

$$T = \{(\mathcal{X}_1, y_1), \ldots, (\mathcal{X}_m, y_m)\}, \tag{48}$$

where $\mathcal{X}_i \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_N}$ is the input and $y_i \in \mathbb{R}$ is the output, $i = 1, \ldots, m$, our task is to find a predictor

$$f(\mathcal{X}) = \langle \mathcal{W}, \mathcal{X} \rangle + b, \tag{50}$$

where $\mathcal{W} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_N}$ denotes the weight tensor and $b$ is the bias. For a new input tensor, we can predict its output through predictor (50). The new training sets are constructed through (1) as follows:

$$T_k = \left\{\left(\mathbf{X}_1^{(k)}, y_1^{(k)}\right), \ldots, \left(\mathbf{X}_m^{(k)}, y_m^{(k)}\right)\right\}, \quad k = 1, \ldots, n_3 n_4 \cdots n_N, \tag{51}$$

where $\mathbf{X}_i^{(k)}$ denotes the $k$th submatrix of the $i$th sample and $y_i^{(k)}$ denotes the $k$th element of $\mathbf{y}_i$, an $(n_3 n_4 \cdots n_N)$-dimensional vector obeying a normal distribution as in (39). Then we can get an $(n_3 n_4 \cdots n_N)$-dimensional abstract vector whose elements are matrices of size $n_1 \times n_2$. According to mapping function (2), the resulting weight matrices can be folded back into the same number of submatrices of an $N$-order tensor, and the abstract vector with these submatrix elements is indeed the $N$-order weight tensor that we need.
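For the higher-order case, extracting the submatrices of Definition 2 is just a reshape that collapses all modes beyond the first two, and the inverse reshape plays the role of mapping function (2). A small sketch (Python/NumPy, with toy sizes of our own):

```python
import numpy as np

T = np.arange(120).reshape(2, 3, 4, 5)   # a fourth-order tensor (N = 4)
n1, n2 = T.shape[0], T.shape[1]

# Collapsing the trailing modes yields the n3*n4 = 20 submatrices of Definition 2
subs = T.reshape(n1, n2, -1)
assert subs.shape == (2, 3, 20)
assert np.array_equal(subs[:, :, 7], T[:, :, 1, 2])   # k = i3*n4 + i4 in C order

# The inverse reshape recovers the original tensor (cf. mapping function (2))
assert np.array_equal(subs.reshape(T.shape), T)
```

The same two reshapes let the learned weight submatrices be reassembled into the $N$-order weight tensor.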

#### 5. Numerical Experiment

In the following numerical experiments, we use four groups of vector data from the UCI database and an artificial tensor data set to evaluate our algorithm. The UCI data sets are Slump, Ticdata2000, ConcreteData, and BlogData. The artificial data are generated by the function "rand" in Matlab. We reformulate the vector data into matrices or tensors by rearranging the order of each vector's elements. The detailed statistical characteristics are listed in Table 1.
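The reformulation of a vector sample into a matrix or tensor sample is a plain reshape; a minimal sketch (the 12-dimensional size and the target shapes are hypothetical, not those of the actual UCI data sets):

```python
import numpy as np

x = np.arange(12, dtype=float)   # one 12-dimensional vector sample (hypothetical size)
X_mat = x.reshape(3, 4)          # matrix sample for LS-SMRM
X_ten = x.reshape(2, 3, 2)       # third-order tensor sample for LS-STRM-SMT
assert X_mat.size == X_ten.size == x.size   # nothing is lost, only rearranged
```

Note that the chosen factorization of the vector length determines the tensor shape, and different choices impose different structure on the data.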