Abstract
Compressive sensing has emerged as an area that opens new perspectives in signal acquisition and processing. It appears as an alternative to the traditional sampling theory, endeavoring to reduce the required number of samples for successful signal reconstruction. In practice, compressive sensing aims to provide saving in sensing resources, transmission, and storage capacities and to facilitate signal processing in the circumstances when certain data are unavailable. To that end, compressive sensing relies on the mathematical algorithms solving the problem of data reconstruction from a greatly reduced number of measurements by exploring the properties of sparsity and incoherence. Therefore, this concept includes the optimization procedures aiming to provide the sparsest solution in a suitable representation domain. This work, therefore, offers a survey of the compressive sensing idea and prerequisites, together with the commonly used reconstruction methods. Moreover, the compressive sensing problem formulation is considered in signal processing applications assuming some of the commonly used transformation domains, namely, the Fourier transform domain, the polynomial Fourier transform domain, Hermite transform domain, and combined timefrequency domain.
1. Introduction
The fundamental approach for signal reconstruction from its measurements is defined by the ShannonNyquist sampling theorem stating that the sampling rate needs to be at least twice the maximal signal frequency. In the discrete case, the number of measurements should be at least equal to the signal length in order to be exactly reconstructed. However, this approach may require large storage space, significant sensing time, heavy power consumption, and large number of sensors. Compressive sensing (CS) is a novel theory that goes beyond the traditional approach [1–4]. It shows that a sparse signal can be reconstructed from much fewer incoherent measurements. The basic assumption in CS approach is that most of the signals in real applications have a concise representation in a certain transform domain where only few of them are significant, while the rest are zero or negligible [5–7]. This requirement is defined as signal sparsity. Another important requirement is the incoherent nature of measurements (observations) in the signal acquisition domain. Therefore, the main objective of CS is to provide an estimate of the original signal from a small number of linear incoherent measurements by exploiting the sparsity property [3, 4].
The CS theory covers not only the signal acquisition strategy, but also the signal reconstruction possibilities and different algorithms [8–17]. Several approaches for CS signal reconstruction have been developed and most of them belong to one of three main approaches: convex optimizations [8–11] such as basis pursuit, Dantzig selector, and gradientbased algorithms; greedy algorithms like matching pursuit [14] and orthogonal matching pursuit [15]; and hybrid methods such as compressive sampling matching pursuit [16] and stagewise OMP [17]. When comparing these algorithms, convex programming provides the best reconstruction accuracy, but at the cost of high computational complexity. The greedy algorithms bring about low computation complexity, while the hybrid methods try to provide a compromise between these two requirements [18].
The proposed work provides a survey of the general compressive sensing concept supplemented with the several existing approaches and methods for signal reconstruction, which are briefly explained and summarized in the form of algorithms with the aim of providing the readers with an easier and practical insight into the state of the art in this field. Apart from the standard CS algorithms, a few recent solutions have been included as well. Furthermore, the paper provides an overview of different sparsity domains and the possibilities of employing them in the CS problem formulation. Additional contribution is provided through the examples showing the efficiency of the presented methods in practical applications.
The paper is organized as follows. In Section 2, a brief review of the general compressive sensing idea is provided together with the conditions for successful signal reconstruction from reduced set of measurements and the signal recovery formulations using minimization approaches. In Section 3, the commonly used CS algorithms are reviewed. The commonly used domains for CS strategy implementation are given in Section 4, while some of the examples in real applications are provided in Section 5. The concluding remarks are given in Section 6.
2. Compressive Sensing: A General Overview
2.1. Sparsity and Compressibility
Reducing the sampling rate using CS is possible for the case of sparse signals that can be represented by a small number of significant coefficients in an appropriate transform basis. A signal having K nonzero coefficients is called sparse. Assume that signal exhibits sparsity in certain orthonormal basis defined by the basis vectors . The signal can be represented using its sparse transform domain vector as follows:In matrix notation, the previous relation can be written asCommonly, the sparsity is measured using the norm, which represents the cardinality of the support of : In real applications, the signals are usually not strictly sparse but only approximately sparse. Therefore, instead of being sparse, these signals are often called compressible, meaning that the amplitudes of coefficients decrease rapidly when arranged in descending order. For instance, if we consider coefficients , then the magnitude decays with a power law if there exist constants and satisfying [19]where larger means faster decay and consequently more compressible signal. The signal compressibility can be quantified using the minimal error between the original and sparsified signal (obtained by keeping only largest coefficients):
2.2. Conditions on the CS Matrix: Null Space Property, Restricted Isometry Property, and Incoherence
Instead of acquiring a full set of signal samples of length , in CS scenario, we deal with a quite reduced set of measurements of length , where . The measurement procedure can be modeled by projections of the signal onto vectors constituting the measurement matrix :Using the sparse transform domain representation of vector s given by (2), we havewhere will be referred to as CS matrix.
In order to define some requirements for the CS matrix , which are important for successful signal reconstruction, let us introduce the null space of matrix. The null space of CS matrix contains all vectors that are mapped to 0:In order to provide a unique solution, it is necessary to provide the notion that two Ksparse vectors and do not result in the same measurement vector. In other words, their difference should not be part of the null space of CS matrix :Since the difference between two Ksparse vectors is at most 2Ksparse, then a Ksparse vector is uniquely defined if null space of contains no 2Ksparse vectors. This corresponds to the condition that any 2K columns of are linearly independent; that is, and since , we obtain a lower bound on the number of measurements:In the case of strictly sparse signals, the spark can provide reliable information about the exact reconstruction possibility. However, in the case of approximately sparse signals, this condition is not sufficient and does not guarantee stable recovery. Hence, there is another property called null space property that measures the concentration of the null space of matrix . The null space property is satisfied if there is a constant such that [19]for all sets with cardinality K and their complements . If the null space property is satisfied, then a strictly sparse signal can be perfectly reconstructed by using minimization. For approximately sparse signals, an upper bound of the minimization error can be defined as follows [20]:where is defined in (5) for as the minimal error induced by the best sparse approximation.
The null space property is necessary and sufficient for establishing guarantees for recovery. A stronger condition is required in the presence of noise (and approximately sparse signals). Hence, in [8], the restricted isometry property (RIP) of CS matrix has been introduced. The CS matrix satisfies the RIP property with constant iffor every sparse vector . This property shows how well the distances are preserved by a certain linear transformation. We might now say that if the RIP is satisfied for 2K with , then there are no two Ksparse vectors that can correspond to the same measurement vector .
Finally, the incoherence condition mentioned before, which is also related to the RIP of matrix , refers to the incoherence of the projection basis and the sparsifying basis . The mutual coherence can be simply defined by using the combined CS matrix as follows [21]:The mutual coherence is related to the restricted isometry constant using the following bound [22]:
2.3. Signal Recovery Using Minimization Approach
The signal recovery problem is defined as the reconstruction of vector from the measurements . This problem can be generally seen as a problem of solving an underdetermined set of linear equations. However, in the circumstances when is sparse, the problem can be reduced to the following minimization:The minimization requires an exhaustive search over all possible sparse combinations, which is computationally intractable. Hence, the minimization is replaced by convex minimization, which will provide the sparse result with high probability if the measurement matrix satisfies the previous conditions. The minimization problem is defined as follows:and it has been known as the basis pursuit.
In the situation when the measurements are corrupted by the noise of level and , the reconstruction problem can be defined in a form: called basis pursuit denoising. The error bound for the solution of (19), where A satisfies the RIP of order with and , is given bywhere the constants and are defined as [19] For a particular regularization parameter , the minimization problem (19) can be defined using the unconstrained version as follows:which is known as the Lagrangian form of the basis pursuit denoising. These algorithms are commonly solved using primaldual interiorpoint methods [22].
Another form of basis pursuit denoising is solved using the least absolute shrinkage and selection operator (LASSO), and it is defined as follows: where is a nonnegative real parameter. The convex optimization methods usually require high computational complexity and high numerical precision.
When the noise is unbounded, one may apply the convex program based on Dantzig selector (it is assumed that the noise variance is per measurement, i.e., the total variance is ): which (for enough measurements) reconstructs a signal with the error bound: The norm is infinite norm called also supreme (maximum) norm.
Besides the norm minimization, there exist some approaches using the norm minimization, with :or using norm minimization, in which case the solution is not rigorously sparse enough [23]:
3. Review of Some Signal Reconstruction Algorithms
The minimization problems in CS signal reconstruction are usually solved using the convex optimization methods. In addition, there exist greedy methods for sparse signal recovery which allow faster computation compared to minimization. Greedy algorithms can be divided into two major groups: greedy pursuit methods and thresholdingbased methods. In practical applications, the widely used ones are the orthogonal matching pursuit (OMP) and compressive sampling matching pursuit (CoSaMP) from the group of greedy pursuit methods, while from the thresholding group the iterative hard thresholding (IHT) is commonly used due to its simplicity, although it may not be always efficient in providing an exact solution. Some of these algorithms are discussed in detail in this section.
3.1. Matching Pursuit
The matching pursuit algorithm has been known for its simplicity and was first introduced in [14]. This is the first algorithm from the class of iterative greedy methods that decomposes a signal into a linear set of basis functions. Through the iterations, this algorithm chooses in a greedy manner the basis functions that best match the signal. Also, in each iteration, the algorithm removes the signal component having the form of the selected basis function and obtains the residual. This procedure is repeated until the norm of the residual becomes lower than a certain predefined threshold value (halting criterion) (Algorithm 1).

The matching pursuit algorithm however has a slow convergence property and generally does not provide efficiently sparse results.
3.2. Orthogonal Matching Pursuit
The orthogonal matching pursuit (OMP) has been introduced [15] as an improved version to overcome the limitations of the matching pursuit algorithm. OMP is based on principle of orthogonalization. It computes the inner product of the residue and the measurement matrix and then selects the index of the maximum correlation column and extracts this column (in each iteration). The extracted columns are included into the selected set of atoms. Then, the OMP performs orthogonal projection over the subspace of previously selected atoms, providing a new approximation vector used to update the residual. Here, the residual is always orthogonal to the columns of the CS matrix, so there will be no columns selected twice and the set of selected columns is increased through the iterations. OMP provides better asymptotic convergence compared to the previous matching pursuit version (Algorithm 2).

3.3. CoSaMP
Compressive sampling matching pursuit (CoSaMP) is an extended version of the OMP algorithm [16].
In each iteration, the residual vector is correlated with the columns of the CS matrix , forming a “proxy signal” . Then, the algorithm selects 2 columns of corresponding to the 2 largest absolute values of , where defines the signal sparsity (number of nonzero components). Namely, all but the largest 2 elements of are set to zero and the 2 support is obtained by finding the positions of nonzero elements. The indices of the selected columns (2 in total) are then added to the current estimate of the support of the unknown vector. After solving the least squares, a 3sparse estimate of the unknown vector is obtained. Then, the 3sparse vector is pruned to obtain the sparse estimate of the unknown vector (by setting all but the largest elements to zero). Thus, the estimate of the unknown vector remains sparse and all columns that do not correspond to the true signal components are removed, which is an improvement over the OMP. Namely, if OMP selects in some iteration a column that does not correspond to the true signal component, the index of this column will remain in the final signal support and cannot be removed (Algorithm 3).

The CoSaMP can be observed through five crucial steps:(i)Identification (Line 5). It finds the largest 2s components of the signal proxy.(ii)Support Merge (Line 6). It merges the support of the signal proxy with the support of the solution from the previous iteration.(iii)Estimation (Line 7). It estimates a solution via least squares where the solution lies within a support T.(iv)Pruning (Line 8). It takes the solution estimate and compresses it to the required support.(v)Sample Update (Line 9). It updates the “sample,” meaning the residual as the part of signal that has not been approximated.
3.4. Iterative Hard Thresholding Algorithm
Another group of algorithms for signal reconstruction from a small set of measurements is based on iterative thresholding. Generally, these algorithms are composed of two main steps. The first one is the optimization of the least squares term, which is done by solving the optimization problem without minimization. The other one is the decreasing of the norm, which is done by applying the thresholding operator to the magnitude of entries in . In each iteration, the sparse vector is estimated by the previous version using negative gradient of the objective function defined as while the negative gradient is thenGenerally, the obtained estimate is sparse, and we need to set all but the K largest components to zero using the thresholding operator. Here, we distinguish two types of thresholding. Hard thresholding [24, 25] sets all but the K largest magnitude values to zero, where the thresholding operator can be written aswhere is the K largest component of x. The algorithm is summarized in Algorithm 4.

The stopping criterion for IHT can be a fixed number of iterations or the algorithm terminates when the sparse vector does not change much between consecutive iterations.
The softthresholding function can be defined asand it is applied to each element of .
3.5. Iterative Soft Thresholding for Solving LASSO
Let us consider the minimization of the function: as given by the LASSO minimization problem (22). One of the algorithms that has been used for solving (32) is the iterated softthresholding algorithm (ISTA), also called the thresholded Landweber algorithm. In order to provide an iterative procedure for solving the considered minimization problem, we use the minimizationmaximization [26] approach to minimize . Hence, we should create a function that is equal to at ; otherwise, it upperbounds . Minimizing a majorizer is easier and avoids solving a system of equations. Hence [27], and α is such that the added function is nonnegative, meaning that must be equal to or greater than the maximum eigenvalue of : In order to minimize , the gradient of should be zero:The solution isThe solution of the linear equationis obtained using the softthresholding approach:Hence, we may write the softthresholdingbased solution of (36) as follows:In terms of iteration , the previous relation can be rewritten as
3.6. Automated Threshold Based Iterative Solution
This algorithm (see [12, 13]) starts from the assumption that the missing samples, modeled by zero values in the signal domain, produce noise in the transform domain. Namely, the zero values at the positions of missing samples can be observed as a result of adding noise to these samples, where the noise at the unavailable positions has values of original signal samples with opposite sign. For instance, let us observe the signals that are sparse in the Fourier transform domain. The variance of the mentioned noise when observed in the discrete Fourier transform (DFT) domain can be calculated as where M is the number of available and N is the total number of samples and is a measurement vector. Consequently, using the variance of noise, it is possible to define a threshold to separate signal components from the nonsignal components in the DFT domain.
For a desired probability , the threshold is derived as The automated threshold based algorithm, in each iteration, detects a certain number of DFT components above the threshold. A set of positions corresponding to the detected components is selected and the contribution of these components is removed from the signal’s DFT. This will reveal the remaining components that are below the noise level. Further, it is necessary to update the noise variance and threshold value for the new iteration. Since the algorithm detects a set of components in each iteration, it usually needs just a couple of iterations to recover the entire signal. In the case that all signal components are above the noise level in DFT, the component detection and reconstruction are achieved in single iteration.
3.7. Adaptive GradientBased Algorithm
This algorithm belongs to the group of convex minimization approaches [11]. Unlike other convex minimization approaches, this method starts from some initial values of unavailable samples (initial state) which are changed through the iterations in a way to constantly improve the concentration in sparsity domain. In general, it does not require the signal to be strictly sparse in certain transform domain, which is an advantage over other methods. Particularly, the missing samples in the signal domain can be considered as zero values. In each iteration, the missing samples are changed for and for . Then, for both changes, the concentration is measured as norm of the transform domain vectors (for change) and (for change), while the gradient is determined using their difference (lines 8, 9, and 10). Finally, the gradient is used to update the values of missing samples. Here, it is important to note that each sample is observed separately in this algorithm and one iteration is finished when all samples are processed.
When the algorithm reaches the vicinity of the sparsity measure minimum, the gradient changes direction for almost 180 degrees (line 16), meaning that the step needs to be decreased (line 17).
The precision of the result in this iterative algorithm is estimated based on the change of the result in the last iteration (lines 18 and 19).
4. Exploring Different Transform Domains for CS Signal Reconstruction
4.1. CS in the Standard Fourier Transform Domain
As the simplest case of CS scenario, we will assume a multicomponent signal that consists of K sinusoids that are sparse in the DFT domain:where the sparsity level is , where is total signal length, while and denote amplitudes and frequencies of signal components, respectively. Since s is sparse in the DFT domain, then we can writewhere is a vector of DFT coefficients where at most K coefficients are nonzero, while is the inverse Fourier transform matrix of size .
If is a signal in compressive sensing application, then only the random measurements are available and these are defined by the set of M positions . Therefore, the measurement process can be modeled by matrix :The CS matrix represents a partial random inverse Fourier transform matrix obtained by omitting rows from that corresponds to unavailable samples positions:with the Fourier basis functions: The CS problem formulation is now given as
4.2. CS in the Polynomial Fourier Transform Domain
In this section, we consider the possibility of CS reconstruction of polynomial phase signals [28]. Observe the multicomponent polynomial phase signal vector s, with elements s(n): where the polynomial coefficients are assumed to be bounded integers. The assumption is that the signal can be considered as sparse in the polynomial Fourier transform (PFT) domain. Hence, the discrete PFT form is given by If we choose a set of parameters that match the polynomial phase coefficients then the th signal component is demodulated and becomes a sinusoid: which is dominant in the PFT. In other words, the spectrum is highly concentrated at . Therefore, we might say that if (51) is satisfied then the PFT is compressible with the dominant th component. Note that the sparsity (compressibility) in the PFT domain is observed with respect to the single demodulated component.
In order to define the CS problem in the PFT domain, instead of the signal itself, we will consider a modified signal form obtained after the multiplication with the exponential term:given in the PFT definition (50). The new signal form is obtained asor in the vector form:
Then, the PFT definition given by (50) can be rewritten in the vector form as follows:where is the discrete Fourier transform matrix ().
For a chosen set of parameters in that is equal to the set in , is characterized by one dominant sinusoidal component at the frequency .
4.2.1. CS Scenario
Now, assume that is incomplete in the compressive sensing sense, and instead of we are dealing with available measurements defined by vector , and where is the partial random Fourier matrix obtained by omitting rows from that correspond to the unavailable samples. When , then X can be observed as a demodulated version of the th signal component , having the dominant th component in the spectrum with the support . The rest of the components in spectrum are much lower than and could be observed as noise. Hence, we may write the minimization problem in the formwhere T is certain threshold. The dominant components can be detected using the threshold based algorithm (Algorithm 5) described in the previous section. Using an iterative procedure, one may change the values of parameters between and , . The algorithm will detect the support when the set matches the set ; otherwise, no component support is detected. Hence, as a result of this phase, we have identified the sets of signal phase parameters: .

4.2.2. Reconstruction of Exact Components Amplitudes
The next phase is reconstruction of components amplitudes. Denote the set of measurements positions by , such that , while the detected sets of signal phase parameters are .
In order to calculate the exact amplitudes of signal components, we observe the set of equations in the formor in other words we have another system of equations given by where contains the desired signal amplitudes. The matrix of size is based on the PFT: The rows of correspond to positions of measurements , and columns correspond to phase parameters , for . The solution of the observed problem can be obtained in the least squares sense as follows:
The resulting reconstructed signal is obtained as
4.3. CS in the Hermite Transform Domain
The Hermite expansion using Hermite functions can be written in terms of the Hermite transform matrix. First, let us define the Hermite transform matrix (of size ) [29, 30]:Here, the th order Hermite basis function is defined using the th order Hermite polynomial as follows: where is the scaling factor used to “stretch” or “compress” Hermite functions, in order to match the signal.
The Hermite functions are usually calculated using the following fast recursive realization:The Hermite transform coefficients can be calculated in the matrix form as follows: where the vector of Hermite transform coefficients is , while the vector of signal samples is .
Following the GaussHermite approximation, the inverse matrix contains Hermite functions and it is given byNow, in the context of CS, let us assume that a signal is Ksparse in the Hermite transform domain, meaning that it can be represented by a small set of nonzero Hermite coefficients c: where is the th element from the vector of Hermite expansion coefficients. In the matrix form we may write such that most of the coefficients in are zero values. Furthermore, assume that only the compressive sensing is done using random selection of M signal values . Then, the CS matrix can be defined as random partial inverse Hermite transform matrix: For a vector of available measurements vector , the linear system of equations (undetermined system of M linear equations and unknowns) can be written asSince has only nonzero components, the reconstruction problem can be defined asIf we identify the signal support in the Hermite transform domain by the set of indices , then the problem can be solved in the least squares sense aswhere denotes the conjugate transpose operation, whileand . The estimated signal can be now obtained according to (68).
4.4. CS in the TimeFrequency Domain
Starting from the definition of the shorttime Fourier transformwe can define the following STFT vector and signal vector: respectively. The STFT at an instant n can be now rewritten in the matrix form:where is the standard Fourier transform matrix of size with elementsIf we consider the nonoverlapping signal segments for the STFT calculation, then takes the values from the set , where is the window length while is the total signal length. The STFT vector, for , is given byConsequently, (77) can be extended to the entire STFT for all considered values of as follows [31, 32]:or in the compact formIn the case of timevarying windows with lengths , the previous system of equations becomesNote that the signal vector is and can be expressed using the point inverse Fourier transform as follows:where X is a sparse vector of DFT coefficients. By combining (81) and (83), we obtain [31]Furthermore, if we assume that STFT vector is not completely available but has only a small percent of available samples, then we are dealing with the CS problem in the formwhere is a vector of available measurements from STFT, while is the CS matrix obtained as partial combined transform matrix with rows corresponding to the available samples . In order to reconstruct the entire STFT, we can define the minimization problem in the formwhich can be solved using any of the algorithms provided in the previous section.
5. Experimental Evaluation
Example 1. Let us observe the QRS complex extracted from the real ECG signal. It has been known that these types of signals are sparse in the Hermite transform (HT) domain [33, 34]. As an illustration, we present the Hermite transform and discrete Fourier transform (DFT) in Figure 1, from which it can be observed that HT provides sparse representation while DFT is dense. The analyzed signal is obtained from the MITECG Compression Test Database [35], originally sampled uniformly with being the sampling period, with amplitude gain 1/400. Extracted QRS is of length , centered at the peak. Note that, in order to provide the sparsest HT representation, the scaling factor should be set to the optimal value [33, 34], which is 6.8038 in this example, that is, in seconds, proportional to the scaling factor 0.017 presented in [34] for signal with 27 samples, when it is scaled with the signal lengths ratio. Also, the QRS signal is resampled at the zeros of Hermite polynomial using sinc interpolation functions as described in [33].
The QRS signal is subject to compressed sensing approach, meaning that it is represented by 50% of available measurements. The available measurements and the corresponding HT (which is not sparse in the CS conditions) are shown in Figure 2.
In order to reconstruct the entire QRS signal from available measurements, two of the presented algorithms are employed, namely, the OMP and gradientbased reconstruction algorithm.
The reconstruction results are shown in Figure 3 in both time and transform domain. It can be observed that both algorithms provide quite successful reconstruction performance, which is certainly better in the case of gradientbased approach. In order to measure and compare the reconstruction performance of these two algorithms, the mean square error (MSE) is calculated, showing that the gradientbased algorithm in the considered case provides MSE = 13.85, much lower than MSE = 31.64 for the OMP algorithm.
(a)
(b)
(a)
(b)
(a)
(b)
(c)
(d)
Example 2. In this example, the reconstruction of images using adaptive gradientbased algorithm (Algorithm 6) is considered. The images