## Table of Contents

## Getting more from less

When you take a photo with your camera, the raw image data may take up tens of megabytes, yet once compressed, it can shrink to one-tenth of the size without any perceptible degradation of the image. In other words, the 50 MB of image *data* only contains 5 MB of *information*. This begs the question: can we directly measure the 5 MB of information instead? It turns out that you *can* do this using what is known as compressive (or compressed) sensing. These methods enable the reconstruction of a complete data set from measurements of a limited subset of the data.

In a mathematical sense, compressive sensing seeks to solve an underdetermined linear system where there are fewer equations (e.g., measured data points) than unknowns (e.g., all desired data points) using a minimum of prior knowledge regarding the data.
Compressive sensing combines concepts for data compression and statistical sparsity to achieve this. For example, one can compress an image (represented by a matrix) by applying to it a transform which concentrates most of the image intensity into a few elements and then throwing out the remaining elements that have zero (or nearly zero) intensity. The product of a such a transform, where most elements are zero, is called a **sparse matrix**. A transform that generates a sparse matrix for a particular data set is therefore called a **sparsifying transform**. The key is that we know which transform(s) will sparsify a specific type of data. In the case of images, the same set of transforms (discrete cosine transform, wavelet transform, etc) tends to work well for all types of images. To use this strategy for compressive sensing, we must first know in advance which transform(s) will sparsify our data.

## The compressive sensing method

The general procedure for using compressive sensing goes as follows. First, we measure a subset (specified by observation matrix $\mathbf A$) of the complete data set. We then apply the inverse of the sparsifying transform to $\mathbf A$. Next, we use linear algebra methods to solve for the complete data set using the measured data points. This yields the data set, but in a transformed state due to the sparsifying transform applied to $\mathbf A$. Finally, we apply the sparsifying transform to recover the untransformed data.

Mathematically, the goal is to minimize the L_{1}-norm of $\mathbf T \mathbf x$ subject to the constraint
\begin{equation}\label{eq:Axy}
\mathbf{A} \mathbf x = \mathbf y
\end{equation}
where $\mathbf A$ is the observation (or sampling) matrix, $\mathbf x$ is the vector of unknowns, $\mathbf T$ is a linear transform that sparsifies $\mathbf x$, and $\mathbf y$ is the vector of measurements.
Now, our problem can be restated as
\begin{equation}\label{eq:ATTxy}
\mathbf A \mathbf{T^{-1}} \mathbf T \mathbf x = \mathbf y
\end{equation}
which is equivalent to equation \ref{eq:Axy} since $\mathbf{T^{-1}} \mathbf T = \mathbf I$.
If we define $\mathbf B \equiv \mathbf A \mathbf{T^{-1}}$ and $\mathbf z \equiv \mathbf T \mathbf x$ and substitute into equation \ref{eq:ATTxy}, we get
\begin{equation}
\mathbf{B} \mathbf z = \mathbf y
\end{equation}
which can be solved for $\mathbf z$ using constrained L_{1} minimization (i.e., basis pursuit). Finally, we apply the sparsifying transform to yield the unknown vector:
\begin{equation}
\mathbf x = \mathbf{T^{-1}} \mathbf z
\end{equation}