Manifold-Based Image Understanding

This method supports the entire image processing chain, from data acquisition and processing through archiving and data mining.

The rapid growth of sensing and imaging technology, combined with the need for near-real-time action based on the sensed data, has rendered automatic processing, understanding, and decision-making vital to our national security. A unified theory and a practical toolset were developed for the analysis and processing of signal and image manifolds for the purposes of signal and image understanding. The unifying theme is the multiscale geometric structure of signal and image families and manifolds. Specifically, theory and tools were developed for (1) model-based signal and image recognition and registration, (2) sensing and compressing data on manifolds, and (3) data-driven manifold modeling and learning.

  1. Model-based signal recognition and registration: Methods were determined for understanding and inferring signal and image information from prior models of potential targets. The result was the smashed filter, a new tool for compressive classification and recognition.
  2. Sensing and compressing data on manifolds: Efficient sampling and measurement schemes were developed for manifold-modeled data. The result was a proof that random projections suffice to compressively capture signals on a manifold, together with the application of this result to the theory of compressive sensing.
  3. Data-driven manifold modeling and learning: To bridge the gap to practice, a new theory and algorithms were developed for learning manifold models for signal and image families. Applications of these ideas include adaptation to novel targets, signal and image database mining, and compression.

The Single-Pixel Compressive Imaging Camera. The incident lightfield (corresponding to the desired image x) is reflected off a digital micromirror device (DMD) array whose mirror orientations are modulated in the pseudorandom pattern Φ_m supplied by a random number generator. Each different mirror pattern produces a voltage at the single photodiode that corresponds to one measurement y_m.

Compressive sensing (CS) enables the reconstruction of a sparse or compressible image or signal from a small set of linear, non-adaptive (even random) projections. However, in many applications, including object and target recognition, one is interested in making a decision about an image rather than computing a reconstruction. A framework was proposed for compressive classification that operates directly on the compressive measurements without first reconstructing the image. The resulting dimensionally reduced matched filter is called the smashed filter. The first part of the theory maps traditional maximum likelihood hypothesis testing into the compressive domain; it was found that the number of measurements required for a given classification performance level does not depend on the sparsity or compressibility of the images but only on the noise level.
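
The compressive-domain hypothesis test described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes equally likely classes, white Gaussian measurement noise, and known target templates, in which case the maximum likelihood decision reduces to picking the template whose projection is closest to the measurement vector. The function name `smashed_filter`, the random templates, the ±1 patterns, and all sizes are assumptions made for the example.

```python
import numpy as np

def smashed_filter(y, Phi, templates):
    """Return the index of the template whose projection Phi @ s is closest to y.

    Under equal priors and white Gaussian noise this nearest-projection rule
    is the maximum likelihood decision in the compressive domain.
    """
    projected = templates @ Phi.T                  # shape (num_classes, M)
    dists = np.linalg.norm(projected - y, axis=1)  # distance to each hypothesis
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
N, M, num_classes = 1024, 32, 3                    # pixels, measurements, classes (illustrative)

# Hypothetical known target images, one row per class.
templates = rng.standard_normal((num_classes, N))

# Pseudorandom +/-1 measurement patterns, one row per measurement.
Phi = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(M)

true_class = 2
x = templates[true_class]
y = Phi @ x + 0.1 * rng.standard_normal(M)         # noisy compressive measurements

predicted = smashed_filter(y, Phi, templates)
print(predicted)
```

Note that the image x is never reconstructed: the decision uses only the M measurements and the projected templates.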

The second part of the theory applies the generalized maximum likelihood method to deal with unknown transformations such as the translation, scale, or viewing angle of a target object. It exploits the fact that the set of transformed images forms a low-dimensional, nonlinear manifold in the high-dimensional image space. The number of measurements required for a given classification performance level grows linearly in the dimensionality of the manifold, but only logarithmically in the number of pixels/samples and image classes. Simulations and measurements from a new, single-pixel compressive camera demonstrated the effectiveness of the smashed filter for target classification using very few measurements.
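
The generalized maximum likelihood step can be illustrated with a one-dimensional toy problem in which the unknown transformation is a circular shift. The sketch below is an assumption-laden example, not the method as implemented by the authors: the two templates (a Gaussian pulse and a box), the Gaussian measurement matrix, the noise level, and the name `smashed_filter_shift` are all hypothetical. For each class it searches exhaustively over shifts for the nearest point on the projected manifold, then picks the class with the smallest residual.

```python
import numpy as np

def smashed_filter_shift(y, Phi, templates):
    """Jointly estimate class and circular shift by minimizing
    ||y - Phi @ roll(s, t)|| over all templates s and shifts t."""
    n = templates.shape[1]
    best_dist, best_class, best_shift = np.inf, -1, -1
    for c, s in enumerate(templates):
        for t in range(n):
            dist = np.linalg.norm(y - Phi @ np.roll(s, t))
            if dist < best_dist:
                best_dist, best_class, best_shift = dist, c, t
    return best_class, best_shift

rng = np.random.default_rng(1)
N, M = 128, 20                                     # samples and measurements (illustrative)
k = np.arange(N)

# Two hypothetical target shapes; shifting each one traces out a 1-D manifold.
gauss = np.exp(-0.5 * ((k - 20) / 4.0) ** 2)
box = ((k >= 12) & (k < 28)).astype(float)
templates = np.stack([gauss, box])

Phi = rng.standard_normal((M, N)) / np.sqrt(M)     # random projection matrix

true_class, true_shift = 0, 37
x = np.roll(templates[true_class], true_shift)
y = Phi @ x + 0.01 * rng.standard_normal(M)        # noisy compressive measurements

est_class, est_shift = smashed_filter_shift(y, Phi, templates)
print(est_class, est_shift)
```

The shift manifold here has dimension one, so a small number of measurements relative to N is enough in this toy setting.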

In addition to the computational and storage savings afforded by compressive classification, the proposed method shares many advantages previously shown for CS reconstruction. In particular, random projections enable universal estimation and classification in the sense that random projections preserve the structure of any low-dimensional signal class with high probability. In this context, it means that one need not know what the classes are or what the classification algorithm will be prior to acquiring the measurements. Additionally, compressive measurements are progressive in the sense that larger numbers of projections translate into higher classification rates due to increased noise tolerance, and democratic in that each measurement can be given equal priority because classification rates depend only on how many measurements are received, not on the particular subset received.

It was demonstrated that, thanks to the pronounced structure present in many signal classes, small numbers of non-adaptive compressive measurements can suffice to capture the relevant information required for accurate classification. Simple parametric models impose a low-dimensional manifold structure on the signal classes within the high-dimensional image space, and the geometric structure of these manifolds is preserved under their projection to a random, lower-dimensional subspace.
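
The claim that manifold geometry survives random projection can be checked numerically on a toy example. The sketch below, which is only an illustration under assumed parameters, samples points from the manifold of circular shifts of a Gaussian pulse, projects them with a random Gaussian matrix, and compares pairwise distances before and after projection. The pulse shape, the sizes N and M, and the helper `pairwise_distances` are assumptions made for the example.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    d = np.linalg.norm(diffs, axis=2)
    i, j = np.triu_indices(X.shape[0], k=1)
    return d[i, j]

rng = np.random.default_rng(3)
N, M = 256, 64                                     # ambient and projected dimensions (illustrative)
idx = np.arange(N)
pulse = np.exp(-0.5 * ((idx - N // 2) / 4.0) ** 2)

# 64 samples from the 1-D manifold of circular shifts of the pulse.
points = np.stack([np.roll(pulse, t) for t in range(0, N, 4)])

Phi = rng.standard_normal((M, N)) / np.sqrt(M)     # random projection to M dimensions
projected = points @ Phi.T

# Ratios near 1 mean the projection approximately preserves distances.
ratios = pairwise_distances(projected) / pairwise_distances(points)
print(ratios.min(), ratios.max())
```

Increasing M tightens the spread of the ratios around 1, which is one way to see why more measurements give more noise tolerance.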

The smashed filter, which is based on the generalized maximum likelihood classifier (GMLC), is readily implementable with CS hardware such as the single-pixel compressive imaging camera. It shares many of the attractive features of CS in general, including simplicity, universality, robustness, democracy, and scalability, which should enable it to impact a variety of applications. More sophisticated algorithms could exploit the manifold structure to obtain the ML estimates required by the smashed filter more efficiently. For example, rather than an exhaustive nearest-neighbor search, which could be computationally prohibitive for a large training set, a greedy approach might offer similar performance at significant computational savings; other approaches that exploit the smoothness of the manifolds could also be beneficial.
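
One way to picture the kind of saving suggested above is a coarse-to-fine search over the transformation parameter. The sketch below is a hypothetical heuristic, not an algorithm from this work: it assumes a single smooth template under circular shifts, evaluates the projected distance on a coarse grid, and then refines only around the best coarse candidate. The function names, grid step, and sizes are assumptions, and the heuristic can fail when the manifold is not smooth at the scale of the coarse grid.

```python
import numpy as np

def projected_distance(y, Phi, template, shift):
    """Distance between y and the projection of the shifted template."""
    return np.linalg.norm(y - Phi @ np.roll(template, shift))

def coarse_to_fine_shift(y, Phi, template, step):
    """Estimate the shift with a coarse grid search followed by local refinement."""
    n = template.size
    coarse = range(0, n, step)
    t0 = min(coarse, key=lambda t: projected_distance(y, Phi, template, t))
    fine = [(t0 + d) % n for d in range(-step, step + 1)]
    return min(fine, key=lambda t: projected_distance(y, Phi, template, t))

rng = np.random.default_rng(2)
N, M, step = 128, 32, 4                            # illustrative sizes and grid step
k = np.arange(N)
template = np.exp(-0.5 * ((k - 20) / 4.0) ** 2)    # smooth hypothetical target
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

true_shift = 37
y = Phi @ np.roll(template, true_shift) + 0.01 * rng.standard_normal(M)

est_shift = coarse_to_fine_shift(y, Phi, template, step)

# Distance evaluations: 32 coarse + 9 fine = 41, versus N = 128 for exhaustive search.
evaluations = len(range(0, N, step)) + 2 * step + 1
print(est_shift, evaluations)
```

The coarse step is chosen no larger than the pulse width so that at least one grid point lands close to the true shift.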

This work was done by Richard G. Baraniuk of Rice University for the Office of Naval Research. For more information, download the Technical Support Package (free white paper) at www.defensetechbriefs.com/tsp under the Physical Sciences category. ONR-0020


