A research effort now underway addresses fundamental mathematical issues involved in a methodology of creating flexible machine vision systems that would be able to modify their behaviors and evolve in particular environments so as to recognize anything that human operators have designated as being "interesting" in those environments. It is intended that a person who is not a programmer could train such a machine vision system by drawing lines around objects in a scene (see figure) or otherwise indicating example objects and that thereafter, the system would adapt and evolve the ability to recognize such objects automatically.
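As a rough illustration of the "drawing lines around objects" step, the sketch below turns an operator-drawn outline into labeled training pixels. It assumes the outline arrives as a closed polygon of (x, y) vertices in pixel coordinates and uses a standard ray-casting point-in-polygon test. The function names `point_in_polygon` and `label_pixels` and the label string are hypothetical and are not taken from the research described here.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count crossings of a ray running from (x, y) to +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_pixels(width, height, polygon, label):
    """Map each pixel whose centre lies inside the outline to the given label."""
    return {
        (col, row): label
        for row in range(height)
        for col in range(width)
        if point_in_polygon(col + 0.5, row + 0.5, polygon)
    }


# Hypothetical outline drawn by an operator around one object in a 6 x 5 image.
outline = [(1, 1), (4, 1), (4, 3), (1, 3)]
examples = label_pixels(6, 5, outline, "interesting")
print(len(examples), "pixels labeled")
```

The labeled pixels would be only the starting point for training. How the system generalizes from them is the subject of the research and is not shown here.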
Prior approaches to machine vision include algorithmic knowledge-based programming, neural-network systems that learn from examples, and combinations of both. Many practical systems are based on algorithms or procedures that make opportunistic use of special features characteristic of the particular objects and scenes to be recognized. Although these prior approaches and systems may perform acceptably in specific applications, they usually have little or no ability to generalize to other, similar applications and no ability to generalize to different environments. For example, a machine vision system designed to inspect industrial parts is unlikely to be useful for reading car license plates.
What is sought in this research is a machine vision architecture that supports "point-and-learn" training, works for cluttered scenes, enables adaptation to changes in objects and scenes, and enables adaptation to any scene or environment. In broad terms, it has been proposed, as a basis of this research, that this be a multilevel architecture in which:
- Machine vision systems would evolve appropriate retinal configurations, evolve connectivities to represent spatial relationships, and abstract information to generate their own higher-level constructs; and
- Levels would be integrated by new relational mathematics as summarized below.
The key feature of the architecture is the ability of a machine vision system to abstract its own constructs from data in a multilevel algebraic representation. This feature enables the system to learn objects that may change through time and to generalize and adapt to learn radically new objects and scenes without the need to change the underlying computer program. The computational demands imposed by the requirement to learn and adapt in this way are beyond the capabilities of any current machine vision system.
In this research, the approach followed in attempting to satisfy the requirement to learn and adapt is based on the mathematics of multilevel hypernetworks, in which network theory is generalized to multilevel, multidimensional space. Hypernetworks naturally give rise to multilevel systems and provide the essential structural architecture for self-adapting machine vision systems. The fundamental architecture proposed is based on hypernetworks that are inherently relational but can integrate objects with geometrical properties. The hypernetwork representation accommodates relational and numerical data, and supports geometrical operations implemented at high levels of abstraction.
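One way to picture a multilevel relational structure of this kind is sketched below. It assumes that each higher-level construct is a named n-ary relation over an ordered tuple of lower-level elements, and that numerical data are attached to the lowest-level elements as attributes. The class names `Hyperedge` and `Hypernetwork`, the relation names, and the example objects are all hypothetical. The sketch illustrates the general idea and is not the representation developed in this research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """A named n-ary relation over elements from lower levels (hypothetical)."""
    name: str
    relation: str
    members: tuple
    level: int


class Hypernetwork:
    def __init__(self):
        self.levels = {}      # element name -> level in the hierarchy
        self.edges = {}       # construct name -> Hyperedge
        self.attributes = {}  # primitive name -> numerical data

    def add_primitive(self, name, **attrs):
        """Register a level-0 element together with its numerical attributes."""
        self.levels[name] = 0
        self.attributes[name] = dict(attrs)

    def relate(self, name, relation, members):
        """Bind existing elements into a new construct one level above them."""
        if not members:
            raise ValueError("a relation needs at least one member")
        missing = [m for m in members if m not in self.levels]
        if missing:
            raise KeyError(f"unknown members: {missing}")
        level = 1 + max(self.levels[m] for m in members)
        edge = Hyperedge(name, relation, tuple(members), level)
        self.levels[name] = level
        self.edges[name] = edge
        return edge

    def primitives_of(self, name):
        """Expand a construct recursively down to its level-0 elements."""
        if name not in self.edges:
            return [name]
        found = []
        for member in self.edges[name].members:
            found.extend(self.primitives_of(member))
        return found


net = Hypernetwork()
for i, length in enumerate([4.0, 2.0, 4.0, 2.0]):
    net.add_primitive(f"edge{i}", length=length)
net.add_primitive("handle", length=0.5)
net.relate("panel", "closed_loop", ["edge0", "edge1", "edge2", "edge3"])
net.relate("door", "attached_to", ["panel", "handle"])
print(net.levels["door"], net.primitives_of("door"))
```

In this sketch a construct's level is derived from its members rather than assigned by hand, so new higher-level constructs can be added without restructuring the lower levels.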
Thus far, the research has led to the identification of both neuronal and algorithmic data processing as being appropriate to an automatic machine vision system having the required capabilities, and to the determination that both types of processing can be integrated coherently. A machine vision system based on the proposed architecture would operate initially in bottom-up fashion, processing from pixels to features. Intermediate- and high-level operators would then analyze the geometric and topological properties of the emergent features and would play a role in a dynamic bottom-up/top-down process.
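The interplay of bottom-up and top-down processing can be illustrated with a deliberately small example. The sketch below assumes a grayscale image stored as a list of rows, uses thresholding plus 4-connected region growing as the bottom-up step, and uses a simple expectation about the number of regions as the top-down signal. These choices, and the names `extract_regions`, `describe`, and `find_with_feedback`, are illustrative assumptions rather than the operators used in the research.

```python
from collections import deque


def extract_regions(image, threshold):
    """Bottom-up: group pixels at or above threshold into 4-connected regions."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def describe(region):
    """Higher-level operator: summarize the geometric properties of a region."""
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "area": len(region),
        "bbox": (min(ys), min(xs), max(ys), max(xs)),
        "fill": len(region) / (height * width),
    }


def find_with_feedback(image, expected_count, thresholds):
    """Top-down: rerun the low-level step until the high-level expectation holds."""
    for threshold in thresholds:
        regions = extract_regions(image, threshold)
        if len(regions) == expected_count:
            return threshold, [describe(region) for region in regions]
    return None, []


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 5, 0],
    [0, 9, 9, 0, 5, 0],
    [0, 0, 0, 0, 0, 0],
]
threshold, features = find_with_feedback(image, expected_count=2, thresholds=(8, 4))
print(threshold, [f["area"] for f in features])
```

Here the first threshold finds only the bright block, so the higher-level expectation of two objects drives a second pass at a lower threshold that also recovers the dimmer column.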
This work was done by Masanori Sugisaka of Oita University for the Air Force Research Laboratory. For further information, download the free white paper at www.defensetechbriefs.com under the Information Sciences category. AFRL-0016
This Brief includes a Technical Support Package (TSP). "Automatic Abstraction of Information From Digitized Images" (reference AFRL-0016) is currently available for download from the TSP library.