A method of real-time fusion of readout data from electronic inertial and image sensors for passive navigation has been developed. "Passive navigation" here means navigation without the help of radar signals, lidar signals, Global Positioning System (GPS) signals, or any other signals generated by on-board or external equipment. The concept of fusion of image- and inertial-sensor data for passive navigation is inspired by biological examples, including those of bees, migratory birds, and humans, all of which utilize inertial and imaging sensory modalities to pick out landmarks and navigate from landmark to landmark with relative ease. The present method is suitable for use in a variety of environments, including urban canyons and interiors of buildings, where GPS signals and other navigation signals are often unavailable or corrupted.

When used separately, imaging and inertial sensors have drawbacks that can result in poor navigation performance. A navigation system that uses inertial sensors alone relies upon dead reckoning, which is susceptible to drift over time. A navigation system that uses image sensors alone can have difficulty identifying and matching good landmarks. The reason for fusing inertial- and image-sensor data is that the two types of data complement each other: fusion makes it possible to partly overcome the drawbacks of each type of sensor and thereby obtain navigation results better than can be obtained from either type of sensor used alone.

Figure: Feature-extraction times as functions of image resolution, measured in tests, were found to be much less for a GPU-accelerated algorithm according to the present method than for an older CPU-based algorithm.

Prior designs of image-aided inertial navigation systems have represented compromises between computing power demand and performance. Some have used a simplified image processing algorithm or a priori navigation information, while others have simply post-processed navigation data. Such designs are not robust enough for use in autonomous navigation systems or as viable alternatives to GPS-based designs. In contrast, a navigation system based on the present method can achieve real-time performance using a complex image-processing algorithm that can work in a wide variety of environments.

The present method is a successor to a prior method based on a rigorous theory of fusion of image- and inertial-sensor data for precise navigation. The theory involves utilization of inertial-sensor data in dead-reckoning calculations to predict locations, in subsequent images, of features identified in previous images to within a given level of statistical uncertainty. Such prediction reduces the computational burden by limiting, to a size reflecting the statistical uncertainty, the feature space that must be searched in order to match features in successive images. When this prior method was implemented in a navigation system operating in an indoor environment, the performance of the system was comparable to the performances of GPS-aided systems.
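The search-limiting step described above can be illustrated by a minimal sketch. It assumes that the dead-reckoning calculation has already produced a predicted pixel location for a feature and a 2-by-2 covariance of that prediction. The function names, the chi-square gate value, and the sample numbers are hypothetical and are not taken from the predecessor method.

```python
def mahalanobis_sq(dx, dy, cov):
    """Squared Mahalanobis distance of the offset (dx, dy) under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def gate_candidates(predicted, cov, candidates, gate=9.21):
    """Keep only candidate features inside the uncertainty ellipse.

    gate is a chi-square threshold with 2 degrees of freedom;
    9.21 corresponds to roughly 99 percent (an illustrative choice).
    """
    px, py = predicted
    return [(x, y) for (x, y) in candidates
            if mahalanobis_sq(x - px, y - py, cov) <= gate]


# Hypothetical numbers: prediction at pixel (320, 240) with standard
# deviations of 5 pixels horizontally and 10 pixels vertically.
predicted = (320.0, 240.0)
cov = ((25.0, 0.0), (0.0, 100.0))
detected = [(322, 250), (340, 240), (320, 275), (310, 260)]
print(gate_candidates(predicted, cov, detected))
```

Only the candidates that survive the gate need to be compared by descriptor, so the cost of matching scales with the size of the uncertainty region rather than with the size of the whole image.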

In the present method, the fusion of data is effected by an extended Kalman filter. To improve feature-tracking performance, a previously developed robust feature-transformation algorithm denoted the scale-invariant feature transform (SIFT) is used. The SIFT features are well suited to navigation applications because they are invariant to scale and rotation and partially invariant to changes in illumination. Unfortunately, the more complex the features, the longer the computer processing time. Heretofore, this tradeoff has limited the effectiveness of SIFT-based and other robust feature-extraction algorithms for real-time applications using traditional microprocessor architectures. Despite recent advances in computer technology, the amount of information that a computer can process is still bounded by the power and speed of its central processing unit (CPU).
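The role of the extended Kalman filter can be illustrated by a deliberately small sketch with a one-dimensional state. It is not the filter of the present method, whose state and measurement models are not given here. The assumptions are a vehicle moving along one axis, an inertially derived displacement per step, and a pinhole camera that measures the pixel offset of a landmark at a known position. All names and numbers are hypothetical.

```python
def ekf_predict(x, p, u, q):
    """Dead-reckoning step: advance position x by inertial displacement u.

    The variance p grows by the process noise q, which models drift.
    """
    return x + u, p + q


def ekf_update(x, p, z, r, landmark_x, landmark_y, focal):
    """Correct the position using one image measurement z (pixel offset).

    Assumed measurement model (pinhole): z = focal * landmark_y / (landmark_x - x).
    """
    depth = landmark_x - x
    if depth <= 0.0:
        raise ValueError("landmark must be ahead of the camera")
    z_pred = focal * landmark_y / depth
    h = focal * landmark_y / depth ** 2   # Jacobian dz/dx at the current estimate
    s = h * p * h + r                     # innovation variance
    k = p * h / s                         # Kalman gain
    x_new = x + k * (z - z_pred)
    p_new = (1.0 - k * h) * p
    return x_new, p_new


# Hypothetical scenario: true position 2.0 m, landmark 10 m ahead and 1 m
# to the side, focal length 500 pixels, so the true measurement is 62.5.
x, p = ekf_predict(x=0.5, p=0.2, u=1.0, q=0.05)   # drifted estimate: 1.5 m
x, p = ekf_update(x, p, z=62.5, r=1.0,
                  landmark_x=10.0, landmark_y=1.0, focal=500.0)
print(x, p)
```

In this toy case the image measurement pulls the drifted dead-reckoning estimate back toward the true position and reduces its variance, which is the complementary behavior that motivates the fusion.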

The present method is based partly on a theory that exploits the highly parallel nature of general-purpose programmable graphics processing units (GPGPUs) in such a manner as to support deep integration of optical and inertial sensors for real-time navigation. The method leverages the existing OpenVIDIA core GPGPU software library and commercially available computer hardware to effect fusion of image- and inertial-sensor data. [OpenVIDIA is a programming framework, originally for computer-vision applications, embodied in a software library and an application programming interface, that utilizes the graphics processing units (GPUs) present in many modern computers to implement parallel computing and thereby obtain computational speed much greater than that of a CPU alone.] In this method, the OpenVIDIA library is extended to include the statistical feature-projection and feature-matching techniques of the predecessor data-fusion method.
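Why feature matching lends itself to parallel hardware can be illustrated by a minimal CPU-only sketch. It does not use OpenVIDIA or any GPU interface. It assumes nearest-neighbor matching of descriptor vectors with a distance-ratio test, a criterion common in the SIFT literature; the matching criterion actually used in the present method is not stated here. The function names, the ratio of 0.8, and the three-element descriptors are hypothetical.

```python
import math


def match_one(query, train, ratio=0.8):
    """Return the index of the best-matching train descriptor, or None.

    A match is accepted only if the nearest descriptor is clearly closer
    than the second nearest (distance-ratio test).
    """
    dists = sorted((math.dist(query, t), i) for i, t in enumerate(train))
    if not dists:
        return None
    if len(dists) == 1:
        return dists[0][1]
    (d1, i1), (d2, _) = dists[0], dists[1]
    return i1 if d1 < ratio * d2 else None


def match_all(queries, train, ratio=0.8):
    """Match every query descriptor independently of the others.

    No iteration depends on another, so in a GPU implementation each
    query could in principle be handled by its own thread.
    """
    return [match_one(q, train, ratio) for q in queries]


# Tiny hypothetical descriptors (real SIFT descriptors have 128 elements).
train = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
queries = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 9.0, 1.0)]
print(match_all(queries, train))
```

The second query lies midway between two stored descriptors and is therefore rejected as ambiguous, while the other two are matched. Because each query is processed without reference to the others, the work divides evenly across however many parallel processing elements are available.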

In an experimental system based on this method, data from inertial and image sensors were integrated on a commercially available laptop computer containing a programmable GPU. When the system was applied to experimentally collected data, feature-processing speeds were found to be increased by factors of as much as 30 over those attainable by use of an equivalent CPU-based algorithm (see figure). Frame rates greater than 10 Hz, suitable for navigation, were demonstrated. The navigation performance of this system was shown to be identical to that of an otherwise equivalent system, based on the predecessor method, that required lengthy postprocessing.

This work was done by J. Fletcher, M. Veth, and J. Raquet of the Air Force Institute of Technology for the Air Force Research Laboratory.

This Brief includes a Technical Support Package (TSP). "Fusion of Image- and Inertial-Sensor Data for Navigation" (reference AFRL-0084) is currently available for download from the TSP library.
