The ability to detect and recognize buildings is important to a variety of vision applications operating in outdoor urban environments. These applications include landmark recognition, assisted and autonomous navigation, image-based rendering, and 3D scene modeling. The approach described here addresses the problem of detecting multiple planar surfaces, specifically building facades, from a single image.

Because any given image can be generated by an infinite number of 3D surfaces, some assumptions about the geometric properties of the scene must be made in order to recover the surface geometry when only a single image is available. Most urban building facades have surface markings due to doors, windows, bricks, and blocks. As such, each building facade generally contains two sets of parallel lines, where lines in the first set intersect lines in the second set at right angles. It is well known that the perspective images of a collection of parallel scene edges, when extended, intersect at a single point in the image, known as the vanishing point. Thus, the image of a building facade may be identified by locating regions in the image covered by pairs of intersecting edges, where each edge is oriented in the direction of one of two vanishing points.

The approach proceeds in five steps. Image line segments are first located, and then the vanishing points of these segments are determined. Groups of short segments are combined into longer segments while maintaining alignment with the associated vanishing points. Next, the intersections of line segments associated with pairs of vanishing points are used to generate local support for planar facades at different orientations. The plane support points are then clustered using an algorithm that requires no knowledge of the number of clusters or of their spatial proximity. Finally, building facades are identified by fitting vanishing-point-aligned quadrilaterals to the clustered support points. The main contribution of this approach is its improved performance over existing approaches while placing no constraints on the facades in terms of their number or orientation, and only minimal constraints on the length of the detected line segments.
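The vanishing point property that the approach relies on can be checked numerically. The following is a minimal sketch, not the implementation used in this work: it assumes an ideal pinhole camera with focal length `f` and the principal point at the image origin, and the helper names (`project`, `image_line`, `intersect`) are hypothetical. It projects three parallel 3D edges and confirms that their image lines meet at one point, which for a scene direction (dx, dy, dz) lies at (f·dx/dz, f·dy/dz).

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def project(point, f=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) with Z > 0.

    Assumption: ideal camera, principal point at the image origin.
    Returns homogeneous image coordinates (x, y, 1).
    """
    X, Y, Z = point
    return (f * X / Z, f * Y / Z, 1.0)


def image_line(start, direction, f=1.0):
    """Homogeneous image line of the 3D line start + t * direction."""
    # Any second point on the 3D line will do; t = 5 is arbitrary.
    other = tuple(s + 5.0 * d for s, d in zip(start, direction))
    return cross(project(start, f), project(other, f))


def intersect(l1, l2):
    """Intersection (x, y) of two homogeneous image lines."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        raise ValueError("lines are parallel in the image")
    return (x / w, y / w)


# Three parallel scene edges sharing one 3D direction.
direction = (1.0, 0.0, 1.0)
starts = [(0.0, -1.0, 4.0), (0.0, 0.0, 4.0), (0.0, 2.0, 5.0)]
lines = [image_line(s, direction) for s in starts]

vp_a = intersect(lines[0], lines[1])
vp_b = intersect(lines[0], lines[2])
vp_expected = (direction[0] / direction[2], direction[1] / direction[2])

print(vp_a, vp_b, vp_expected)
```

With real images the detected segments are noisy, so their extensions do not meet exactly; a vanishing point must then be estimated from many segments rather than from a single pairwise intersection as in this sketch.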

Image line segments that have been labeled according to vanishing point provide an initial cue for segmenting planar regions in the image. Under the assumption that intersecting edges in the scene are coplanar and orthogonal, every pair of nearby, nonparallel, vanishing-point-aligned image line segments defines the local surface orientation of the scene point that projects to the segment intersection point in the image. For two local image regions to be images of the same plane, the pairs of intersecting line segments in each of the two regions should be labeled with the same two vanishing points. Pairs of intersecting line segments that have identical vanishing point label pairs are therefore clustered together.
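The generation of support points from labeled segments can be illustrated with a short sketch. This is an illustration under stated assumptions rather than the procedure used in this work: segments are given as endpoint pairs with an integer vanishing point label, "nearby" is modeled by a hypothetical threshold `max_gap` on the distance from the intersection point to each segment, and the function name `support_points` is invented for the example. The sketch only groups intersection points by label pair; the subsequent spatial clustering is not shown.

```python
import math
from collections import defaultdict
from itertools import combinations


def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two image points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def point_segment_distance(pt, p, q):
    """Distance from pt to the closest point on the segment p-q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        t = ((pt[0] - p[0]) * dx + (pt[1] - p[1]) * dy) / length_sq
        t = max(0.0, min(1.0, t))
    return math.hypot(pt[0] - (p[0] + t * dx), pt[1] - (p[1] + t * dy))


def support_points(segments, max_gap=10.0):
    """Group segment intersections by vanishing point label pair.

    segments: list of (p, q, label), with p and q image points.
    max_gap:  assumed proximity threshold in pixels (hypothetical).
    Returns {(label_a, label_b): [(x, y), ...]} with label_a < label_b.
    """
    support = defaultdict(list)
    for (p1, q1, a), (p2, q2, b) in combinations(segments, 2):
        if a == b:
            continue  # same vanishing point: no plane orientation cue
        x, y, w = cross(line_through(p1, q1), line_through(p2, q2))
        if abs(w) < 1e-9:
            continue  # supporting lines are parallel in the image
        pt = (x / w, y / w)
        if (point_segment_distance(pt, p1, q1) <= max_gap and
                point_segment_distance(pt, p2, q2) <= max_gap):
            support[tuple(sorted((a, b)))].append(pt)
    return dict(support)


# Toy input: labels 0 and 2 are two horizontal directions, 1 is vertical.
segments = [
    ((0, 0), (40, 0), 0),
    ((5, -3), (5, 30), 1),
    ((100, 50), (140, 60), 2),
    ((120, 40), (120, 90), 1),
]
support = support_points(segments)
for labels, points in sorted(support.items()):
    print(labels, points)
```

In this toy input, label pairs (0, 1) and (1, 2) each receive one support point, corresponding to two facades that share the vertical vanishing point but differ in their horizontal one. Intersections that fall far from either segment are discarded by the proximity test.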