
Recognition as function approximation

 

We have seen that a pattern of activities of RFs can be made to reflect properties of the viewed object, discounting to a certain extent factors that are irrelevant to recognition. To carry out the recognition step itself, the RF activities have to be combined into a decision criterion. The goal of this stage may be, for example, to compute for each object known to the system a number between 0 and 1 that would correspond to the system's confidence as to its presence in the input. A general approach to this problem, valid in vision as well as in other domains, is to apply a standard technique for learning from examples, or, equivalently, function approximation [Poggio, 1990]. A method that is particularly suitable in the present context is approximation by radial basis functions (RBFs).

The computational reason for the feasibility of this approach is basically the smoothness of the manifold formed by the different views of the same object in the space of views of all possible objects [Poggio and Edelman, 1990]. An RBF approximation module effectively constructs the manifold by computing its ``height'' over the input measurement space as a linear combination of the contributions of the data points (see Figure 4). The contributions are determined by placing a kernel (that is, a basis function) at selected points $\mathbf{x}_i$, so that

$$ f(\mathbf{x}) = \sum_{i=1}^{k} c_i \, K(\|\mathbf{x} - \mathbf{x}_i\|) \qquad (7) $$

and by computing the weights $c_i$ that minimize the approximation error $\sum_j \left( y_j - f(\mathbf{x}_j) \right)^2$ accumulated over all the data $\{(\mathbf{x}_j, y_j)\}$. A good choice for the shape of the kernel $K$ is the Gaussian $K(r) = e^{-r^2 / 2\sigma^2}$, for several reasons: linear superpositions of Gaussians are universal approximators [Hartman et al., 1990], and the Gaussian can be derived from a regularized solution to the approximation problem, among other reasons [Poggio and Girosi, 1990]. The Gaussian kernel is especially relevant in the context of visual modeling, because it makes it possible to interpret equation 7 as a linear combination of products of activities of 2D image-based Gaussian RFs. In other words, 2D RFs can be combined multiplicatively to form the multidimensional Gaussians that serve as the basis functions in the expansion [Poggio and Edelman, 1990].
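The scheme described above can be illustrated by a minimal Python sketch. It is not the implementation used in the cited work: the function names (`fit_rbf`, `rbf_response`), the toy 2D ``views'', the kernel width `sigma`, the Gaussian normalization exp(-r^2 / 2 sigma^2), and the tiny ridge term are all assumptions made for illustration. Here a kernel is centered on every training view, the weights are obtained by least squares, and the last lines check numerically that a 4D Gaussian equals the product of two 2D Gaussians.

```python
import math

def sq_dist(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def gaussian(r2, sigma):
    """Gaussian kernel of a squared distance r2 (normalization is an assumption)."""
    return math.exp(-r2 / (2.0 * sigma ** 2))

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_rbf(xs, ys, centers, sigma, ridge=1e-9):
    """Least-squares weights c_i for f(x) = sum_i c_i * K(|x - center_i|)."""
    g = [[gaussian(sq_dist(x, c), sigma) for c in centers] for x in xs]
    k = len(centers)
    # Normal equations (G^T G + ridge * I) c = G^T y.
    # The small ridge term is an assumption, added only for numerical stability.
    a = [[sum(row[i] * row[j] for row in g) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * y for row, y in zip(g, ys)) for i in range(k)]
    return solve(a, b)

def rbf_response(x, centers, weights, sigma):
    """Weighted sum of Gaussian kernel activities: the module's graded output."""
    return sum(w * gaussian(sq_dist(x, c), sigma) for w, c in zip(weights, centers))

# Hypothetical toy data: three "views" of the target object (target 1)
# and three views of other objects (target 0), each a 2D measurement.
object_views = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)]
other_views = [(0.0, 3.0), (2.0, 3.0), (1.0, 4.0)]
views = object_views + other_views
targets = [1.0] * 3 + [0.0] * 3
sigma = 1.0

weights = fit_rbf(views, targets, centers=views, sigma=sigma)

near = rbf_response((1.0, 0.1), views, weights, sigma)   # close to a stored view
mid = rbf_response((1.0, 2.0), views, weights, sigma)    # between the two groups
far = rbf_response((10.0, 10.0), views, weights, sigma)  # far from all stored views
print(near, mid, far)

# Factorization: a Gaussian over a 4D vector made of two 2D feature positions
# equals the product of two 2D Gaussians, one per feature.
x4 = (0.3, -0.2, 1.1, 0.7)
c4 = (0.0, 0.0, 1.0, 1.0)
joint = gaussian(sq_dist(x4, c4), sigma)
product = math.prod(gaussian(sq_dist(x4[i:i + 2], c4[i:i + 2]), sigma) for i in (0, 2))
```

In this sketch the response falls off smoothly as the input moves away from the stored views of the object, which is the graded behavior attributed to the RBF module in the text. Placing kernels on only a subset of the views (passing a shorter `centers` list) turns the exact interpolation into a genuine least-squares fit.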

  [IMAGE ]
Figure 4: Standard techniques for function approximation can be used to construct a characteristic function for a given object from a collection of its views (see section 3). Here, radial basis function approximation in the space of all views of an object is carried out by forming a weighted sum of responses of RFs tuned to some of the views. The graded response of the resulting module defines an RF in the shape space (the response grows with increased similarity between the input and the object on which the module has been trained). This property of the RBF recognizer is used in section 4 to construct a categorization mechanism for novel objects.

  [IMAGE ]
Figure: A strange object (a cameleopard, left) and two more familiar ones (center, right). See section 4.






Edelman Shimon
Tue Nov 28 13:24:55 IST 1995