On detection and representation of multiscale low-level image structure
The objective of computer vision is interpretation of visual images. Any data-interpretation task of such magnitude requires models of the data. For example, in speech the audio signal is parsed into phonemes, which are successively merged into increasingly complex units and eventually into an interpretation, often with feedback from higher levels. Another example is the hierarchical interpretation of computer programs in a given language through the use of grammars. In image data, the analogues of phonemes and characters are structural primitives that compress the data to a manageable size without eliminating any possible final interpretations. Because images are significantly larger and more complex than speech signals, a capability for initial, bottom-up data reduction is even more critical. The low-level structure would serve as a lossless image abstraction and help initiate hierarchical, closed-loop image interpretation, for example, for recognition by enforcing a priori semantic constraints involving part–whole relationships.

This note is not concerned with interpretation processes; it describes some desirable characteristics of strategies for the detection and representation of low-level perceptual structure, or multiscale segmentation, which remains an open problem. Homogeneous image regions may be used as structural primitives. A region can be characterized as possessing a certain degree of interior homogeneity and a contrast with the surround that is large compared to the interior variation; a computational sketch of this criterion is given below. This is a satisfactory characterization from both the perceptual and quantitative viewpoints.

Past work on image segmentation has not yielded acceptable algorithms, for two main reasons. First, the type of region homogeneity and the magnitude of the contrast may vary, and the regions may have arbitrary size and shape. Although a region can be detected by identifying either its interior or its border, the latter approach has been more thoroughly investigated. These methods use different models of border geometry (e.g., straightness) and of brightness variation along borders (e.g., linearity), across borders, and within regions. Most methods are linear. Although such models and methods simplify processing, they lead to fundamental limitations in the detection accuracy and the sensitivity of the resulting segmentation. The second reason involves the multiscale nature of image structure, that is, geometric and photometric sensitivity to detail. A pixel may belong simultaneously to different regions, each having a different contrast value (photometric scale).
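The following is a minimal sketch, not part of the original note, of the region characterization stated above: a candidate region is acceptable if its border contrast with the surround is large compared to its interior variation. It assumes NumPy and SciPy are available; the function name region_contrast_score and the min_ratio threshold are hypothetical illustrations.

import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def region_contrast_score(image, mask, min_ratio=3.0):
    """Evaluate the homogeneity/contrast criterion for a candidate region.

    image     : 2-D float array of intensities
    mask      : boolean array, True inside the candidate region
    min_ratio : hypothetical threshold on contrast / interior variation
    """
    # Interior (in)homogeneity: spread of intensities inside the region.
    interior_var = image[mask].std()

    # Pixels just inside and just outside the region border.
    inner_ring = mask & ~binary_erosion(mask)
    outer_ring = binary_dilation(mask) & ~mask

    # Photometric contrast with the surround: difference of mean intensity
    # across the border (a simple stand-in for a per-border-pixel measure).
    contrast = abs(image[inner_ring].mean() - image[outer_ring].mean())

    ratio = contrast / (interior_var + 1e-12)
    return contrast, interior_var, ratio >= min_ratio

# Toy example: a bright 20x20 square on a darker, slightly noisy background.
rng = np.random.default_rng(0)
img = rng.normal(loc=0.2, scale=0.02, size=(64, 64))
img[20:40, 20:40] += 0.5
region = np.zeros((64, 64), dtype=bool)
region[20:40, 20:40] = True
print(region_contrast_score(img, region))  # contrast far exceeds interior variation

Lowering the added intensity step (or raising the noise) shrinks the contrast-to-variation ratio, which illustrates the photometric-scale point above: whether a pixel group qualifies as a region depends on the contrast level at which structure is sought.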