Paper information - Modeling second-order boundary perception: A machine learning approach

Modeling second-order boundary perception: A machine learning approach

Background: Visual pattern detection and discrimination are essential first steps for scene analysis. Numerous human psychophysical studies have modeled visual pattern detection and discrimination by estimating linear templates for classifying noisy stimuli defined by spatial variations in pixel intensities. However, such methods are poorly suited to understanding sensory processing mechanisms for complex visual stimuli such as second-order boundaries defined by spatial differences in contrast or texture.

Methodology / Principal Findings: We introduce a novel machine learning framework for modeling human perception of second-order visual stimuli, using image-computable hierarchical neural network models fit directly to psychophysical trial data. This framework is applied to modeling visual processing of boundaries defined by differences in the contrast of a carrier texture pattern, in two different psychophysical tasks: (1) boundary orientation identification, and (2) fine orientation discrimination. Cross-validation analysis is employed to optimize model hyper-parameters, and demonstrate that these models are able to accurately predict human performance on novel stimulus sets not used for fitting model parameters. We find that, like the ideal observer, human observers take a region-based approach to the orientation identification task, while taking an edge-based approach to the fine orientation discrimination task. How observers integrate contrast modulation across orientation channels is investigated by fitting psychophysical data with two models representing competing hypotheses, revealing a preference for a model which combines multiple orientations at the earliest possible stage. Our results suggest that this machine learning approach has much potential to advance the study of second-order visual processing, and we outline future steps towards generalizing the method to modeling visual segmentation of natural texture boundaries.

Conclusions / Significance: This study demonstrates how machine learning methodology can be fruitfully applied to psychophysical studies of second-order visual processing.

Author Summary: Many naturally occurring visual boundaries are defined by spatial differences in features other than luminance, for example by differences in texture or contrast. Quantitative models of such “second-order” boundary perception cannot be estimated using the standard regression techniques (known as “classification images”) commonly applied to “first-order”, luminance-defined stimuli. Here we present a novel machine learning approach to modeling second-order boundary perception using hierarchical neural networks. In contrast to previous quantitative studies of second-order boundary perception, we directly estimate network model parameters using psychophysical trial data. We demonstrate that our method can reveal different spatial summation strategies that human observers utilize for different kinds of second-order boundary perception tasks, and can be used to compare competing hypotheses of how contrast modulation is integrated across orientation channels. We outline extensions of the methodology to other kinds of second-order boundaries, including those in natural images.

Christopher DiMattina | Curtis L. Baker
