A New Class of Learnable Detectors: CSER - Class Specific Extremal Regions

I : Z → S is the image function, where S is a totally ordered set of values (e.g. pixel intensities), and h : S → N is the ordering function.

Time Efficiency
• The time complexity of detection is approximately linear in the number of pixels; a non-optimized implementation runs at about 1 frame per second on a 640 × 480 image on a high-end PC.
• Features do not need to be recomputed from all points when regions are merged.
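The second efficiency point can be illustrated with a minimal sketch. It is not the authors' implementation: the feature set (area and bounding box), the names RegionFeatures and regions_at_threshold, the use of 4-connectivity, and taking h as the identity on integer intensities are all assumptions made for illustration. Pixels are added in increasing intensity order and joined with a union-find structure; when two regions meet, their features are combined in constant time without revisiting any pixel. The sketch sorts with Python's built-in sorted(), which is O(n log n); a counting sort over a fixed number of intensity levels would be needed for near-linear behaviour.

```python
from dataclasses import dataclass


@dataclass
class RegionFeatures:
    """Hypothetical feature set that can be combined in O(1) on a merge."""
    area: int
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def merge(self, other: "RegionFeatures") -> "RegionFeatures":
        # Every field of the union depends only on the operands' fields,
        # so no pixel of either region is visited again.
        return RegionFeatures(
            self.area + other.area,
            min(self.x_min, other.x_min),
            min(self.y_min, other.y_min),
            max(self.x_max, other.x_max),
            max(self.y_max, other.y_max),
        )


def regions_at_threshold(image, threshold):
    """Features of the 4-connected regions of pixels with value <= threshold.

    `image` is a list of rows of integers; the ordering function h is
    assumed to be the identity on those integers.
    """
    height, width = len(image), len(image[0])
    parent = {}    # union-find forest over the pixels added so far
    features = {}  # root pixel -> RegionFeatures of its region

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    # Visit pixels in increasing intensity order.
    pixels = sorted(
        (image[y][x], y, x) for y in range(height) for x in range(width)
    )
    for value, y, x in pixels:
        if value > threshold:
            break
        p = (y, x)
        parent[p] = p
        features[p] = RegionFeatures(1, x, y, x, y)
        for q in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if q in parent:
                root_p, root_q = find(p), find(q)
                if root_p != root_q:
                    parent[root_q] = root_p
                    # Constant-time feature update for the merged region.
                    features[root_p] = features[root_p].merge(
                        features.pop(root_q)
                    )
    return sorted(features.values(), key=lambda f: (f.y_min, f.x_min))
```

For example, calling regions_at_threshold(image, t) for increasing values of t on the same image returns fewer, larger regions as neighbouring regions merge. A detector in this style would score each region from such incrementally maintained features rather than rescanning its pixels.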
