This paper describes IBCOW (Image-based Classification of Objectionable Websites), a system that classifies a website as objectionable or benign based on its image content. The system combines WIPE™ (Wavelet Image Pornography Elimination) with statistical analysis to provide robust classification of on-line objectionable World Wide Web sites. Within the WIPE module, semantically meaningful feature vectors are matched so that a given on-line image can be compared efficiently and effectively against training-set images marked as "objectionable" or "benign". If more than a certain number of the images sampled from a site are found to be objectionable, the site is considered objectionable. The statistical analysis used to determine the size of the image sample and the threshold number of objectionable images is given in this paper. The system is practical for real-world applications: on a Pentium Pro PC it classifies a Web site in less than 2 minutes, including the time to compute the feature vectors for the images downloaded from the site. Besides its speed, it has demonstrated 97% sensitivity and 97% specificity in classifying a Web site based solely on images. Both the sensitivity and the specificity are expected to be higher in real-world applications, because our performance evaluation is relatively conservative and surrounding text can be used to assist the classification process.
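The site-level decision rule described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sampled image is flagged independently with the same probability, so the number of flagged images follows a binomial distribution; the paper's own statistical analysis may use a different model. The function names and the parameters `n`, `r`, and `p` are hypothetical and do not come from the IBCOW implementation.

```python
from math import comb


def prob_more_than(n: int, r: int, p: float) -> float:
    """Return P(X > r) for X ~ Binomial(n, p).

    Assumption: each of the n sampled images is flagged independently
    with probability p (an illustrative model, not the paper's).
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(r + 1, n + 1)
    )


def site_is_objectionable(image_flags, threshold: int) -> bool:
    """Apply the threshold rule: the site is objectionable if MORE than
    `threshold` of the sampled images were flagged as objectionable."""
    return sum(1 for flagged in image_flags if flagged) > threshold


if __name__ == "__main__":
    # Hypothetical numbers: 20 sampled images, threshold of 4 flagged images.
    n, r = 20, 4
    # Chance a site is flagged when each image is flagged with p = 0.9
    # (e.g. an objectionable site) versus p = 0.05 (e.g. a benign site).
    print("P(flag | p=0.90):", prob_more_than(n, r, 0.90))
    print("P(flag | p=0.05):", prob_more_than(n, r, 0.05))
```

Under this simplified model, increasing the sample size `n` and choosing `r` between the expected counts for benign and objectionable sites pushes both the site-level sensitivity and the site-level specificity toward 1, which is the trade-off the sample-size and threshold analysis addresses.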