Neural network based framework for goal event detection in soccer videos

In this paper, a neural network based framework for semantic event detection in soccer videos is proposed. The framework provides a robust solution for soccer goal event detection by combining the strength of multimodal analysis with the ability of neural network ensembles to reduce generalization error. Because goal events are rare, bootstrapped sampling is applied to the training set to improve the recall of goal event detection. A group of component networks is then trained using all the available training data. The precision of the detection is greatly improved in two steps. First, a pre-filtering step is applied to the test set to reduce noisy and inconsistent data. Second, an advanced weighting scheme is proposed to traverse and combine the component network predictions, taking into account the prediction performance of each network. A set of experiments is designed to compare the performance of different bootstrapped sampling schemes, to show the strength of the proposed weighting scheme in event detection, and to demonstrate the effectiveness of the framework for soccer goal event detection.
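
The two ideas named in the abstract, bootstrapped sampling of a rare class and performance-weighted combination of component predictions, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `bootstrap_sample` and `weighted_vote`, the `positive_fraction` parameter, and the use of a simple normalized weighted vote (with weights standing in for, say, each network's validation accuracy) are all assumptions made for illustration. The abstract does not specify the actual sampling ratios or weighting rule.

```python
import random


def bootstrap_sample(features, labels, n_samples, positive_fraction, rng):
    """Draw a bootstrap sample (with replacement) containing a chosen share
    of positive (goal) examples. Hypothetical helper, not from the paper."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_pos = round(n_samples * positive_fraction)
    # Sampling with replacement lets the few rare positives be reused.
    idx = [rng.choice(pos) for _ in range(n_pos)]
    idx += [rng.choice(neg) for _ in range(n_samples - n_pos)]
    rng.shuffle(idx)
    return [features[i] for i in idx], [labels[i] for i in idx]


def weighted_vote(predictions, weights, threshold=0.5):
    """Combine binary predictions from component models using normalized
    per-model weights (assumed here to reflect each model's performance)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * p for w, p in zip(weights, predictions)) / total
    return 1 if score >= threshold else 0
```

In this sketch, each component network would be trained on its own call to `bootstrap_sample`, and `weighted_vote` would then be applied per test instance to the networks' binary outputs. A stronger network can outvote two weaker ones when its weight is large enough, which is the general motivation for weighting by performance rather than using a plain majority vote.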
