Optimizing reservoir features in oil exploration management based on fusion of soft computing

This paper introduces concepts and algorithms for feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares them within a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, and provides guidelines for choosing among feature selection algorithms. Search strategies include complete, sequential, and random strategies; evaluation criteria include filter, wrapper, and hybrid approaches; data mining tasks include classification and clustering. A feature selection platform is then proposed as an intermediate step based on the data and the requirements of the task, and appropriate algorithms are compared according to the platform and the categorizing framework. Finally, an experiment on data from the oilsk81, oilsk83, and oilsk85 wells of the Jianghan oil field in China was conducted using one of these algorithms. The algorithm applies a fusion of soft computing methods to identify the key features of the reservoir's oil-bearing formation and builds a model, also based on fused soft computing methods, to forecast those key features. The procedure is as follows. First, a genetic algorithm combined with fuzzy c-means clustering (GA-FCM) reduces the well-log features of the oil-bearing formation to obtain the key features that describe it. Second, a genetic algorithm is fused with a BP neural network (GA-BP) to construct the forecasting model: the GA searches over the inputs and the number of hidden-layer nodes of the BP network to choose the optimal network structure. The effectiveness of the forecasting model is then tested by the recognition accuracy on testing samples, yielding the optimal model for forecasting the key features.
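The GA-FCM step described above can be sketched as a genetic search over binary feature masks, where each candidate subset is scored by the quality of the fuzzy c-means partition it produces. This is a minimal illustration only, not the paper's actual implementation: the toy data, population size, mutation rate, and the use of Bezdek's partition coefficient as the fitness function are all assumptions made for the sketch.

```python
# Illustrative sketch: GA-driven feature selection scored by fuzzy c-means
# partition quality. All data and parameters here are assumed for
# demonstration; the paper's real inputs are well-log features.
import numpy as np

rng = np.random.default_rng(0)

def fcm(X, c=2, m=2.0, iters=50):
    """Basic fuzzy c-means; returns the membership matrix U (c x n)."""
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0)
    for _ in range(iters):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        # Distances from each center to each sample (c x n).
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        U = 1.0 / d ** (2.0 / (m - 1.0))
        U /= U.sum(axis=0)          # standard FCM membership update
    return U

def partition_coefficient(U):
    """Bezdek's partition coefficient, 1/n * sum(U^2); higher = crisper."""
    return float((U ** 2).sum() / U.shape[1])

def fitness(mask, X):
    if not mask.any():
        return 0.0                  # empty feature subsets are invalid
    return partition_coefficient(fcm(X[:, mask]))

def ga_select(X, pop=20, gens=15, p_mut=0.1):
    """Evolve binary feature masks; keep the subset giving the crispest clustering."""
    d = X.shape[1]
    population = rng.random((pop, d)) < 0.5
    for _ in range(gens):
        scores = np.array([fitness(ind, X) for ind in population])
        # Binary tournament selection.
        idx = rng.integers(0, pop, (pop, 2))
        winners = np.where(scores[idx[:, 0]] > scores[idx[:, 1]],
                           idx[:, 0], idx[:, 1])
        parents = population[winners]
        # One-point crossover on consecutive parent pairs.
        children = parents.copy()
        for i, cp in enumerate(rng.integers(1, d, pop // 2)):
            a, b = 2 * i, 2 * i + 1
            children[a, cp:], children[b, cp:] = parents[b, cp:], parents[a, cp:]
        # Bit-flip mutation.
        children ^= rng.random((pop, d)) < p_mut
        population = children
    scores = np.array([fitness(ind, X) for ind in population])
    best = population[scores.argmax()]
    return best, float(scores.max())

# Toy stand-in for well-log data: 2 informative features, 4 noise features.
labels = rng.integers(0, 2, 200)
X = rng.normal(0.0, 1.0, (200, 6))
X[:, 0] += labels * 5.0
X[:, 1] -= labels * 5.0

mask, score = ga_select(X)
print("selected features:", np.flatnonzero(mask), "partition coefficient:", round(score, 3))
```

The GA-BP step works analogously: the chromosome would instead encode the BP network's input subset and hidden-layer size, and the fitness would be the trained network's recognition accuracy on held-out samples.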
