Detection of sand dunes on Mars using a regular vine-based classification approach

Abstract This paper addresses the problem of detecting sand dunes in remotely sensed images of the surface of Mars. We build on previous approaches that extract informative features for classifying the images. The intricate correlation structure exhibited by these features motivates the use of probabilistic classifiers based on R-vine distributions. R-vines are probabilistic graphical models that combine a set of nested trees with copula functions and are able to model a wide range of pairwise dependencies. We investigate different strategies for building R-vine classifiers and compare them with several state-of-the-art classification algorithms for the identification of Martian dunes. Experimental results show that the R-vine-based approach is well suited to classification problems in which the interactions between variables differ in nature from one class to another and play an important role in allowing the classifier to distinguish the classes.
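
As a brief illustration of how nested trees and copula functions combine, the following sketch writes out the pair-copula factorization of a three-variable density and one way such a density could be used inside a Bayes classifier. The notation (f_i, F_i, c_{ij}, \pi_k) is assumed here for illustration and is not taken from the paper; the classifier strategies the paper actually investigates may differ, and the conditional pair copula is written under the usual simplifying assumption that it does not depend on the value of x_2.

```latex
% Three-variable vine: tree 1 has edges (1,2) and (2,3); tree 2 has edge (1,3 | 2).
\begin{align}
f(x_1, x_2, x_3) &= f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
  &\quad \times c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)\,
           c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr) \\
  &\quad \times c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
\end{align}

% Generative classification: fit one density f^{(k)} per class k
% (with class prior \pi_k) and assign x to the most probable class.
\begin{equation}
\hat{y}(x) = \arg\max_{k}\; \pi_k\, f^{(k)}(x)
\end{equation}
```

Because each class receives its own set of pair copulas, a classifier of this kind can represent dependencies that differ in nature between classes, which is the setting the abstract highlights.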
