A visual analytical approach for transfer learning in classification

Abstract: Classification can be highly challenging when the dataset is extremely large or when training data in the underlying domain are difficult to obtain. One feasible solution is transfer learning, which extracts knowledge from source tasks and applies it to target tasks. Existing transfer learning schemes typically assume that the source task and the target task are similar to some degree. This assumption does not hold in some real applications, and analysts unfamiliar with the learning strategy can be frustrated by the complicated transfer relations and the non-intuitive transfer process. This paper presents a suite of visual communication and interaction techniques to support the transfer learning process. Furthermore, a pioneering visual-assisted transfer learning methodology is proposed in the context of classification. Our solution includes a visual communication interface that supports comprehensive exploration of the entire knowledge transfer process and of the relevance among tasks. With these techniques and this methodology, analysts can intuitively choose relevant tasks and data and iteratively incorporate their experience and expertise into the analysis process. We demonstrate the validity and efficiency of our visual design and analysis approach with examples of text classification.
