Hierarchical Matching Network for Crime Classification

Automatic crime classification is a fundamental task in the legal field. Given a fact description, a judge first determines which laws were violated and then which articles of those laws apply. Because laws and articles are organized in a tree-shaped hierarchy (laws as parent labels, articles as child labels), the task can be naturally formalized as a two-layer hierarchical multi-label classification problem. The label semantics (i.e., the definitions of the articles) and the hierarchical structure are two informative properties that help judges reach a correct decision. However, most previous methods either ignore the label structure and feed all labels into a flat classification framework, or neglect the label semantics and rely only on fact descriptions, which may limit their performance. In this paper, we formulate crime classification as a matching task to address these issues. We call our model the Hierarchical Matching Network (HMN). Based on the tree hierarchy, HMN explicitly decomposes the semantics of child labels into residual and alignment components. The residual components keep the unique characteristics of each individual child label, while the alignment components capture the semantics shared among sibling child labels and are further aggregated into the representation of their parent label. Finally, given a fact description, a co-attention metric is applied to match it against the relevant laws and articles. Experiments on two real-world judicial datasets demonstrate that our model significantly outperforms state-of-the-art methods.
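The decomposition idea can be illustrated with a toy sketch. It is not the HMN model: it assumes each label is already a fixed vector, takes the mean of sibling vectors as a stand-in for the alignment component (the parent representation), uses the difference from that mean as the residual component, and replaces the co-attention metric with a plain dot product. All names, vectors, and the threshold below are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dot(u: Vector, v: Vector) -> float:
    """Dot product, used here as a stand-in matching score (not co-attention)."""
    return sum(a * b for a, b in zip(u, v))


def decompose(children: Dict[str, Vector]) -> Tuple[Vector, Dict[str, Vector]]:
    """Split sibling label vectors into a shared part and per-child residuals.

    Assumption: the shared (alignment) part is the sibling mean, and it doubles
    as the parent representation. The paper's decomposition is not specified here.
    """
    alignment = mean(list(children.values()))
    residuals = {
        name: [value - shared for value, shared in zip(vector, alignment)]
        for name, vector in children.items()
    }
    return alignment, residuals


def predict(
    fact: Vector, hierarchy: Dict[str, Dict[str, Vector]], threshold: float
) -> Dict[str, List[str]]:
    """Two-layer prediction: select laws first, then articles under each selected law."""
    selected: Dict[str, List[str]] = {}
    for law, articles in hierarchy.items():
        parent_vector, residuals = decompose(articles)
        if dot(fact, parent_vector) > threshold:
            selected[law] = [
                article
                for article, residual in residuals.items()
                if dot(fact, residual) > threshold
            ]
    return selected


# Hypothetical two-law hierarchy with hand-picked label vectors.
hierarchy = {
    "law_A": {"article_A1": [1.0, 0.0, 1.0], "article_A2": [1.0, 0.0, -1.0]},
    "law_B": {"article_B1": [0.0, 1.0, 1.0], "article_B2": [0.0, 1.0, -1.0]},
}
fact = [0.9, 0.1, 0.5]  # hypothetical encoding of a fact description

print(predict(fact, hierarchy, threshold=0.3))
```

In this toy setting the siblings under each law share their first two coordinates, so the mean isolates what the law's articles have in common, while the residuals carry only what distinguishes one article from another.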
