Learning in High Risk Situations

In this thesis, we address the age-old problem of classification in scenarios where the cost of a mistake is very high. We examine how deep learning techniques can be used to establish state-of-the-art results, and we study the representations learnt by deep reject option classifiers. While supervised techniques have their advantages, we also explore unsupervised learning in a real-time setting: catching malicious traders in stock exchanges, where mistakes are likewise very costly. Firstly, we propose deep architectures for learning binary instance-specific abstain (reject option) classifiers. To show the effectiveness of the proposed approach, we experiment with several real-world datasets. We observe that the proposed approach not only performs comparably to state-of-the-art approaches but is also robust against label noise. We also provide visualizations of the important features learned by the network for the abstaining decision. Secondly, we look at stock exchanges. Due to their prominence, stock exchanges are prone to a variety of attacks. Stock exchanges deploy different types of proprietary fraud detectors that analyze time series data of traders' activities, or of the activity of a particular stock, to flag potentially malicious transactions, which human analysts then probe further. The key issue is that while the number of anomalous transactions identified can run into thousands or tens of thousands, human analysts can realistically probe only a small fraction of them due to resource constraints. The issue therefore reduces to a dynamic resource allocation problem, in which the alerts that represent the most malicious transactions need to be mapped to human analysts for further probing across different time intervals.
Thirdly, we develop a multiclass deep learning solution with a reject option. Binary solutions are relevant in scenarios such as cancer detection, where "yes" and "no" are the only possible answers, but in scenarios such as autonomous driving, multiple options could be applicable at each instant. We therefore propose a multiclass double ramp loss function. We study the properties of this loss function and establish state-of-the-art results on real-world datasets.
