Object Detection and Classification Using Machine Learning Techniques: A Comparison of Haar Cascades and Neural Networks

Object recognition and object detection are subfields of computer vision, the discipline concerned with giving computers the ability to perceive and respond to the world around them. The technology has applications across many disciplines. Examples include facial recognition for security; medical uses such as classifying cancers as malignant or benign or detecting sickle cells in blood; automated data collection for research (e.g., counting the trucks that use a particular highway); quality control in agriculture and industry; and games that detect and track a user's movement to control a character. Just as there are many applications, there are many methods of implementing these systems, including convolutional neural networks, Haar cascades, the Scale-Invariant Feature Transform (SIFT), Histogram of Oriented Gradients (HOG), and combinations of these. Each technique has its own setup procedure, and one may work better than another for a given application. This paper measures and compares two of these techniques, Haar cascades and neural networks, for object detection and tracking, looking at their training time, testing time, and accuracy. The purpose is to learn more about these machine learning techniques by comparing how they work and how they were implemented, and in doing so to understand the contexts in which each should be used.
The artifacts used in this paper are a robot that can track an object and an application that classifies an image, that is, determines what the picture shows, based on the set of objects it was trained on.
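The three comparison criteria named above (training time, testing time, and accuracy) can be recorded with a small harness like the one sketched below. This is a minimal illustration and not the measurement code used in this paper: the names `evaluate`, `Result`, `train_threshold`, and `predict_threshold` are hypothetical, and the one-dimensional threshold classifier is only a stand-in for a Haar cascade or a neural network, chosen so that the example runs without any external library.

```python
import time
from typing import Any, Callable, NamedTuple, Sequence


class Result(NamedTuple):
    train_seconds: float
    test_seconds: float
    accuracy: float


def evaluate(
    train: Callable[[Sequence[Any], Sequence[Any]], Any],
    predict: Callable[[Any, Any], Any],
    train_x: Sequence[Any],
    train_y: Sequence[Any],
    test_x: Sequence[Any],
    test_y: Sequence[Any],
) -> Result:
    """Time training and testing, then compute the fraction of correct labels."""
    start = time.perf_counter()
    model = train(train_x, train_y)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = [predict(model, x) for x in test_x]
    test_seconds = time.perf_counter() - start

    correct = sum(p == y for p, y in zip(predictions, test_y))
    return Result(train_seconds, test_seconds, correct / len(test_y))


# Hypothetical stand-in classifier: each sample is a single brightness value,
# and the "model" is the midpoint between the two class means.
def train_threshold(xs: Sequence[float], ys: Sequence[str]) -> float:
    objects = [x for x, y in zip(xs, ys) if y == "object"]
    background = [x for x, y in zip(xs, ys) if y == "background"]
    return (sum(objects) / len(objects) + sum(background) / len(background)) / 2


def predict_threshold(threshold: float, x: float) -> str:
    return "object" if x > threshold else "background"


# Made-up toy data, for illustration only.
train_x = [0.1, 0.2, 0.8, 0.9]
train_y = ["background", "background", "object", "object"]
test_x = [0.05, 0.3, 0.7, 0.45]
test_y = ["background", "background", "object", "object"]

result = evaluate(train_threshold, predict_threshold,
                  train_x, train_y, test_x, test_y)
print(f"train: {result.train_seconds:.6f} s, "
      f"test: {result.test_seconds:.6f} s, "
      f"accuracy: {result.accuracy:.2f}")
```

Passing each technique's own training and prediction functions to the same `evaluate` call keeps the timing and accuracy measurements comparable, since both methods are then scored on identical data with an identical procedure.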