Sunday 9 August 2015

Face-Detection with OpenCV


OpenCV uses machine learning algorithms to search for faces within a picture. A human face can be thought of as being made up of thousands of small features/characteristics. A typical approach to face detection involves checking for these thousands of small features; if enough of them are found, the region is classified as a 'Face'.


Now, the machine learning algorithm that we will discuss here is called the "Viola-Jones Algorithm". This method was proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a Boosted Cascade of Simple Features". It is an effective object detection method based on Haar feature-based cascade classifiers. We will discuss the algorithm in detail separately. Here, I will just brief you through the steps and explain a basic piece of code to get face detection working on your desktop!

Basics:

Initially, the algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from them. The Haar features shown in the image are used for this. Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
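To make that last sentence concrete, here is a minimal C++ sketch of how one such two-rectangle feature value could be computed with an integral image. The file name and the rectangle coordinates are made up purely for illustration; they are not from the actual training process.

#include <opencv2/opencv.hpp>
#include <iostream>

// Sum of the pixels inside a rectangle, read from the integral image in O(1).
static int rectSum(const cv::Mat& integralImg, cv::Rect r)
{
    // integralImg is (rows+1) x (cols+1), CV_32S for an 8-bit input image
    return integralImg.at<int>(r.y, r.x)
         + integralImg.at<int>(r.y + r.height, r.x + r.width)
         - integralImg.at<int>(r.y, r.x + r.width)
         - integralImg.at<int>(r.y + r.height, r.x);
}

int main()
{
    cv::Mat gray = cv::imread("face.jpg", cv::IMREAD_GRAYSCALE); // hypothetical file name
    if (gray.empty()) return 1;

    cv::Mat integralImg;
    cv::integral(gray, integralImg); // table of cumulative pixel sums

    // A two-rectangle "edge" feature: a black (dark) strip above a white (bright) strip.
    // The coordinates are arbitrary; training tries thousands of such rectangles.
    cv::Rect blackRect(10, 10, 24, 6);
    cv::Rect whiteRect(10, 16, 24, 6);
    int featureValue = rectSum(integralImg, blackRect) - rectSum(integralImg, whiteRect);
    std::cout << "Feature value: " << featureValue << std::endl;
    return 0;
}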



For example, consider the image below. The top row shows two good features. The first feature selected seems to focus on the property that the region of the eyes is often darker than the region of the nose and cheeks. The second feature selected relies on the property that the eyes are darker than the bridge of the nose. There are many such features!


We apply each and every feature to all the training images. For each feature, the algorithm finds the best threshold which will classify the images as face or non-face. Obviously, there will be errors and misclassifications. We select the features with the minimum error rate, which means they are the features that best classify the face and non-face images. (We will not go into the details here.)
The final classifier is a weighted sum of these weak classifiers. Each is called weak because on its own it can't classify the image, but together with the others it forms a strong classifier. Combining various such weak classifiers results in around 6000 features. So now you take an image, divide it into smaller windows, apply the 6000 features to every window, and check whether it is a face or not.
Wouldn't this step make the process very time-consuming? I am sure this question was hovering in your mind as you read that step! Instead of applying all 6000 features to a window, we group the features into different stages of classifiers and apply them one by one. (Normally the first few stages contain very few features.) If a window fails the first stage, we discard it and do not consider the remaining features on it. If it passes, we apply the second stage of features and continue the process. A window which passes all the stages is a face region. A toy sketch of this idea follows below.
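Here is that toy sketch in C++, showing weak classifiers grouped into stages and the early rejection of windows. The structures, names and thresholds are invented purely for illustration; this is not how OpenCV implements its cascades internally.

#include <vector>
#include <cstddef>

// A weak classifier simply thresholds a single Haar-feature value.
struct WeakClassifier {
    int featureIndex;     // which Haar feature to look at
    double threshold;     // learned threshold on that feature's value
    double weight;        // vote weight assigned by boosting
};

// A stage is a small group of weak classifiers with a combined pass threshold.
struct Stage {
    std::vector<WeakClassifier> weak;
    double stageThreshold;
};

// featureValues[i] is the i-th Haar-feature value computed for one window.
// The window is rejected as soon as any stage fails, so most non-face windows
// only ever pay the cost of the first (very small) stages.
bool windowIsFace(const std::vector<double>& featureValues,
                  const std::vector<Stage>& stages)
{
    for (std::size_t s = 0; s < stages.size(); ++s) {
        double votes = 0.0;
        for (std::size_t w = 0; w < stages[s].weak.size(); ++w) {
            const WeakClassifier& wc = stages[s].weak[w];
            if (featureValues[wc.featureIndex] > wc.threshold)
                votes += wc.weight;          // this weak classifier votes "face"
        }
        if (votes < stages[s].stageThreshold)
            return false;                    // failed a stage: discard the window
    }
    return true;                             // passed every stage: a face region
}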

So this is a simple intuitive explanation of how Viola-Jones face detection works.

Cascades in practice:


Though the theory may sound complicated, in practice it is quite easy. The cascades themselves are just a bunch of XML files that contain OpenCV data used to detect objects. You initialize your code with the cascade you want, and then it does the work for you. Since face detection is such a common case, OpenCV comes with a number of built-in cascades for detecting everything from faces to eyes to hands and legs. There are even cascades for non-human things.
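For instance, a couple of the built-in cascade XML files can be loaded like this. This is a minimal sketch: the files are assumed to have been copied next to the executable, otherwise give the full path to OpenCV's data/haarcascades directory on your machine.

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::CascadeClassifier faceCascade, eyeCascade;
    // File names are from OpenCV's bundled haarcascades; adjust the paths for your setup.
    bool ok = faceCascade.load("haarcascade_frontalface_alt.xml")
           && eyeCascade.load("haarcascade_eye.xml");
    if (!ok) {
        std::cerr << "Could not load one of the cascade files" << std::endl;
        return 1;
    }
    std::cout << "Cascades loaded; ready to call detectMultiScale()" << std::endl;
    return 0;
}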

Face-Detection in OpenCV:

Here I will assume that you have OpenCV libraries built on your machine. If you have not or you are struggling with it, you could look into Installing OpenCV.

Also, we will be using C++ with OpenCV on Ubuntu to achieve face detection. I use Eclipse CDT for writing code; you can see Configuring Eclipse with OpenCV if you want to use Eclipse, or you can use any other IDE of your choice. Make sure you link the OpenCV libraries to your project, else you are bound to see loads of errors. Check this out if you want to know how to get them linked.


The OpenCV C++ code:
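Below is a minimal sketch of the kind of program described in the explanation that follows. The input file name 'input.jpg' and the window titles are assumptions, and the detection parameters are just the ones discussed below; tune them for your own images.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Load the pre-built frontal face cascade (path is an assumption;
    // copy the XML next to the executable or give the full path).
    cv::CascadeClassifier faceCascade;
    if (!faceCascade.load("haarcascade_frontalface_alt.xml")) {
        std::cerr << "Could not load cascade file" << std::endl;
        return 1;
    }

    // Read the input image (file name is an assumption).
    cv::Mat img = cv::imread("input.jpg");
    if (img.empty()) {
        std::cerr << "Could not read the input image" << std::endl;
        return 1;
    }

    // Convert the color (BGR) image to gray for detection.
    cv::Mat gray;
    cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);

    // Detect faces; scaleFactor = 1.3, minNeighbors is an integer to tune.
    std::vector<cv::Rect> faces;
    faceCascade.detectMultiScale(gray, faces, 1.3, 2);

    // Crop each detected face, then draw a rectangle around it.
    cv::Mat face_image;
    for (size_t i = 0; i < faces.size(); ++i) {
        face_image = img(faces[i]).clone();
        cv::rectangle(img, faces[i], cv::Scalar(0, 255, 0), 2);
    }

    // Show the results.
    cv::namedWindow("Detected faces");
    cv::imshow("Detected faces", img);
    if (!face_image.empty()) {
        cv::namedWindow("Cropped face");
        cv::imshow("Cropped face", face_image);
    }
    cv::waitKey(0);
    return 0;
}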


Code Explained:


First, we load the pre-built classifier 'haarcascade_frontalface_alt.xml' into a variable of type CascadeClassifier using the load() function. Then the image in which you want faces to be detected is read into a variable of the Mat datatype. In my case I am using Mat img.


Now, the image is converted to a gray image using the cvtColor function, where the first argument is the color image (input), the second argument is the gray image (output) and the final argument is the type of conversion (in our case, we are converting a BGR, i.e. color, image to gray).


Now, detectMultiScale() helps us detect the faces. The first argument to this function is the input image, which has to be the gray image. The second argument is the vector which will store the details of the detected faces, such as their locations and dimensions. The next is the scaleFactor, of type double, which is set to 1.3 in my case; you can play around with this value and see the results for yourself. The final argument is minNeighbours, set to 1.5 here; note that this parameter is actually an integer, so a fractional value is simply truncated. Again, experimenting with it will help you find the correct value for your case.


The for loop is used to locate the coordinates of each detected face and draw a rectangle around it. We also crop the face image and store it in the Mat face_image.


namedWindow creates the display windows, and imshow is then used to show the images in them.

Now, compile and run this code, and you can expect the output to look something like this:



Also, the cropped face:


And there you are, ready with face detection working!
Most of this tutorial is easy to understand, but if you are stuck somewhere, do leave comments!

You should now be looking forward to the post on the Viola-Jones algorithm. Subscribe to get regular updates in your mailbox.
