Eye gaze detection

Heyy!

Sorry for being inactive for a while. Here comes my next post about my summer experience.

Having tried a lot with OpenCV, I gathered some confidence to work on a more challenging problem. So, I decided to work on eye gaze detection over the summer, apart from my internship at Clozerr Inc., where I worked as a full-stack developer. Clozerr is an IIT Madras – incubated startup that fetches offers from popular restaurants. There, I worked on setting up endpoints for geofence integration, handling user data and some social network integrations.

Hah! Back to our original discussion. There are a lot of implementations of eye detection and pupil localisation, but I didn't find a concrete method for gaze computation. After a literature survey, I realised how difficult the problem is. Anyway, I gave it a shot.

My first task was filtering research papers and implementing some of the algorithms from them. I broadly split the problem into three parts:

  • Pupil localisation
  • Face pose estimation
  • Combining the above two to get a measure of the person's gaze

How did I do it?

As everyone knows, OpenCV's default Haar face cascade is a bit buggy and gives a lot of false detections. After searching a bit, I came across Dlib. It has a built-in HOG-based face detector and predicts face keypoints with decent accuracy. This post gives more information about how Dlib's face keypoint prediction works.

Using some popular research papers on pupil localisation, I was able to detect the pupil (even though detections in consecutive frames deviated a bit, so the output points flickered a lot).
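The papers I followed have their own algorithms, but the core intuition can be sketched as a toy version: the pupil is the darkest compact region of the eye patch, so threshold the dark pixels and take their centroid (the threshold rule below is my own simplification, not any paper's method):

```python
import numpy as np

def locate_pupil(eye_patch):
    """Centroid (row, col) of the darkest blob in a grayscale eye patch."""
    lo, hi = eye_patch.min(), eye_patch.max()
    mask = eye_patch <= lo + 0.3 * (hi - lo)   # keep only the darkest pixels
    rows, cols = np.nonzero(mask)
    return rows.mean(), cols.mean()

# Synthetic eye patch: bright background with a dark disc as the "pupil".
patch = np.full((50, 50), 200.0)
yy, xx = np.mgrid[0:50, 0:50]
patch[(yy - 30) ** 2 + (xx - 20) ** 2 <= 36] = 20.0
centre = locate_pupil(patch)   # close to (30, 20)
```

On real frames the thresholded blob is noisy, which is exactly why the detections deviated between consecutive frames.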

Solving the flickering problem – Kalman filter to the rescue

Using a Kalman filter on the current state (a combination of the pupil point's position and velocity) solved the flickering problem. The following articles explain how a Kalman filter works in simpler terms:

Kalman filter for dummies

How a Kalman filter works, in pictures
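To give a feel for it, here is a minimal constant-velocity Kalman filter in plain NumPy (a sketch of the idea; the noise parameters and state setup below are illustrative, not the ones from my code):

```python
import numpy as np

def kalman_smooth(measurements, r=4.0, q=0.01):
    """Smooth noisy 2-D points with a constant-velocity Kalman filter.

    State is [x, y, vx, vy]; we only observe (x, y).
    """
    F = np.eye(4)
    F[0, 2] = F[1, 3] = 1.0                        # x += vx, y += vy per frame
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0  # observe position only
    Q = q * np.eye(4)                              # process noise
    R = r * np.eye(2)                              # measurement noise
    x = np.array([measurements[0][0], measurements[0][1], 0.0, 0.0])
    P = np.eye(4)
    out = []
    for z in measurements:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new measurement
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.asarray(z) - H @ x)
        P = (np.eye(4) - K @ H) @ P
        out.append(x[:2].copy())
    return np.array(out)

# A pupil drifting right at 1 px/frame, with jittery measurements.
rng = np.random.default_rng(0)
true_path = np.stack([np.arange(40.0), np.full(40, 25.0)], axis=1)
noisy = true_path + rng.normal(0, 2.0, true_path.shape)
smooth = kalman_smooth(noisy)
```

The smoothed track jitters far less between frames than the raw measurements, which is what kills the flicker.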

Computing facial normal

Now, the actual use of Dlib comes into play. Based on the 68 keypoints which Dlib gives for a detected face, I was able to get a measure of the facial normal. You might wonder how this is possible. Believe me, even I was surprised when I saw the output after implementing a few research papers and blog posts.
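The geometric core is simple once you attach rough 3-D positions to a few landmarks (e.g. from a generic face model; the standard route for this is head pose estimation, but the minimal version is just a cross product). The landmark coordinates below are made-up model values for illustration:

```python
import numpy as np

def facial_normal(p_left_eye, p_right_eye, p_chin):
    """Unit normal of the plane through three (x, y, z) face landmarks."""
    v1 = np.asarray(p_right_eye, float) - np.asarray(p_left_eye, float)
    v2 = np.asarray(p_chin, float) - np.asarray(p_left_eye, float)
    n = np.cross(v1, v2)          # perpendicular to the face plane
    return n / np.linalg.norm(n)

# A frontal face: all three landmarks at z = 0, so the normal points along z.
n = facial_normal([-30, 0, 0], [30, 0, 0], [0, 80, 0])
```

As the head turns, the landmark triangle tilts and the normal swings with it, which is what makes the 68 keypoints enough to recover a head direction.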

Estimating Gaze

Having localised the pupil and estimated the facial normal, the final task was to compute the person's gaze vector. With some vector algebra on the pupil location and facial normal, and a bit of fiddling with face geometry, I developed a decent piece of logic that returns the gaze. You can refer to the source code for more details on how I did it.
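The flavour of that vector algebra is roughly this (a hypothetical combination, not my actual logic; `sensitivity` is a made-up tuning constant): start from the facial normal and tilt it by the pupil's offset from the eye centre.

```python
import numpy as np

def gaze_vector(facial_normal, pupil, eye_centre, sensitivity=0.05):
    """Tilt the facial normal by the pupil's offset from the eye centre.

    `sensitivity` (pixels -> direction units) is an illustrative constant.
    """
    n = np.asarray(facial_normal, float)
    offset = np.asarray(pupil, float) - np.asarray(eye_centre, float)  # 2-D, px
    g = n + sensitivity * np.array([offset[0], offset[1], 0.0])
    return g / np.linalg.norm(g)

# Frontal face, pupil 10 px to the right of the eye centre:
g = gaze_vector([0, 0, 1], pupil=[110, 50], eye_centre=[100, 50])
```

Looking straight ahead gives back the facial normal; moving the pupil sideways bends the gaze in that direction.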

Output

After working on this for about a month and a half, I got good results on gaze computation. The facial normal and the gaze vector are predicted with good accuracy. The gaze vector is sometimes erroneous when there are sudden fluctuations in the pupil's position, but it stabilises after a lag of a few frames. A Kalman filter is applied to the gaze vector as well, i.e. the system's current state also includes the gaze vector 🙂

Sample output images are embedded below. White lines indicate the gaze emerging from the pupil, and black lines indicate the facial normal emerging from the centre of the nose (thanks to Dlib 😉).

sample_output_1

sample_output_2

The project source can be found here.

Haar car classifier

Heyy!

After referring to many online articles about basic image operations and image manipulation techniques, I sat in front of my laptop, bored, with no idea what to do. Suddenly, ideas like face detection and object detection popped out of nowhere.

As I was familiar with OpenCV, I wanted to know what it offers in this context. Luckily, I found a few articles/docs related to this.

Haar cascade

I finally decided to train a Haar classifier on my own for a “popular” object (easy to get datasets 😉), which turned out to be cars. A few minutes of searching got me a good car dataset, so I started looking for resources on how to train a classifier.

This blog post provided a detailed tutorial on how to train a Haar classifier. I had to make minor changes here and there, as there were some problems with the locations of some executables, and the generated XML was incompatible with my version of OpenCV. Apart from that, the training was quick and successful (I used an Amazon EC2 server).
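For reference, a training run with OpenCV's cascade tools looks roughly like this (file names, window sizes and sample counts below are placeholders, not the ones I used, and the exact flags vary between OpenCV versions):

```shell
# Pack positive samples into a .vec file
# (positives.txt lists image paths plus object bounding boxes)
opencv_createsamples -info positives.txt -num 550 -w 24 -h 24 -vec cars.vec

# Train the cascade (negatives.txt lists background image paths);
# the trained cascade.xml ends up in classifier/
opencv_traincascade -data classifier/ -vec cars.vec -bg negatives.txt \
    -numPos 500 -numNeg 1000 -numStages 15 -w 24 -h 24 -featureType HAAR
```

The executable-location and XML-compatibility issues I hit were around these two tools.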

For a detailed description of the tools which OpenCV offers, have a look at this page.

The final model generalised well to real-world images, but there were some false detections here and there. For instance, it classified regions near the mouth/nose as cars 😛

The project source can be found here.

CVJyo – A computer vision approach to Palmistry

Heyy!

I am back with my next post :). As the title says, it is a program that predicts your horoscope from an image of your hand. By the way, CVJyo is short for Computer Vision based Jyothish 😉

To keep the program simple, I restricted the predictions to three measures – heart level, knowledge level and life level (which gives an overall impression of your life). Since this was my second adventure with OpenCV, I thought I would try a variety of methods: watershed segmentation, combinations of Sobel/Laplacian operators, distance transform and some raw contour processing.
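To show the flavour of the edge-based methods, here is a toy Sobel gradient-magnitude pass in plain NumPy (a sketch only; the real code used OpenCV's operators, and the synthetic "palm" below is made up):

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image via 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T                      # vertical-gradient kernel
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):             # correlate with both kernels
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)

# A dark horizontal "palm line" across a bright palm.
palm = np.full((20, 20), 180.0)
palm[10, :] = 60.0
edges = sobel_magnitude(palm)      # strong response along the line's borders
```

Palm lines show up as thin dark creases, so they light up strongly in the gradient magnitude; the hard part was telling them apart from shadows and skin texture.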

The code was written in C++ or Python depending on my mood. Some of the methods listed above worked well for some images but didn't generalise that well. Finally, a solution combining those methods with some hardcoded fixes was developed. It worked well for a variety of images taken under decent lighting conditions.

The project source can be found here.

PS: I added a base score for the three levels (in case the predicted values fall below a threshold), so that the user doesn't get upset 😛