Heyy!
Sorry for being inactive for a while. Here comes my next post about my summer experience.
Having tried a lot with OpenCV, I gathered some confidence to work on a more challenging problem. So, I decided to work on eye gaze detection over the summer, apart from my internship at Clozerr Inc, where I worked as a full-stack developer. Clozerr is an IIT Madras – incubated startup which fetches offers at popular restaurants. There, I worked on setting up endpoints for integrating geofences, handling user data and some social network integrations.
Hah! Back to our original discussion. There are a lot of implementations of eye detection and pupil localisation, but I didn’t find a concrete method for gaze computation. After a literature survey, I came to know how difficult it is. Anyway, I gave it a shot.
Filtering research papers and implementing some algorithms from them was my first task. I broadly categorised the problem into three parts:
- Pupil localisation
- Face pose estimation
- Combining the above two to get a measure of the person’s gaze.
How I did it
As everyone knows, OpenCV’s default Haar face cascade model is a bit buggy and gives a lot of false detections. After searching a bit, I came to know about Dlib. It has a built-in HOG-based face detector, and its shape predictor locates face keypoints with decent accuracy. This post gives better information about how Dlib’s face keypoint prediction works.
Using some popular research papers on pupil localisation, I was able to detect the pupil (even though the detections in consecutive frames had some deviation, so the output points flickered a lot).
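The post doesn’t name the exact papers, but a well-known pupil localisation method from the literature is the means-of-gradients approach of Timm &amp; Barth (2011). As a sketch of that technique (not necessarily the exact algorithm used here), applied to a cropped grayscale eye region:

```python
import numpy as np

def locate_pupil(eye_gray):
    """Estimate the pupil centre in a cropped grayscale eye image using
    the means-of-gradients objective (after Timm & Barth, 2011): the
    pupil centre is the point whose displacement vectors to strong
    gradient pixels best align with those gradients."""
    img = eye_gray.astype(float)
    gy, gx = np.gradient(img)                  # gradients along rows (y) and cols (x)
    mag = np.hypot(gx, gy)
    mask = mag > mag.mean() + 0.5 * mag.std()  # keep only strong gradient pixels
    ys, xs = np.nonzero(mask)
    gxs = gx[mask] / mag[mask]                 # unit gradient vectors
    gys = gy[mask] / mag[mask]

    h, w = img.shape
    best, best_score = (0, 0), -np.inf
    for cy in range(h):
        for cx in range(w):
            dx, dy = xs - cx, ys - cy
            norm = np.hypot(dx, dy)
            norm[norm == 0] = 1.0
            dots = (dx * gxs + dy * gys) / norm
            # weight candidate centres by darkness: pupils are dark
            score = (255.0 - img[cy, cx]) * np.mean(dots ** 2)
            if score > best_score:
                best_score, best = score, (cy, cx)
    return best  # (row, col) of the estimated pupil centre
```

On a real webcam frame you would first crop the eye region (for example, from Dlib’s eye landmarks) before calling this; the brute-force search is fine for small eye crops but too slow for full frames.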
Solving the flickering problem – Kalman filter to the rescue
Running a Kalman filter on the current state (a combination of the position and velocity of the pupil point) solved the flickering problem. The following article explains the working of a Kalman filter in a simple way:
How a Kalman filter works, in pictures
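Since the filter itself isn’t shown in the post, here is a minimal constant-velocity Kalman filter over a 2-D pupil point. The state layout matches the description above (position plus velocity); the noise covariances Q and R are illustrative guesses you would tune for your own detector:

```python
import numpy as np

# State x = [px, py, vx, vy]; we only ever measure the position [px, py].
dt = 1.0  # one frame per step
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # constant-velocity motion model
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # we observe position only
Q = 1e-3 * np.eye(4)  # process noise: trust the motion model (tunable guess)
R = 4.0 * np.eye(2)   # measurement noise: raw detections jitter (tunable guess)

def kalman_step(x, P, z):
    """One predict/update cycle given state x, covariance P, measurement z."""
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

Feeding each frame’s raw pupil detection in as `z` and drawing `x[:2]` instead of the detection gives the smoothed, non-flickering point.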
Computing facial normal
Now, the actual use of Dlib comes into play. Based on the 68 keypoints which Dlib gives for a detected face, I was able to get a measure of the facial normal. You might wonder how this is possible. Believe me, even I was surprised when I saw the output after implementing a few research works and blog posts.
Estimating Gaze
Having localised the pupil and estimated the facial normal, the final task is to compute the person’s gaze vector. Doing some vector algebra with the pupil location and the facial normal, and fiddling with the face geometry, I developed a decent piece of logic that returns the gaze. You can refer to the source code for more details on how I did it.
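The exact logic lives in the source, but one plausible (purely hypothetical) combination is to start from the facial normal and tilt it in the direction the pupil is displaced from the eye centre. The blending factor `k` and the normalisation by eye width below are illustrative choices, not the author’s actual formula:

```python
import numpy as np

def estimate_gaze(facial_normal, pupil, eye_center, eye_width, k=1.5):
    """Hypothetical gaze combination: the head pose sets the baseline
    direction, and the pupil's offset from the eye centre bends it.
    k controls how strongly the eyes override the head pose (tunable)."""
    offset = (np.asarray(pupil, float) - np.asarray(eye_center, float)) / eye_width
    tilt = np.array([offset[0], offset[1], 0.0])  # lift the 2-D offset into 3-D
    g = np.asarray(facial_normal, float) + k * tilt
    return g / np.linalg.norm(g)
```

With the pupil dead centre the gaze coincides with the facial normal, and it tilts smoothly as the pupil moves, which matches the behaviour described in the output section.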
Output
After working on this for about a month and a half, I got good results on gaze computation. The facial normal and the gaze vector are predicted with good accuracy. The gaze vector is sometimes erroneous when there are sudden fluctuations in the pupil’s position, but it stabilises after a lag of a few frames. The Kalman filter is applied to the gaze vector as well, i.e. the current state of the system also includes the gaze vector 🙂
Sample output images are embedded below. White lines indicate the gaze emerging from the pupil, and black lines indicate the facial normal emerging from the centre of the nose (thanks to Dlib 😉).
The project source can be found here.