I created an end-to-end facial feature detector and a facial filter app using OpenCV & custom CNN. OpenCV was used to detect faces. CNN was used to train facial feature positions. These were combined to create a facial filter app.
Keywords: Python, OpenCV, CNN, TensorFlow, Keras, Numpy, Pyplot
Related projects: Dog Breed Classifier, Emoji Mimicking Game Github: https://github.com/adrianyi/FacialFilter
Face & eye detection using OpenCV
OpenCV provides a fast face detection algorithm using Haar cascades. There are actually several other pre-trained cascades available.
Here is a GIF of me with my face and eyes tracked:
Here is a code snippet for face detection:
# You first have to turn the image into a gray-scale image. OpenCV also uses BGR color space. gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY) # Create cascade classifier face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml') # Detect the faces in image faces = face_cascade.detectMultiScale(gray, 1.1, 6)
Similarly for eyes:
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml') faces = face_cascade.detectMultiScale(gray, 1.1, 6)
You can use this to do something like blur your face:
for x, y, w, h in faces: image[y:y+h,x:x+w] = cv2.blur(image[y:y+h,x:x+w], (50,50))
Facial feature detection using CNN
Here’s a dataset from Kaggle that has 7049 human labeled images with positions for 30 facial features. Here’s an example from that dataset:
We can use this to train a CNN to track those features! My other project, Emoji Mimicking Game, uses an API by Affectiva that basically does this. There are definitely much more complex ones, but the concept is the same and this is sufficient for our purposes.
Here is a very small vanilla-version CNN, with repeating convolution layer (with ReLu activation) & max pooling. It outputs a bottleneck feature vector of length 128.
_________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d_1 (Conv2D) (None, 96, 96, 8) 80 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 32, 32, 8) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 32, 32, 16) 528 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 16, 16, 16) 0 _________________________________________________________________ conv2d_3 (Conv2D) (None, 16, 16, 32) 2080 _________________________________________________________________ max_pooling2d_3 (MaxPooling2 (None, 8, 8, 32) 0 _________________________________________________________________ conv2d_4 (Conv2D) (None, 8, 8, 64) 8256 _________________________________________________________________ max_pooling2d_4 (MaxPooling2 (None, 4, 4, 64) 0 _________________________________________________________________ conv2d_5 (Conv2D) (None, 4, 4, 128) 32896 _________________________________________________________________ global_average_pooling2d_1 (None, 128) 0 _________________________________________________________________ dropout_1 (Dropout) (None, 128) 0 _________________________________________________________________ dense_1 (Dense) (None, 30) 3870 ================================================================= Total params: 47,710 Trainable params: 47,710 Non-trainable params: 0 _________________________________________________________________
Training this model on the dataset above gives the following learning curve. I ran it for 200 epochs, which only took a couple minutes (<1s/epoch). We can use more complicated model, but this is sufficient for our purposes here.
The loss is RMS with the positions normalized to [-1,1] for both x and y. Both training and validation losses are below 0.002, which is pretty good for such a simple model.
We can test this on my video feed again!
It has a little trouble with the reflections on my glasses, and it is slightly off when my head is tilted. However, it performs pretty well overall. What can we do with this? We already saw that you can try to interpret emotions.
Below is a facial filter using sunglasses!
Now you know how Snapchat implements their facial filters!
Thanks for reading! :smiley: