Build Your Own Face Recognition Tool With Python

by Kyle Stratis · 32-minute read · intermediate · machine learning

Do you have a phone that you can unlock with your face? Have you ever wondered how that works? Have you ever wanted to build your own face recognizer? With Python, some data, and a few helper packages, you can create your very own. In this project, you’ll use face detection and face recognition to identify faces in a given image.

In this tutorial, you’ll build your own face recognition tool using:

  • Face detection to find faces in an image
  • Machine learning to power face recognition for given images
  • Command-line arguments to direct your application with argparse
  • Bounding boxes to label faces with the help of Pillow

With these techniques, you’ll gain a solid foundation in computer vision. After working through this project, you’ll be ready to apply what you’ve learned to real-world problems beyond face recognition.

Demo

When you’re done with this project, you’ll have a face recognition application that you can train on any set of images. Once it’s trained, you’ll be able to give your application a new image, and the app will draw boxes around any faces that it finds and label each face by name.

The demo video for this project shows it in action: training a new model on a list of images, validating it against an image with known faces, and then testing it with a brand-new image. After finishing this tutorial, you’ll have your very own application that works just like this.

Project Overview

Your program will be a typical command-line application, but it’ll offer some impressive capabilities. To accomplish this feat, you’ll first use face detection, or the ability to find faces in an image. Then, you’ll implement face recognition, which is the ability to identify detected faces in an image. To that end, your program will do three primary tasks:

  1. Train a new face recognition model.
  2. Validate the model.
  3. Test the model.

When training, your face recognizer will need to open and read many image files. It’ll also need to know who appears in each one. To accomplish this, you’ll set up a directory structure to give your program information about the data. Specifically, your project directory will contain three data directories:

  1. training/
  2. validation/
  3. output/

You can put images directly into validation/. For training/, you should have images separated by subject into directories with the subject’s name.

Setting your training directory up this way will allow you to give your face recognizer the information that it needs to associate a label—the person pictured—with the underlying image data.
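As a quick sketch of why this layout matters, the label for each image can be read straight from its parent folder’s name with pathlib. The files below are empty stand-ins created in a temporary directory purely for illustration:

```python
from pathlib import Path
import tempfile

# Build a tiny stand-in for the training/ layout described above.
root = Path(tempfile.mkdtemp())
(root / "training" / "ben_affleck").mkdir(parents=True)
(root / "training" / "ben_affleck" / "img_1.jpg").touch()
(root / "training" / "michael_jordan").mkdir(parents=True)
(root / "training" / "michael_jordan" / "img_1.jpg").touch()

# Each image's label is simply the name of the folder it lives in.
labels = sorted(
    filepath.parent.name for filepath in (root / "training").glob("*/*")
)
print(labels)  # → ['ben_affleck', 'michael_jordan']
```

This is exactly the trick the training code later in this tutorial relies on, so getting the folder structure right now saves you from maintaining a separate label file.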

You’ll walk through this project step by step, starting with preparing your environment and data. After that, you’ll be ready to load your training data and get to work on training your model to recognize unlabeled faces.

Once your app is able to do that, you’ll need a way to display your results. You’ll build a command-line interface so that users can interact with your app.

Finally, you’ll run the app through all of its paces. This is of vital importance because it’ll help you see your application through the eyes of a user. That way, you can better understand how your application works in practice, a process that’s key to finding bugs.

Prerequisites

To build this face recognition application, you won’t need advanced linear algebra, deep knowledge of machine learning algorithms, or even any experience with OpenCV, one of the leading Python libraries for computer vision work.

Instead, you should have an intermediate-level understanding of Python, and be comfortable working with third-party packages and the command line.

With these skills in hand, you’ll be more than ready to start on step one of this project: preparing your environment and data.

Step 1: Prepare Your Environment and Data

In this step, you’ll create a project environment, install necessary dependencies, and set the stage for your application.

First, create your project and data directories:

Language: Windows PowerShell
PS> mkdir face_recognizer
PS> cd face_recognizer
PS> mkdir output
PS> mkdir training
PS> mkdir validation
Language: Shell
$ mkdir face_recognizer
$ cd face_recognizer
$ mkdir output training validation

Running these commands creates a directory called face_recognizer/, moves to it, then creates the folders output/, training/, and validation/, which you’ll use throughout the project. Now you can create a virtual environment using the tool of your choice.
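If you don’t have a preferred tool, one option is the venv module from the standard library. The commands below are a sketch for Linux and macOS; on Windows, you’d run the activation script in venv\Scripts\ instead:

```shell
# Create a virtual environment named venv inside face_recognizer/
python3 -m venv venv

# Activate it so that pip installs into this environment only
. venv/bin/activate
```

Once activated, your prompt will typically show a (venv) prefix, as in the pip commands later in this tutorial.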

Before you start installing this project’s dependencies with pip, you’ll need to ensure that you have CMake and a C compiler like gcc installed on your system. If your system doesn’t already have them installed, then follow these instructions to get started:

To install CMake on Windows, visit the CMake downloads page and install the appropriate installer for your system.

You can’t get gcc as a stand-alone download for Windows, but you can install it as a part of the MinGW runtime environment through the Chocolatey package manager with the following command:

Language: Windows PowerShell
PS> choco install mingw

To install CMake on Linux, visit the CMake downloads page and install the appropriate installer for your system. Alternatively, CMake binaries may also be available through your favorite package manager. If you use apt package management, for example, then you can install CMake with this:

Language: Shell
$ sudo apt-get update
$ sudo apt-get install cmake

You’ll also install gcc through your package manager. To install gcc with apt, you’ll install the build-essential metapackage:

Language: Shell
$ sudo apt-get install build-essential

To verify that you’ve successfully installed gcc, you can check the version:

Language: Shell
$ gcc --version

If this returns a version number, then you’re good to go!

To install CMake on macOS, visit the CMake downloads page and install the appropriate installer for your system. If you have Homebrew installed, then you can install both CMake and gcc that way:

Language: Shell
$ brew update
$ brew install cmake gcc

After following these steps for your operating system, you’ll have CMake and gcc installed and ready to assist you in building your project.

Now open your favorite text editor to create your requirements.txt file:

Language: Python Requirements
dlib==19.24.0
face-recognition==1.3.0
numpy==1.24.2
Pillow==9.4.0

This tells pip which dependencies your project will be using and pins them to these specific versions. This is important because future versions could have changes to their APIs that break your code. When you specify the versions needed, you have full control over what versions are compatible with your project.

After creating the requirements file and activating your virtual environment, you can install all of your dependencies at once:

Language: Shell
(venv) $ python -m pip install -r requirements.txt

This command calls pip and tells it to install the dependencies in the requirements.txt file that you just created.

Next, you’ll need to find a dataset for training and validating your data. Celebrity images are a popular choice for testing face recognition because so many celebrity headshots are widely available. That’s the approach that you’ll take in this tutorial.

If you haven’t already, you can download a prepared dataset for training and validation from this tutorial’s supporting materials.

As an alternative, it can be great practice to set up your own dataset and folder structure. If you’d like to give that a try, then you can use this dataset or pictures of your own.

If your dataset isn’t already split into training and validation sets, then you should go ahead and make that split now.
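If your images all start out in one folder per person, a short script can make the split for you. This is just one possible sketch: the 80/20 ratio, the folder names, and the split_dataset() helper are all assumptions rather than part of the project:

```python
import random
import shutil
import tempfile
from pathlib import Path

def split_dataset(
    source: Path, training: Path, validation: Path, ratio: float = 0.8
) -> None:
    """Copy roughly `ratio` of each person's images into training/<name>/
    and the rest into validation/, prefixed with the person's name."""
    for person_dir in source.iterdir():
        if not person_dir.is_dir():
            continue
        images = sorted(person_dir.iterdir())
        random.shuffle(images)
        cutoff = int(len(images) * ratio)
        (training / person_dir.name).mkdir(parents=True, exist_ok=True)
        validation.mkdir(parents=True, exist_ok=True)
        for image in images[:cutoff]:
            shutil.copy(image, training / person_dir.name / image.name)
        for image in images[cutoff:]:
            # Keep the name in the filename so you can still identify
            # who appears in each validation image.
            shutil.copy(image, validation / f"{person_dir.name}_{image.name}")

# Example run against throwaway empty files:
root = Path(tempfile.mkdtemp())
raw = root / "raw" / "ben_affleck"
raw.mkdir(parents=True)
for i in range(5):
    (raw / f"img_{i}.jpg").touch()

split_dataset(root / "raw", root / "training", root / "validation")
```

With five images and a 0.8 ratio, four images land in training/ben_affleck/ and one lands in validation/.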

In the training/ directory, you should create a separate folder for each person who appears in your training images. Then you can put all the images into their appropriate folders:

face_recognizer/
│
├── output/
│
├── training/
│   └── ben_affleck/
│       ├── img_1.jpg
│       └── img_2.png
│
├── validation/
│   ├── ben_affleck1.jpg
│   └── michael_jordan1.jpg
│
├── detector.py
├── requirements.txt
└── unknown.jpg

You can place the validation images directly into the validation/ directory. Your validation images should be images that you didn’t use for training, but in which you can still identify the people who appear.

In this step, you’ve prepared your environment. First, you created a directory and several subdirectories to house your project and its data.

Then you created a virtual environment, installed CMake and a C compiler, and created a requirements.txt file with your project dependencies pinned to specific versions.

With that, you used pip to install your project dependencies. Then, you downloaded a dataset and split it into training and validation sets. Next, you’ll write the code to load the data and train your model.

Step 2: Load Training Data and Train Your Model

In this step, you’ll start writing code. This code will load your training data and start training your model. By the end of this step, you’ll have loaded your training data, detected faces in each image, and saved them as encodings.

First, you’ll need to load images from training/ and train your model on them. To do that, open your favorite editor, create a file called detector.py, and start writing some code:

Language: Python
# detector.py

from pathlib import Path

import face_recognition

DEFAULT_ENCODINGS_PATH = Path("output/encodings.pkl")

Path("training").mkdir(exist_ok=True)
Path("output").mkdir(exist_ok=True)
Path("validation").mkdir(exist_ok=True)

def encode_known_faces(
    model: str = "hog", encodings_location: Path = DEFAULT_ENCODINGS_PATH
) -> None:
    names = []
    encodings = []
    for filepath in Path("training").glob("*/*"):
        name = filepath.parent.name
        image = face_recognition.load_image_file(filepath)

You start your script by importing pathlib.Path from Python’s standard library, along with face_recognition, a third-party library that you installed in the previous step.

Then, you define a constant for the default encoding path. Keeping this path as a constant toward the top of your script will help you down the line if you want to change that path.

Next, you add three calls to .mkdir() and set exist_ok to True. You may not need these lines of code if you already created the three directories in the previous step. However, for convenience, this code automatically creates all the directories that you’ll use if they don’t already exist.

Finally, you define encode_known_faces(). This function uses a for loop to go through each directory within training/, saves the label from each directory into name, then uses the load_image_file() function from face_recognition to load each image.

As input, encode_known_faces() will require a model type and a location to save the encodings that you’ll generate for each image.

The model determines what you’ll use to locate faces in the input images. Valid model type choices are "hog" and "cnn", which refer to the respective algorithms used:

  1. HOG (histogram of oriented gradients) is a common technique for object detection. For this tutorial, you only need to remember that it works best with a CPU.
  2. CNN (convolutional neural network) is another technique for object detection. In contrast to HOG, a CNN works best on a GPU, otherwise known as a video card.

HOG doesn’t rely on deep learning, while a CNN is a deep learning technique. If you’d like to learn more about how algorithms like these work under the hood, then Traditional Face Detection With Python is your guide.

Next, you’ll use face_recognition to detect the face in each image and get its encoding. This is an array of numbers describing the features of the face, and it’s used with the main model underlying face_recognition to reduce training time while improving the accuracy of a large model. This is known as transfer learning.
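To build some intuition for how encodings are compared, here’s a minimal sketch: two encodings match when the distance between them falls below a tolerance threshold. The three-element vectors below are invented stand-ins, since real face_recognition encodings have 128 values, while 0.6 is the library’s default tolerance:

```python
import math

def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy three-dimensional "encodings" -- real ones have 128 values.
known = [0.1, 0.3, 0.5]
candidate_same = [0.12, 0.29, 0.52]
candidate_other = [0.9, 0.1, 0.4]

TOLERANCE = 0.6  # face_recognition's default tolerance

print(euclidean_distance(known, candidate_same) < TOLERANCE)   # True
print(euclidean_distance(known, candidate_other) < TOLERANCE)  # False
```

Faces of the same person tend to produce encodings that sit close together in this feature space, which is what makes the comparison work.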

Then, you’ll add all the names and encodings to the lists names and encodings, respectively:

Language: Python
 1# detector.py
 2
 3# ...
 4
 5def encode_known_faces(
 6    model: str = "hog", encodings_location: Path = DEFAULT_ENCODINGS_PATH
 7) -> None:
 8    names = []
 9    encodings = []
10
11    for filepath in Path("training").glob("*/*"):
12        name = filepath.parent.name
13        image = face_recognition.load_image_file(filepath)
14
15        face_locations = face_recognition.face_locations(image, model=model)
16        face_encodings = face_recognition.face_encodings(image, face_locations)
17
18        for encoding in face_encodings:
19            names.append(name)
20            encodings.append(encoding)

After updating your project with this code, your encode_known_faces() function is ready to collect names and encodings from all the files in your training/ directory:

  • Line 15 uses face_recognition.face_locations() to detect the locations of faces in each image. The function returns a list of four-element tuples, one tuple for each detected face. The four elements per tuple provide the four coordinates of a box that could surround the detected face. Such a box is also known as a bounding box.

  • Line 16 uses face_recognition.face_encodings() to generate encodings for the detected faces in an image. Remember that an encoding is a numeric representation of facial features that’s used to match similar faces by their features.

  • Lines 18 to 20 add the names and their encodings to separate lists.
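To make the bounding-box format concrete, here’s a small illustration. The face_recognition library returns each box as a (top, right, bottom, left) tuple of pixel coordinates; the numbers below are invented for the example:

```python
# A hypothetical bounding box like those from face_recognition.face_locations()
bounding_box = (68, 233, 175, 126)  # (top, right, bottom, left), in pixels

top, right, bottom, left = bounding_box
width = right - left
height = bottom - top
print(width, height)  # → 107 107
```

You’ll unpack tuples in exactly this order later when you draw the boxes with Pillow.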

Now you’ve generated encodings and added them, along with the label for each image, to a list. Next, you’ll combine them into a single dictionary and save that dictionary to disk.

Import pickle from the standard library and use it to save the name-encoding dictionary:

Language: Python
# detector.py

# ...

import pickle

# ...

def encode_known_faces(
    model: str = "hog", encodings_location: Path = DEFAULT_ENCODINGS_PATH
) -> None:
    names = []
    encodings = []

    for filepath in Path("training").glob("*/*"):
        name = filepath.parent.name
        image = face_recognition.load_image_file(filepath)

        face_locations = face_recognition.face_locations(image, model=model)
        face_encodings = face_recognition.face_encodings(image, face_locations)