Face detection is the process of automatically locating human faces in visual media (digital images or video). A face that is detected is reported at a position with an associated size and orientation. Once a face is detected, it can be searched for landmarks such as the eyes and nose.
Here are some of the terms that we use in discussing face detection and the various functionalities of the Mobile Vision API.
Face recognition automatically determines if two faces are likely to correspond to the same person. Note that at this time, the Google Face API only provides functionality for face detection and not face recognition.
Face tracking extends face detection to video sequences. Any face appearing in a video for any length of time can be tracked. That is, faces that are detected in consecutive video frames can be identified as being the same person. Note that this is not a form of face recognition; this mechanism just makes inferences based on the position and motion of the face(s) in a video sequence.
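To make the idea of tracking by position and motion concrete, here is a minimal sketch of a nearest-position matcher. It is not the Face API's actual mechanism, which this document does not describe. The class `NearestFaceTracker`, its `Detection` type, and the `maxJump` parameter are all hypothetical names chosen for illustration. A detection keeps an existing ID only if it lies within `maxJump` pixels of where that face was seen in the previous frame. Otherwise it starts a new track.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical nearest-position tracker; an illustration only, not the Face API's algorithm. */
final class NearestFaceTracker {
    /** Center of a detected face in image coordinates (pixels). */
    static final class Detection {
        final double x;
        final double y;
        Detection(double x, double y) { this.x = x; this.y = y; }
    }

    // Largest per-frame movement (in pixels) still treated as the same face. Assumed value.
    private final double maxJump;
    // Last known position of each tracked face, keyed by tracking ID.
    private final Map<Integer, Detection> lastSeen = new HashMap<>();
    private int nextId = 0;

    NearestFaceTracker(double maxJump) { this.maxJump = maxJump; }

    /** Returns one tracking ID per detection, in the same order as the input list. */
    List<Integer> update(List<Detection> frame) {
        Map<Integer, Detection> current = new HashMap<>();
        List<Integer> ids = new ArrayList<>();
        for (Detection d : frame) {
            int bestId = -1;
            double bestDist = maxJump;
            for (Map.Entry<Integer, Detection> e : lastSeen.entrySet()) {
                if (current.containsKey(e.getKey())) {
                    continue; // this track was already matched in the current frame
                }
                Detection prev = e.getValue();
                double dist = Math.hypot(d.x - prev.x, d.y - prev.y);
                if (dist <= bestDist) {
                    bestDist = dist;
                    bestId = e.getKey();
                }
            }
            if (bestId < 0) {
                bestId = nextId++; // no nearby face in the previous frame: start a new track
            }
            current.put(bestId, d);
            ids.add(bestId);
        }
        // Faces that were not matched in this frame are dropped from tracking.
        lastSeen.clear();
        lastSeen.putAll(current);
        return ids;
    }
}
```

Note that the IDs say nothing about who a person is. A face that leaves the frame and returns later would receive a new ID in this sketch, which is the distinction between tracking and recognition drawn above.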
A landmark is a point of interest within a face. The left eye, right eye, and nose base are all examples of landmarks. The Face API provides the ability to find landmarks on a detected face.
Classification is determining whether a certain facial characteristic is present. For example, a face can be classified with regard to whether its eyes are open or closed. Another example is whether the face is smiling or not.
The Face API detects faces at a range of different angles, as illustrated below:
Fig. 1. Pose angle estimation. (a) The coordinate system with the image in the XY plane and the Z axis coming out of the figure. (b) Pose angle examples, where y = Euler Y and r = Euler Z.
The Euler X, Euler Y, and Euler Z angles characterize a face’s orientation as shown in Fig. 1. The Face API provides measurement of Euler Y and Euler Z (but not Euler X) for detected faces.
The Euler Z angle of the face is always reported. The Euler Y angle is available only when using the “accurate” mode setting of the face detector (as opposed to the “fast” mode setting, which takes some shortcuts to make detection faster). The Euler X angle is currently not supported.
A landmark is a point of interest within a face. The left eye, right eye, and nose base are all examples of landmarks. The figure below shows some examples of landmarks:
Rather than first detecting landmarks and using them as a basis for detecting the whole face, the Face API detects the whole face independently of detailed landmark information. For this reason, landmark detection is an optional step that can be done after the face is detected. It is not done by default, since it takes additional time to run, but you can specify that it should be done.
The following table summarizes the landmarks that can be detected for each range of the face's Euler Y angle:
|Euler Y angle|Detectable landmarks|
|---|---|
|< -36 degrees|left eye, left mouth, left ear, nose base, left cheek|
|-36 degrees to -12 degrees|left mouth, nose base, bottom mouth, right eye, left eye, left cheek, left ear tip|
|-12 degrees to 12 degrees|right eye, left eye, nose base, left cheek, right cheek, left mouth, right mouth, bottom mouth|
|12 degrees to 36 degrees|right mouth, nose base, bottom mouth, left eye, right eye, right cheek, right ear tip|
|> 36 degrees|right eye, right mouth, right ear, nose base, right cheek|
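
The table can be read as a lookup from Euler Y angle to the landmarks you can expect. The sketch below encodes it in plain Java. The class `LandmarkTable` and the method `detectableLandmarks` are hypothetical helpers, not part of the Face API. The table does not say which row the boundary values belong to, so this sketch assumes that exactly -12 and 12 degrees fall in the frontal row and that exactly -36 and 36 degrees fall in the intermediate rows.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical lookup that mirrors the landmark table; not a Face API class. */
final class LandmarkTable {
    private LandmarkTable() {}

    /** Returns the landmark names listed in the table for the given Euler Y angle in degrees. */
    static List<String> detectableLandmarks(double eulerYDegrees) {
        if (eulerYDegrees < -36) {
            return list("left eye", "left mouth", "left ear", "nose base", "left cheek");
        }
        if (eulerYDegrees < -12) { // assumption: -36 itself belongs to this row
            return list("left mouth", "nose base", "bottom mouth", "right eye",
                    "left eye", "left cheek", "left ear tip");
        }
        if (eulerYDegrees <= 12) { // assumption: -12 and 12 belong to the frontal row
            return list("right eye", "left eye", "nose base", "left cheek",
                    "right cheek", "left mouth", "right mouth", "bottom mouth");
        }
        if (eulerYDegrees <= 36) { // assumption: 36 itself belongs to this row
            return list("right mouth", "nose base", "bottom mouth", "left eye",
                    "right eye", "right cheek", "right ear tip");
        }
        return list("right eye", "right mouth", "right ear", "nose base", "right cheek");
    }

    private static List<String> list(String... names) {
        return Collections.unmodifiableList(Arrays.asList(names));
    }
}
```

For example, `LandmarkTable.detectableLandmarks(0)` returns the eight landmarks of the frontal row, while an angle beyond 36 degrees in either direction returns only the five landmarks listed for that side of the face.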
Each detected landmark includes its associated position in the image.
Classification determines whether a certain facial characteristic is present. The Android Face API currently supports two classifications: eyes open and smiling. The iOS Face API currently supports the smiling classification. Classification is expressed as a certainty value, indicating the confidence that the facial characteristic is present. For example, a value of 0.7 or more for the smiling classification indicates that it is likely that a person is smiling.
Both of these classifications rely upon landmark detection.
Also note that the “eyes open” and “smiling” classifications only work for frontal faces, that is, faces with a small Euler Y angle (at most about +/- 18 degrees).
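As a rough illustration of how these two constraints combine, the sketch below accepts a smiling classification only when the face is roughly frontal and the certainty value is at least 0.7. The class `ClassificationCheck` and its methods are hypothetical and not part of the Face API. The 0.7 threshold and the 18 degree limit are taken from the text above. Treating them as hard cutoffs is an assumption, since the text describes the limit as approximate.

```java
/** Hypothetical helper that combines the certainty threshold with the frontal-face limit. */
final class ClassificationCheck {
    // Threshold from the text: 0.7 or more suggests the person is likely smiling.
    static final double SMILING_THRESHOLD = 0.7;
    // Approximate limit from the text: classification works up to about +/- 18 degrees of Euler Y.
    static final double MAX_FRONTAL_EULER_Y_DEGREES = 18.0;

    private ClassificationCheck() {}

    /** True when the face is close enough to frontal for classification to apply. */
    static boolean isFrontal(double eulerYDegrees) {
        return Math.abs(eulerYDegrees) <= MAX_FRONTAL_EULER_Y_DEGREES;
    }

    /** True when the face is frontal and the smiling certainty meets the threshold. */
    static boolean isLikelySmiling(double smilingCertainty, double eulerYDegrees) {
        return isFrontal(eulerYDegrees) && smilingCertainty >= SMILING_THRESHOLD;
    }
}
```

With these assumptions, a certainty of 0.85 at an Euler Y angle of 5 degrees counts as a likely smile, while the same certainty at 30 degrees is rejected because the face is too far from frontal for the value to be meaningful.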
For platform-specific details, please read our face detection guides for iOS and Android.