PIFS Scheme for HEad pose Estimation aimed at faster Face recognition

Ricciardi, Stefano
2021-01-01

Abstract

In 2D face biometrics, pose is a challenging intra-class variation that has been approached over the years in a number of ways: through inherently pose-invariant features, through specifically trained deep networks, or through frontalization techniques that aim to restore a canonical pose. In this regard, the idea behind this paper is to perform head pose estimation (HPE) before the face recognition step, so that the probe's estimated yaw, pitch and roll values can be used to find the pose-wise closest elements in the gallery, thus reducing one-to-many matching time. We adopt a common approach to both pose estimation and feature matching: fractal encoding is used to extract a feature vector, which is then compared to a reference template through a metric distance, for both HPE and face recognition. Experiments conducted on the Biwi Kinect Head Pose database and the AFLW2000 dataset show state-of-the-art precision for head pose estimation, along with a 10x efficiency boost in the overall face recognition task compared to popular deep learning approaches, while also achieving an edge in recognition accuracy.
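The pose-driven gallery reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Template` class, the `identify` function, the `pose_tolerance` threshold and the toy feature vectors are all hypothetical, and plain Euclidean distance on short lists stands in for the metric distance on fractal (PIFS) codes, which the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical gallery entry: identity, estimated pose, feature vector."""
    identity: str
    pose: tuple      # (yaw, pitch, roll) in degrees
    features: list   # placeholder for a fractal-encoding feature vector


def pose_distance(a, b):
    # Largest absolute difference over yaw, pitch and roll (an assumed choice).
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean(u, v):
    # Stand-in metric; the paper's actual distance is not given in the abstract.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def identify(probe_pose, probe_features, gallery, pose_tolerance=15.0):
    """Match the probe only against gallery templates with a similar pose."""
    candidates = [t for t in gallery
                  if pose_distance(t.pose, probe_pose) <= pose_tolerance]
    if not candidates:
        # No pose-wise neighbours: fall back to the full one-to-many search.
        candidates = gallery
    best = min(candidates, key=lambda t: euclidean(t.features, probe_features))
    return best.identity, len(candidates)


# Toy gallery: two subjects, each enrolled at two poses (made-up values).
gallery = [
    Template("alice", (0, 0, 0), [0.1, 0.9]),
    Template("alice", (30, 0, 0), [0.2, 0.8]),
    Template("bob", (0, 0, 0), [0.9, 0.1]),
    Template("bob", (30, 5, 0), [0.8, 0.2]),
]

identity, compared = identify((28, 3, 1), [0.25, 0.75], gallery)
print(identity, compared)  # alice 2
```

In this toy run only two of the four templates are compared against the probe, which is the source of the matching-time reduction. The actual speed-up depends on how the gallery's poses are distributed and on the tolerance chosen.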
Use this identifier to cite or link to this document: https://hdl.handle.net/11695/105939