FASHE: A FrActal Based Strategy for Head Pose Estimation

Ricciardi S.
2021-01-01

Abstract

Head pose estimation (HPE) is central to many research fields and has a wide range of applications. In particular, HPE performed on a single RGB frame is well suited to best-frame-selection problems. This explains the growing interest in the topic, witnessed by a large number of contributions, most of which exploit deep learning architectures and require extensive training to achieve accuracy and robustness in estimating head rotations about three axes. However, alternatives to machine learning approaches could be capable of similar if not better performance. In this regard, we present FASHE, an approach based on partitioned iterated function systems (PIFS) that represents self-similarities within a face image through a contractive affine function. This function transforms the domain blocks, extracted only once from a single frontal reference image, into a good approximation of the range blocks into which the target image has been partitioned. Pose estimation is achieved by finding the closest match, in terms of Hamming distance, between the fractal code of the target image and a reference array. The experimental results exceed the state of the art on both the Biwi and Pointing'04 datasets and approach those of the best-performing methods on the challenging AFLW2000 database. In addition, the application to the GOTCHA Video Dataset demonstrates that FASHE operates successfully in the wild.
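The matching step described in the abstract, picking the reference entry whose fractal code is closest to the target's in Hamming distance, can be illustrated with a minimal sketch. It is not the authors' implementation: the binary encoding of the fractal codes, the toy values, the (pitch, yaw, roll) labels, and the names hamming_distance and estimate_pose are all assumptions made for illustration, and the PIFS encoding itself is not shown.

```python
from typing import Sequence, Tuple

Pose = Tuple[float, float, float]  # assumed layout: (pitch, yaw, roll) in degrees


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def estimate_pose(
    target_code: Sequence[int],
    reference_codes: Sequence[Sequence[int]],
    reference_poses: Sequence[Pose],
) -> Pose:
    """Return the pose attached to the reference code nearest to the target."""
    if not reference_codes or len(reference_codes) != len(reference_poses):
        raise ValueError("need one pose per reference code, and at least one code")
    distances = [hamming_distance(target_code, code) for code in reference_codes]
    best = min(range(len(distances)), key=distances.__getitem__)
    return reference_poses[best]


# Toy data (hypothetical): three reference codes with known poses.
reference_codes = [
    [0, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
reference_poses = [(0.0, -30.0, 0.0), (0.0, 0.0, 0.0), (0.0, 30.0, 0.0)]
target_code = [1, 0, 1, 0, 0, 1, 0, 0]

print(estimate_pose(target_code, reference_codes, reference_poses))
```

A linear scan is used here for clarity; how the reference array is actually organized and searched is not stated in the abstract.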
Use this identifier to cite or link to this document: https://hdl.handle.net/11695/105921