This paper describes a speaker verification system that uses multi-resolution classifiers to cope with the performance degradation caused by natural variations of the excitation source and of the vocal tract. The different resolution representations of the speaker are obtained by using multiple frame lengths in the feature extraction process, and from these representations a single Pseudo-Multi Parallel Branch (P-MPB) Hidden Markov Model is built. In the verification process, the different resolution representations of the speech signal are classified by multiple P-MPB systems, and the final decision is obtained by combining their outputs with different combination techniques. The system based on the Weighted Majority Vote technique considerably outperforms the baseline systems, with improvements between 15% and 38%. The execution time of the verification process is also evaluated and proves acceptable, which makes the approach usable in real-time applications.
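As a rough illustration of the decision-combination step, the following minimal sketch fuses the accept/reject decisions of several resolution-specific verifiers with a generic weighted majority vote. It is not the paper's implementation: the function name, the frame lengths, the weights, and the reject-on-tie rule are all assumptions made for the example, and the abstract does not say how the weights are chosen.

```python
from typing import Sequence


def weighted_majority_vote(decisions: Sequence[bool], weights: Sequence[float]) -> bool:
    """Accept the claimed identity if the accepting verifiers hold more than
    half of the total weight. Ties are rejected (an assumption of this sketch)."""
    if len(decisions) != len(weights):
        raise ValueError("decisions and weights must have the same length")
    if not decisions:
        raise ValueError("at least one decision is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Sum the weights of the verifiers that voted to accept.
    accept_weight = sum(w for d, w in zip(decisions, weights) if d)
    return accept_weight > total / 2


# Hypothetical setup: one verifier per frame length (all values are illustrative).
frame_lengths_ms = [10, 20, 30]
decisions = [True, False, True]   # accept/reject output of each verifier
weights = [0.5, 0.3, 0.2]         # assumed per-verifier reliability weights
result = weighted_majority_vote(decisions, weights)
```

Here the accepting verifiers hold 0.7 of the total weight, so the claim is accepted. In practice the weights could be derived from each verifier's accuracy on held-out data, but that choice is outside what the abstract states.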
|Digital Object Identifier (DOI):||10.1111/j.1468-0394.2011.00603.x|
|ISI identifier:||WOS:000310732300003|
|Scopus identifier:||2-s2.0-84868528768|
|Appears in categories:||1.1 Journal article|