A Comparative Study on MFCC, GFCC, BFCC, and CQCC Spectral Speech Feature Performance in X-Vector Clustering

Abueg, Abelson

Files

Primary ABUEG-MASTERSTHESIS-2023.pdf (1.74 MB)

Date

2023-07-25

Authors

Abueg, Abelson

Publisher

East Carolina University

Abstract

Speaker diarization plays a crucial role in accurately identifying speakers in audio or video streams with multiple speakers. However, the use of Mel-frequency cepstral coefficients (MFCC) as the default speaker feature has posed a significant limitation in speech processing research. Existing literature suggests a lack of research addressing this limitation. This thesis aims to fill this gap by exploring alternative speech features and conducting a comprehensive investigation of their performance in the clustering step of speaker diarization. By conducting a comparative analysis of various spectral features, including Gammatone Frequency Cepstral Coefficients (GFCC), Constant-Q Cepstral Coefficients (CQCC), and Bark Frequency Cepstral Coefficients (BFCC), this study trains four distinct x-vector embedding deep neural networks (DNNs) and evaluates their effectiveness using four clustering algorithms. The results highlight the potential of the investigated alternative spectral features to outperform MFCC, emphasizing the need to move beyond the default MFCC approach and encouraging further exploration of alternative speech features for enhancing speaker diarization and related speech-processing tasks.

Keywords

spectral feature comparison, x-vector, DNN, clustering

URI

http://hdl.handle.net/10342/13177

Collections

Master's Theses
