PSHF: Pitch-Scaled Harmonic Filter


Welcome

The Pitch-Scaled Harmonic Filter (PSHF) takes as input a recorded speech wave file and a file containing an estimated f0 (pitch) track, and produces two output wave files corresponding to the periodic and aperiodic sources. These are typically related to the voiced (periodic) and unvoiced (aperiodic) parts of the speech signal. The algorithm applies a harmonic filter in the time-frequency domain, so it can separate these components even when they occur simultaneously, as in mixed-source speech or when voiced speech is corrupted by background noise. An example of the latter is given in Fig. 1.
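The idea of a pitch-scaled harmonic filter can be illustrated with a toy example. The sketch below is not the PSHF implementation distributed here. It assumes a single analysis frame whose length is exactly four pitch periods (the factor of four, the rectangular window and all function names are illustrative assumptions), so that the harmonics of f0 fall on every fourth DFT bin. Keeping only those bins gives an estimate of the periodic component, and the remainder is taken as the aperiodic component. A full implementation would also need tapered windows, frame-by-frame processing driven by the f0 track, and overlap-add resynthesis, none of which is shown.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform (adequate for one short frame)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spectrum):
    """Inverse DFT, returning the real part of each time sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def split_frame(frame, periods_per_frame=4):
    """Split one frame into (periodic, aperiodic) estimates.

    Assumes the frame spans exactly `periods_per_frame` pitch periods, so the
    harmonics of f0 sit on bins that are multiples of `periods_per_frame`.
    The DC bin is left in the aperiodic part (an illustrative choice).
    """
    n = len(frame)
    if n % periods_per_frame != 0:
        raise ValueError("frame length must be a multiple of periods_per_frame")
    spectrum = dft(frame)
    # Keep only the harmonic bins; zero everything else.
    harmonic_only = [
        c if (k != 0 and k % periods_per_frame == 0) else 0j
        for k, c in enumerate(spectrum)
    ]
    periodic = idft_real(harmonic_only)
    # Define the aperiodic part as the residual, so the two always sum to the input.
    aperiodic = [s - p for s, p in zip(frame, periodic)]
    return periodic, aperiodic


# Toy frame: pitch period of 20 samples, frame of 4 periods (80 samples).
period = 20
n = 4 * period
# "Voiced" part: fundamental plus second harmonic (DFT bins 4 and 8).
voiced = [
    math.sin(2 * math.pi * t / period) + 0.5 * math.sin(4 * math.pi * t / period)
    for t in range(n)
]
# Stand-in for noise: a sinusoid at bin 5, which lies between the harmonics.
interferer = [0.3 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
frame = [v + i for v, i in zip(voiced, interferer)]

periodic, aperiodic = split_frame(frame)
print("max error in periodic estimate:",
      max(abs(p - v) for p, v in zip(periodic, voiced)))
```

In this toy case the interferer lies entirely between the harmonic bins, so the separation is essentially exact. Real noise also has energy at the harmonic bins, which a simple bin mask of this kind cannot remove.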

The PSHF algorithm was originally conceived within the Nephthys project for analysis of mixed-source speech, particularly voiced fricatives (e.g., /z/ and /v/). It was developed and re-implemented in C to process large amounts of speech data for automatic speech recognition (ASR) research under the Columbo project. It is provided here free of charge for general use, including both academic research and commercial product development.

Updates

9/12/10: Version 3.13 released for general use.

Selected research papers

PJB Jackson, CH Shadle. "Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech". IEEE Transactions on Speech and Audio Processing, 9 (7): 713-726, Oct 2001.

PJB Jackson, DM Moreno, MJ Russell, J Hernando. "Covariation and weighting of harmonically decomposed streams for ASR". In Proceedings of EUROSPEECH 2003, 2321-2324, Geneva, Switzerland, Sept 2003.

Related research projects

Nephthys project, turbulence noise in mixed-source speech production

Columbo project, harmonic decomposition applied to automatic speech recognition


Figure 1. Female speech "one-three-four-five": (left) noise-corrupted speech, (right) periodic component extracted by the PSHF.