Enhancement of the coded speech using filtering
Taylan, Salih Sinan
The processing and storage of speech signals are widely implemented in modern communication systems. Reducing the amount of information needed to model and reconstruct a speech signal increases the effective transmission and storage capacity of the system. It is important to compress speech without losing its essential properties during transmission or reconstruction, independently of the speaker and of the speech signal itself. However, some loss inevitably occurs in every compression process, and increasing the compression ratio increases the loss. Speech enhancement algorithms may be used to improve the intelligibility and quality of strongly compressed speech signals. The purpose of this study is to enhance compressed speech signals by applying enhancement algorithms that reduce the background noise introduced by compression. The SYMPES algorithm used in this study compresses data with less loss than other known compression algorithms. As a result of the compression, noise occurs in the background, and the type of this noise cannot be classified. Attempts have been made to reduce this background noise (distortion) by using different speech enhancement algorithms. More than ten speech enhancement algorithms have been investigated and implemented. The two algorithms with the best enhanced sound output were identified and compared. One of them, the Spectral Subtraction Algorithm, was applied via a geometric approach, which was investigated in 2008 by Yang Lu and Philipos C. Loizou. In this algorithm, an estimate of the noise spectrum is subtracted from the noisy speech spectrum to obtain an estimate of the clean signal spectrum. Moreover, the noise spectrum can be estimated and updated during periods in which the speech signal is absent. This approach assumes that the noise spectrum does not change significantly between update periods, that is, that the noise is a stationary or slowly varying process. Only forward and inverse Fourier transforms are used in the algorithm; hence, the algorithm is quite simple.
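The subtraction step described above can be illustrated with a minimal sketch. It shows plain magnitude spectral subtraction on a single frame, not the geometric approach of Lu and Loizou used in the study. The function names, the spectral floor parameter, and the synthetic test signal are all assumptions made for illustration, and a naive DFT from the standard library stands in for the FFT routine a real implementation would use.

```python
import cmath
import math


def dft(frame):
    """Naive forward discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def estimate_noise_mag(noise_frames):
    """Average magnitude spectrum over frames assumed to contain no speech."""
    n = len(noise_frames[0])
    acc = [0.0] * n
    for frame in noise_frames:
        for k, value in enumerate(dft(frame)):
            acc[k] += abs(value)
    return [a / len(noise_frames) for a in acc]


def spectral_subtract(noisy_frame, noise_mag, floor=0.01):
    """Subtract the noise magnitude per bin and keep the noisy phase.

    `floor` (hypothetical parameter) keeps a small fraction of the original
    magnitude so that over-subtraction never produces a negative magnitude.
    """
    cleaned = []
    for value, n_mag in zip(dft(noisy_frame), noise_mag):
        mag = abs(value)
        new_mag = max(mag - n_mag, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(value)))
    return idft(cleaned)


# Synthetic example: a tone stands in for speech, a second tone for the noise.
n = 32
speech = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
noise = [0.3 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
noisy = [s + v for s, v in zip(speech, noise)]

noise_mag = estimate_noise_mag([noise])   # estimated from a speech-free frame
enhanced = spectral_subtract(noisy, noise_mag)

err_before = sum((a - b) ** 2 for a, b in zip(noisy, speech))
err_after = sum((a - b) ** 2 for a, b in zip(enhanced, speech))
```

In this toy case the noise occupies bins that the speech tone does not, so the squared error drops sharply after subtraction. With real speech the spectra overlap, which is why the choice of how much to subtract matters.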
However, the simplicity of spectral subtraction comes at a cost. The subtraction must be done with extreme care to avoid speech distortion. If too much is subtracted, some speech information may be removed; if too little is subtracted, a large part of the interfering noise remains. The other speech enhancement method is a statistical model-based algorithm. This method involves estimating the statistics of the clean and noisy signals from the available samples. In other words, if a speech signal is distorted by statistically independent noise, the marginal probability distributions of the clean speech and of the noise must be known. In this model-based statistical method, signal and noise statistics are estimated primarily from the speech and noise content. An optimal solution is obtained using the statistical models and is then used in conjunction with distortion measures to solve the speech enhancement problem. In this approach, different models have been applied to parameterize speech signals, such as autoregressive moving average (ARMA), autoregressive (AR), or moving average (MA) models. Three estimation rules, known as maximum likelihood (ML), maximum a posteriori (MAP), and minimum mean square error (MMSE), are used in this approach and have many desirable properties for estimating the parameters of the speech signal. ML is used to estimate non-random parameters. MAP and MMSE are used when the parameters are treated as random variables whose prior density function is known in advance. For the speech signal, this model uses the MAP estimation approach, assuming a time-varying AR model for speech enhancement in which both the model parameters and the signal are estimated from the noisy signal. However, since the waveform of the speech signal is distorted by the enhancement process, the SNR results are not a reliable indicator of quality.
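The three estimation rules named above can be written compactly in their standard textbook form. The notation is an assumption introduced here for illustration: y denotes the observed noisy signal, \theta the unknown parameters, s(n) the clean speech, d(n) the additive noise, and a_k, g, p the coefficients, gain, and order of an AR model driven by an excitation w(n).

```latex
% Observation model: noisy speech = clean speech + independent noise
y(n) = s(n) + d(n)

% AR model of order p for the clean speech
s(n) = \sum_{k=1}^{p} a_k \, s(n-k) + g \, w(n)

% Maximum likelihood: parameters treated as deterministic unknowns
\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta} \; p(y \mid \theta)

% Maximum a posteriori: parameters treated as random with prior p(\theta)
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \; p(\theta \mid y)
                            = \arg\max_{\theta} \; p(y \mid \theta) \, p(\theta)

% Minimum mean square error: the posterior mean
\hat{\theta}_{\mathrm{MMSE}} = \mathrm{E}\left[ \theta \mid y \right]
```

The MAP rule reduces to the ML rule when the prior p(\theta) is flat, which is one way to see why ML suits non-random parameters while MAP and MMSE require a known prior density.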
Because SNR is not reliable in this setting, the results are instead evaluated with the Mean Opinion Score (MOS) test, a subjective test carried out on selected utterances. The results of the subjective test are also compared with those of the objective tests to determine the most appropriate objective measure for evaluating speech enhancement algorithms. The strengths and weaknesses of the various algorithms are analyzed and compared. The quality results are presented in detailed graphs based on the MOS, which expresses the speech quality perceived by a listener on a scale of 1 to 5.
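As a small illustration of how a MOS value is obtained, the sketch below averages listener ratings on the 1 to 5 scale. The function name and the ratings are hypothetical and are not data from the study.

```python
from statistics import fmean


def mean_opinion_score(ratings):
    """Return the arithmetic mean of listener ratings on the 1-5 MOS scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("MOS ratings must lie between 1 and 5")
    return fmean(ratings)


# Hypothetical ratings from five listeners for one enhanced utterance.
example_ratings = [4, 3, 5, 4, 4]
mos = mean_opinion_score(example_ratings)
```

Comparing such averages per algorithm and per utterance is one simple way to rank enhancement methods when waveform-based measures are not informative.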