10th International Scientific Conference Technics, Informatics and Education – TIE 2024, pp. 74-78

AUTHOR(S): Branko R. Marković, Milan Vesković

DOI: 10.46793/TIE24.074M

ABSTRACT:

This paper presents speech recognition results obtained with Daubechies wavelets of different orders. Two speakers (one female and one male) were analyzed in two speech modes: normal and whisper. The speech patterns were taken from the Whi-Spe database. Daubechies wavelet feature vectors of different orders served as the input to the recognition system, and the standard Dynamic Time Warping (DTW) method served as its back end. The results are given in the form of tables and histograms. They indicate which Daubechies order is most suitable for this kind of speech recognition.
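
To make the back end mentioned above concrete, here is a minimal sketch of template matching with Dynamic Time Warping. It is not the authors' implementation. It assumes that each utterance has already been converted to a sequence of feature vectors (for example, per-frame wavelet features), and it uses the Euclidean distance as the local cost and the basic three-neighbour step pattern. Both are common defaults chosen for illustration, not details taken from the paper. The names `dtw_distance` and `recognize` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors.

    a, b: lists of equal-dimension vectors (lists of floats).
    Assumption: Euclidean local distance, steps (i-1,j), (i,j-1), (i-1,j-1).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(test, templates):
    """Return the label of the reference template closest to `test` under DTW."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))


# Toy one-dimensional "feature vectors", purely illustrative.
templates = {
    "rising": [[0.0], [1.0], [2.0]],
    "falling": [[2.0], [1.0], [0.0]],
}
test_pattern = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched rising pattern
best_label = recognize(test_pattern, templates)
```

In this toy case the stretched pattern aligns with the "rising" template at zero cost, so that label is selected. Comparing wavelet orders would amount to repeating such matching with feature vectors computed for each order.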

KEYWORDS:

Speech recognition; Discrete Wavelet Transformation (DWT); Daubechies; Whi-Spe database; Dynamic Time Warping (DTW)

ACKNOWLEDGEMENTS:

This study was supported by the Ministry of Science, Technological Development and Innovation of the Republic of Serbia, and these results are part of Grant No. 451-03-66/2024-03/200132 with the University of Kragujevac – Faculty of Technical Sciences Čačak.
