Journal of Instrument Engineering

Approach to automatic recognition of emotions in speech transcriptions

https://doi.org/10.17586/0021-3454-2023-66-10-818-827

Abstract

The problem of recognizing emotions in speech transcriptions, relevant to many application fields, is studied. The influence of preprocessing methods (stop word removal, lemmatization, stemming) on the accuracy of emotion recognition in Russian and English text data is analyzed. For the experimental studies, orthographic transcriptions of dialogues from the multimodal corpora RAMAS (Russian) and CMU-MOSEI (English) are used. These corpora are annotated with the following emotions: joy, surprise, fear, anger, sadness, disgust, and a neutral state. Preprocessing of the text data includes removal of punctuation marks and stop words, tokenization, lemmatization, and stemming. The resulting material is vectorized using the TF-IDF, BoW, and Word2Vec methods. Support vector machines and logistic regression are used as classifiers. An approach combining the above methods is developed. For Russian, the highest emotion recognition accuracy, measured by the weighted F-measure, is 92.63 %; for English, it is 47.21 %. In addition, studies are conducted to determine how many stop words can be removed while still recognizing emotions from text data effectively. The experimental results show that retaining stop words in the source text yields the highest text classification accuracy.
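A minimal sketch of such a text classification pipeline is given below, assuming Python with scikit-learn and a handful of made-up example transcriptions and labels (none of which come from RAMAS or CMU-MOSEI, and the parameters are not those of the paper). It combines TF-IDF vectorization with a linear support vector machine and reports the weighted F-measure; logistic regression or Word2Vec-based features could be substituted in the same structure.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import Pipeline
    from sklearn.metrics import f1_score

    # Hypothetical toy transcriptions and emotion labels (placeholders only).
    texts = [
        "I am so happy to see you today",
        "This is terrifying, I cannot move",
        "Why would you do that, it makes me furious",
        "I feel empty and I want to cry",
        "What a surprise, I did not expect that at all",
        "That smell is absolutely revolting",
        "The meeting starts at nine in the morning",
    ]
    labels = ["joy", "fear", "anger", "sadness", "surprise", "disgust", "neutral"]

    # Stop words are deliberately kept (stop_words=None), in line with the
    # finding that retaining them gives the best classification accuracy.
    model = Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words=None)),
        ("clf", LinearSVC()),
    ])
    model.fit(texts, labels)

    # Resubstitution score, shown only to illustrate the weighted F-measure.
    predictions = model.predict(texts)
    print("Weighted F1:", f1_score(labels, predictions, average="weighted"))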

About the Authors

A. A. Dvoynikova
St. Petersburg Federal Research Center of the RAS
Russian Federation

Anastasia A. Dvoynikova - Speech and Multimodal Interfaces Laboratory; Junior Researcher

St. Petersburg



K. K. Kondratenko
St. Petersburg State University
Russian Federation

Khrystyna O. Kondratenko - Bachelor

St. Petersburg




For citations:


Dvoynikova A.A., Kondratenko K.K. Approach to automatic recognition of emotions in speech transcriptions. Journal of Instrument Engineering. 2023;66(10):818-827. (In Russ.) https://doi.org/10.17586/0021-3454-2023-66-10-818-827


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 0021-3454 (Print)
ISSN 2500-0381 (Online)