Peculiar features of using the stimulus material for creating the emotionally coloured speech database
Authors: Horava A.V.
Published in issue: #6(23)/2018
DOI: 10.18698/2541-8009-2018-6-334
Category: Medical sciences | Chapter: Medical equipment and devices
Keywords: basic emotions, signal characteristics, speech signal intensity, base frequency, first formant frequencies, tempo of speech
Published: 21.06.2018
Speech is one of the signals the human brain uses to assess a person's emotional state. Automatic emotion recognition from speech is currently an actively developing line of research. The output of a speech emotion recognition algorithm is determined mainly by the database used to train it. At present there is no publicly available database of emotionally coloured Russian speech, and this paper attempts to address that gap. The article describes the stimulus material used to induce emotions in the speaker and provides the parameters of the individual stimuli (texts and video recordings) used in building the database.