


25th SPECOM 2023: Dharwad, India - Part I
- Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna: Speech and Computer - 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part I. Lecture Notes in Computer Science 14338, Springer 2023, ISBN 978-3-031-48308-0
Automatic Speech Recognition
- Ivan Peralta, Nanci Odetti, Hugo Leonardo Rufiner: Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks. 3-17
- Denis Ivanko, Elena Ryumina, Dmitry Ryumin, Alexandr Axyonov, Alexey M. Kashevnik, Alexey Karpov: EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition. 18-31
- Tonmoy Rajkhowa, Amartya Chowdhury, S. R. Mahadeva Prasanna: Significance of Audio Quality in Speech-to-Text Translation Systems. 32-42
- Tatiana Y. Sherstinova, Rostislav Kolobov, Nikolay Mikhaylovskiy: Everyday Conversations: A Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level. 43-56
- Raj Gothi, Preeti Rao: Improving Automatic Speech Recognition with Dialect-Specific Language Models. 57-67
- Liudmila Bukreeva, Daria Guseva, Mikhail Dolgushin, Vera Evdokimova, Vasilisa Obotnina: Emotional Speech Recognition of Holocaust Survivors with Deep Neural Network Models for Russian Language. 68-76
Computational Paralinguistics
- Mercedes Vetráb, Gábor Gosztolya: Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks. 79-93
- Kumar Kaustubh, Parismita Gogoi, S. R. M. Prasanna: Rhythm Formant Analysis for Automatic Depression Classification. 94-106
- Pavel Laptev, Sergey Litovkin, Evgeny Kostyuchenko: Determining Alcohol Intoxication Based on Speech and Neural Networks. 107-115
- Baveet Singh Hora, S. Uthiraa, Hemant A. Patil: Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition. 116-129
- Vamshi Raghu Simha Narasinga, Mirishkar Sai Ganesh, Anil Kumar Vuppala: Enhancing Stutter Detection in Speech Using Zero Time Windowing Cepstral Coefficients and Phase Information. 130-141
- Rishith Sadashiv T. N., Devesh Kumar, Ayush Agarwal, Moakala Tzudir, Jagabandhu Mishra, S. R. Mahadeva Prasanna: Source and System-Based Modulation Approach for Fake Speech Detection. 142-155
Digital Signal Processing
- Sergey Novoselov, Galina Lavrentyeva, Vladimir Volokhov, Marina Volkova, Nikita Khmelev, Artem Akulov: Investigation of Different Calibration Methods for Deep Speaker Embedding Based Verification Systems. 159-168
- Punnoose Kuriakose: Learning to Predict Speech Intelligibility from Speech Distortions. 169-176
- Akansha Tyagi, Padmanabhan Rajan: Sparse Representation Frameworks for Acoustic Scene Classification. 177-188
- Mrinmoy Bhattacharjee, Shikha Baghel, S. R. Mahadeva Prasanna: Driver Speech Detection in Real Driving Scenario. 189-199
- Kamini Sabu, Mukesh Sharma, Nitya Tiwari, M. Ali Basha Shaik: Regularization Based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction. 200-209
- Meghna Pandharipande, Sunil Kumar Kopparapu: Candidate Speech Extraction from Multi-speaker Single-Channel Audio Interviews. 210-221
- Lalaram Arya, S. R. Mahadeva Prasanna: Post-processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality. 222-232
- MD. Tousin Akhter, Padmanabha Banerjee, Sandipan Dhar, Subhayu Ghosh, Nanda Dulal Jana: Region Normalized Capsule Network Based Generative Adversarial Network for Non-parallel Voice Conversion. 233-244
- Anuj Patel, G. Satya Prasad, Sabyasachi Chandra, Puja Bharati, Shyamal Kumar Das Mandal: Speech Enhancement Using LinkNet Architecture. 245-257
- Aniket Aitawade, Puja Bharati, Sabyasachi Chandra, G. Satya Prasad, Debolina Pramanik, Parth Sanjay Khadse, Shyamal Kumar Das Mandal: ATT:Adversarial Trained Transformer for Speech Enhancement. 258-270
- Daniyar Wolf, Yaroslav Turovsky, Roman V. Meshcheryakov, Anastasia Iskhakova: Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks. 271-284
Speech Prosody
- Rodmonga Potapova, Vsevolod Potapov, Irina Kuryanova: Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker. 287-300
- Polina Vasileva, Uliana E. Kochetkova, Pavel A. Skrelin: Gestures vs. Prosodic Structure in Laboratory Ironic Speech. 301-313
- Priyankoo Sarmah, Wendy Lalhminghlui, Neeraj Kumar Sharma: Sounds of Silence: Acoustics of Inhalation in Read Speech. 314-321
- Natalia Bogdanova-Beglarian, Kristina Zaides, Daria Stoika, Xiaoli Sun: Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language. 322-338
- Priyanshi Pal, Shelly Jain, Chiranjeevi Yarra, Prasanta Kumar Ghosh, Anil Kumar Vupalla: Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation. 339-349
- Olga Iriskhanova, Maria Kiose, Anna Leonteva, Olga Agafonova, Andrey Petrov: Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment. 350-363
- Arup Saha, Tulika Basu, Bhaskar Gupta: Association of Time Domain Features with Oral Cavity Configuration During Vowel Production and Its Application in Vowel Recognition. 364-379
- Anastasia Gorbyleva: Prosodic Interaction Models in a Conversation. 380-388
Natural Language Processing
- Kirill Apanasovich, Olesia Makhnytkina, Yuri Matveev: Development and Research of Dialogue Agents with Long-Term Memory and Web Search. 391-401
- Liliya Komalova: Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization. 402-414
- Krishnendu Ghosh, Sandipan Mandal, Nilay Roy: Boosting Rule-Based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali. 415-429
- Alexandra Vahrusheva, Valery D. Solovyev, Marina Solnyshkina, Elzara Gafiyatova, Svetlana Akhtyamova: Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations. 430-441
- Snehal Ranjan, Sai Kalyan Nanduri, Prakul Virdi, Chiranjeevi Yarra: Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors Based on Parts-of-Speech. 442-454
- Maria Khokhlova, Olga Blinova, Natalia Bogdanova-Beglarian, Tatiana Y. Sherstinova: On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification. 455-466
Child Speech Processing
- Elena E. Lyakso, Olga V. Frolova, Aleksandr Nikolaev, Egor Kleshnev, Platon Grave, Abylay Ilyas, Olesia Makhnytkina, Ruban Nersisson, A. Mary Mekala, M. Varalakshmi: Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts. 469-482
- Udara Laxman Kumar, Mikko Kurimo, Hemant Kumar Kathania: Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition. 483-493
- Syed Shahnawazuddin, Ankita, Avinash Kumar, Hemant Kumar Kathania: Gammatone-Filterbank Based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR. 494-505
- Soma Khan, Tulika Basu, Joyanta Basu, Madhab Pal, Rajib Roy: System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach. 506-519
- Jayant Kumar Rout, Gayadhar Pradhan: Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children's KWS System. 520-534
- Elena E. Lyakso, Olga V. Frolova, Aleksandr Nikolaev, Severin Grechanyi, Anton Matveev, Yuri Matveev, Olesia Makhnytkina, Ruban Nersisson: Emotional State of Children with ASD and Intellectual Disabilities: Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities. 535-549
- S. Uthiraa, Aastha Kachhi, Hemant A. Patil: Linear Frequency Residual Features for Infant Cry Classification. 550-561
Speech Processing for Medicine
- Sharal Coelho, Hosahalli Lakshmaiah Shashirekha: Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms. 565-578
- Siddharth Rathod, Monil Charola, Hemant A. Patil: Transfer Learning Using Whisper for Dysarthric Automatic Speech Recognition. 579-589
- Oindrila Banerjee, D. Govind, Suryakanth V. Gangashetty, Akhilesh Kumar Dubey, Rajeev Aravindakshan, Sasikumar Panicker, K. Reshma: Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury. 590-600
- Dariya Novokhrestova, Evgeny Kostyuchenko, Ilya Krivoshein, Lidiya N. Balatskaya: Speech Signal Segmentation into Silence, Unvoiced and Vocalized Sections in Speech Rehabilitation. 601-610
- Chandra Mohan Bhuma: Respiratory Sickness Detection from Audio Recordings Using CLIP Models. 611-625
- Rohan Kumar Gupta, Rohit Sinha: Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders Through Spoken Dialogues. 626-637
