25th SPECOM 2023: Dharwad, India - Part I
- Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna:
Speech and Computer - 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part I. Lecture Notes in Computer Science 14338, Springer 2023, ISBN 978-3-031-48308-0
Automatic Speech Recognition
- Ivan Peralta, Nanci Odetti, Hugo Leonardo Rufiner:
Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks. 3-17
- Denis Ivanko, Elena Ryumina, Dmitry Ryumin, Alexandr Axyonov, Alexey M. Kashevnik, Alexey Karpov:
EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition. 18-31
- Tonmoy Rajkhowa, Amartya Chowdhury, S. R. Mahadeva Prasanna:
Significance of Audio Quality in Speech-to-Text Translation Systems. 32-42
- Tatiana Y. Sherstinova, Rostislav Kolobov, Nikolay Mikhaylovskiy:
Everyday Conversations: A Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level. 43-56
- Raj Gothi, Preeti Rao:
Improving Automatic Speech Recognition with Dialect-Specific Language Models. 57-67
- Liudmila Bukreeva, Daria Guseva, Mikhail Dolgushin, Vera Evdokimova, Vasilisa Obotnina:
Emotional Speech Recognition of Holocaust Survivors with Deep Neural Network Models for Russian Language. 68-76
Computational Paralinguistics
- Mercedes Vetráb, Gábor Gosztolya:
Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks. 79-93
- Kumar Kaustubh, Parismita Gogoi, S. R. M. Prasanna:
Rhythm Formant Analysis for Automatic Depression Classification. 94-106
- Pavel Laptev, Sergey Litovkin, Evgeny Kostyuchenko:
Determining Alcohol Intoxication Based on Speech and Neural Networks. 107-115
- Baveet Singh Hora, S. Uthiraa, Hemant A. Patil:
Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition. 116-129
- Vamshi Raghu Simha Narasinga, Mirishkar Sai Ganesh, Anil Kumar Vuppala:
Enhancing Stutter Detection in Speech Using Zero Time Windowing Cepstral Coefficients and Phase Information. 130-141
- Rishith Sadashiv T. N., Devesh Kumar, Ayush Agarwal, Moakala Tzudir, Jagabandhu Mishra, S. R. Mahadeva Prasanna:
Source and System-Based Modulation Approach for Fake Speech Detection. 142-155
Digital Signal Processing
- Sergey Novoselov, Galina Lavrentyeva, Vladimir Volokhov, Marina Volkova, Nikita Khmelev, Artem Akulov:
Investigation of Different Calibration Methods for Deep Speaker Embedding Based Verification Systems. 159-168
- Punnoose Kuriakose:
Learning to Predict Speech Intelligibility from Speech Distortions. 169-176
- Akansha Tyagi, Padmanabhan Rajan:
Sparse Representation Frameworks for Acoustic Scene Classification. 177-188
- Mrinmoy Bhattacharjee, Shikha Baghel, S. R. Mahadeva Prasanna:
Driver Speech Detection in Real Driving Scenario. 189-199
- Kamini Sabu, Mukesh Sharma, Nitya Tiwari, M. Ali Basha Shaik:
Regularization Based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction. 200-209
- Meghna Pandharipande, Sunil Kumar Kopparapu:
Candidate Speech Extraction from Multi-speaker Single-Channel Audio Interviews. 210-221
- Lalaram Arya, S. R. Mahadeva Prasanna:
Post-processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality. 222-232
- MD. Tousin Akhter, Padmanabha Banerjee, Sandipan Dhar, Subhayu Ghosh, Nanda Dulal Jana:
Region Normalized Capsule Network Based Generative Adversarial Network for Non-parallel Voice Conversion. 233-244
- Anuj Patel, G. Satya Prasad, Sabyasachi Chandra, Puja Bharati, Shyamal Kumar Das Mandal:
Speech Enhancement Using LinkNet Architecture. 245-257
- Aniket Aitawade, Puja Bharati, Sabyasachi Chandra, G. Satya Prasad, Debolina Pramanik, Parth Sanjay Khadse, Shyamal Kumar Das Mandal:
ATT:Adversarial Trained Transformer for Speech Enhancement. 258-270
- Daniyar Wolf, Yaroslav Turovsky, Roman V. Meshcheryakov, Anastasia Iskhakova:
Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks. 271-284
Speech Prosody
- Rodmonga Potapova, Vsevolod Potapov, Irina Kuryanova:
Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker. 287-300
- Polina Vasileva, Uliana E. Kochetkova, Pavel A. Skrelin:
Gestures vs. Prosodic Structure in Laboratory Ironic Speech. 301-313
- Priyankoo Sarmah, Wendy Lalhminghlui, Neeraj Kumar Sharma:
Sounds of <sil>ence: Acoustics of Inhalation in Read Speech. 314-321
- Natalia Bogdanova-Beglarian, Kristina Zaides, Daria Stoika, Xiaoli Sun:
Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language. 322-338
- Priyanshi Pal, Shelly Jain, Chiranjeevi Yarra, Prasanta Kumar Ghosh, Anil Kumar Vuppala:
Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation. 339-349
- Olga Iriskhanova, Maria Kiose, Anna Leonteva, Olga Agafonova, Andrey Petrov:
Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment. 350-363
- Arup Saha, Tulika Basu, Bhaskar Gupta:
Association of Time Domain Features with Oral Cavity Configuration During Vowel Production and Its Application in Vowel Recognition. 364-379
- Anastasia Gorbyleva:
Prosodic Interaction Models in a Conversation. 380-388
Natural Language Processing
- Kirill Apanasovich, Olesia Makhnytkina, Yuri Matveev:
Development and Research of Dialogue Agents with Long-Term Memory and Web Search. 391-401
- Liliya Komalova:
Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization. 402-414
- Krishnendu Ghosh, Sandipan Mandal, Nilay Roy:
Boosting Rule-Based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali. 415-429
- Alexandra Vahrusheva, Valery D. Solovyev, Marina Solnyshkina, Elzara Gafiyatova, Svetlana Akhtyamova:
Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations. 430-441
- Snehal Ranjan, Sai Kalyan Nanduri, Prakul Virdi, Chiranjeevi Yarra:
Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors Based on Parts-of-Speech. 442-454
- Maria Khokhlova, Olga Blinova, Natalia Bogdanova-Beglarian, Tatiana Y. Sherstinova:
On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification. 455-466
Child Speech Processing
- Elena E. Lyakso, Olga V. Frolova, Aleksandr Nikolaev, Egor Kleshnev, Platon Grave, Abylay Ilyas, Olesia Makhnytkina, Ruban Nersisson, A. Mary Mekala, M. Varalakshmi:
Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts. 469-482
- Udara Laxman Kumar, Mikko Kurimo, Hemant Kumar Kathania:
Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition. 483-493
- Syed Shahnawazuddin, Ankita, Avinash Kumar, Hemant Kumar Kathania:
Gammatone-Filterbank Based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR. 494-505
- Soma Khan, Tulika Basu, Joyanta Basu, Madhab Pal, Rajib Roy:
System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach. 506-519
- Jayant Kumar Rout, Gayadhar Pradhan:
Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children's KWS System. 520-534
- Elena E. Lyakso, Olga V. Frolova, Aleksandr Nikolaev, Severin Grechanyi, Anton Matveev, Yuri Matveev, Olesia Makhnytkina, Ruban Nersisson:
Emotional State of Children with ASD and Intellectual Disabilities: Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities. 535-549
- S. Uthiraa, Aastha Kachhi, Hemant A. Patil:
Linear Frequency Residual Features for Infant Cry Classification. 550-561
Speech Processing for Medicine
- Sharal Coelho, Hosahalli Lakshmaiah Shashirekha:
Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms. 565-578
- Siddharth Rathod, Monil Charola, Hemant A. Patil:
Transfer Learning Using Whisper for Dysarthric Automatic Speech Recognition. 579-589
- Oindrila Banerjee, D. Govind, Suryakanth V. Gangashetty, Akhilesh Kumar Dubey, Rajeev Aravindakshan, Sasikumar Panicker, K. Reshma:
Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury. 590-600
- Dariya Novokhrestova, Evgeny Kostyuchenko, Ilya Krivoshein, Lidiya N. Balatskaya:
Speech Signal Segmentation into Silence, Unvoiced and Vocalized Sections in Speech Rehabilitation. 601-610
- Chandra Mohan Bhuma:
Respiratory Sickness Detection from Audio Recordings Using CLIP Models. 611-625
- Rohan Kumar Gupta, Rohit Sinha:
Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders Through Spoken Dialogues. 626-637