- 2023
- Shaimaa Alwaisi, Mohammed Salah Al-Radhi, Géza Németh:
Universal Approach to Multilingual Multispeaker Child Speech Synthesis. SSW 2023: 236-237
- Gérard Bailly, Martin Lenglet, Olivier Perrotin, Esther Klabbers:
Advocating for text input in multi-speaker text-to-speech systems. SSW 2023: 1-7
- Haolin Chen, Philip N. Garner:
Diffusion Transformer for Adaptive Text-to-Speech. SSW 2023: 157-162
- Arnab Das, Suhita Ghosh, Tim Polzehl, Ingo Siegert, Sebastian Stober:
StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings. SSW 2023: 81-87
- Daria Diatlova, Vitalii Shutov:
EmoSpeech: guiding FastSpeech2 towards Emotional Text to Speech. SSW 2023: 106-112
- Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers:
Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection. SSW 2023: 21-26
- Jarod Duret, Yannick Estève, Titouan Parcollet:
Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data. SSW 2023: 184-190
- Mikey Elmers, Éva Székely:
The Impact of Pause-Internal Phonetic Particles on Recall in Synthesized Lectures. SSW 2023: 204-210
- Lev Finkelstein, Joshua Camp, Rob Clark:
Importance of Human Factors in Text-To-Speech Evaluations. SSW 2023: 27-33
- Lev Finkelstein, Chun-an Chan, Vincent Wan, Heiga Zen, Rob Clark:
FiPPiE: A Computationally Efficient Differentiable method for Estimating Fundamental Frequency From Spectrograms. SSW 2023: 218-224
- Seraphina Fong, Marco Matassoni, Gianluca Esposito, Alessio Brutti:
Towards Speaker-Independent Voice Conversion for Improving Dysarthric Speech Intelligibility. SSW 2023: 238-239
- Jason Fong, Hao Tang, Simon King:
Spell4TTS: Acoustically-informed spellings for improving text-to-speech pronunciations. SSW 2023: 8-13
- David Guennec, Lily Wadoux, Aghilas Sini, Nelly Barbot, Damien Lolive:
Voice Cloning: Training Speaker Selection with Limited Multi-Speaker Corpus. SSW 2023: 170-176
- Ryunosuke Hirai, Yuki Saito, Hiroshi Saruwatari:
Federated Learning for Human-in-the-Loop Many-to-Many Voice Conversion. SSW 2023: 94-99
- Ibrahim Ibrahimov, Gábor Gosztolya, Tamás Gábor Csapó:
Data Augmentation Methods on Ultrasound Tongue Images for Articulation-to-Speech Synthesis. SSW 2023: 230-235
- Maxime Jacquelin, Maeva Garnier, Laurent Girin, Rémy Vincent, Olivier Perrotin:
Exploring the multidimensional representation of individual speech acoustic parameters extracted by deep unsupervised models. SSW 2023: 240-241
- Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova:
Controllable Emphasis with zero data for text-to-speech. SSW 2023: 113-119
- Sofoklis Kakouros, Juraj Simko, Martti Vainio, Antti Suni:
Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody. SSW 2023: 127-133
- Anton Kashkin, Ivan Karpukhin, Svyatoslav Shishkin:
HiFi-VC: High Quality ASR-based Voice Conversion. SSW 2023: 100-105
- Ambika Kirkland, Shivam Mehta, Harm Lameris, Gustav Eje Henter, Éva Székely, Joakim Gustafson:
Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation. SSW 2023: 41-47
- Kishor Kayyar Lakshminarayana, Christian Dittmar, Nicola Pia, Emanuël A. P. Habets:
Subjective Evaluation of Text-to-Speech Models: Comparing Absolute Category Rating and Ranking by Elimination Tests. SSW 2023: 191-196
- Harm Lameris, Ambika Kirkland, Joakim Gustafson, Éva Székely:
Situating Speech Synthesis: Investigating Contextual Factors in the Evaluation of Conversational TTS. SSW 2023: 69-74
- Martin Lenglet, Olivier Perrotin, Gérard Bailly:
Local Style Tokens: Fine-Grained Prosodic Representations For TTS Expressive Control. SSW 2023: 120-126
- Zhu Li, Xiyuan Gao, Shekhar Nayak, Matt Coler:
SarcasticSpeech: Speech Synthesis for Sarcasm in Low-Resource Scenarios. SSW 2023: 242-243
- Johannes A. Louw:
Cross-lingual transfer using phonological features for resource-scarce text-to-speech. SSW 2023: 55-61
- Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion. SSW 2023: 62-68
- Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter:
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis. SSW 2023: 150-156
- Marcel Granero Moya, Penny Karanasou, Sri Karlapati, Bastian Schnell, Nicole Peinelt, Alexis Moinet, Thomas Drugman:
A Comparative Analysis of Pretrained Language Models for Text-to-Speech. SSW 2023: 14-20
- Johannah O'Mahony, Catherine Lai, Simon King:
Synthesising turn-taking cues using natural conversational data. SSW 2023: 75-80
- Ondrej Plátek, Ondrej Dusek:
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module. SSW 2023: 48-54