12th SSW 2023: Grenoble, France
- Gérard Bailly, Thomas Hueber, Damien Lolive, Nicolas Obin, Olivier Perrotin:
12th ISCA Speech Synthesis Workshop, SSW 2023, Grenoble, France, August 26-28, 2023. ISCA 2023
Orals 1: TTS input
- Gérard Bailly, Martin Lenglet, Olivier Perrotin, Esther Klabbers:
Advocating for text input in multi-speaker text-to-speech systems. 1-7
- Jason Fong, Hao Tang, Simon King:
Spell4TTS: Acoustically-informed spellings for improving text-to-speech pronunciations. 8-13
- Marcel Granero Moya, Penny Karanasou, Sri Karlapati, Bastian Schnell, Nicole Peinelt, Alexis Moinet, Thomas Drugman:
A Comparative Analysis of Pretrained Language Models for Text-to-Speech. 14-20
- Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers:
Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection. 21-26
Orals 2: Evaluation
- Lev Finkelstein, Joshua Camp, Rob Clark:
Importance of Human Factors in Text-To-Speech Evaluations. 27-33
- Fritz Seebauer, Michael Kuhlmann, Reinhold Haeb-Umbach, Petra Wagner:
Re-examining the quality dimensions of synthetic speech. 34-40
- Ambika Kirkland, Shivam Mehta, Harm Lameris, Gustav Eje Henter, Éva Székely, Joakim Gustafson:
Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation. 41-47
- Ondrej Plátek, Ondrej Dusek:
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module. 48-54
Orals 3: Beyond text-to-speech
- Johannes A. Louw:
Cross-lingual transfer using phonological features for resource-scarce text-to-speech. 55-61
- Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion. 62-68
- Harm Lameris, Ambika Kirkland, Joakim Gustafson, Éva Székely:
Situating Speech Synthesis: Investigating Contextual Factors in the Evaluation of Conversational TTS. 69-74
- Johannah O'Mahony, Catherine Lai, Simon King:
Synthesising turn-taking cues using natural conversational data. 75-80
Orals 4: Voice conversion
- Arnab Das, Suhita Ghosh, Tim Polzehl, Ingo Siegert, Sebastian Stober:
StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings. 81-87
- Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko:
PRVAE-VC: Non-Parallel Many-to-Many Voice Conversion with Perturbation-Resistant Variational Autoencoder. 88-93
- Ryunosuke Hirai, Yuki Saito, Hiroshi Saruwatari:
Federated Learning for Human-in-the-Loop Many-to-Many Voice Conversion. 94-99
- Anton Kashkin, Ivan Karpukhin, Svyatoslav Shishkin:
HiFi-VC: High Quality ASR-based Voice Conversion. 100-105
Orals 5: Expressivity, emotion and styles
- Daria Diatlova, Vitalii Shutov:
EmoSpeech: guiding FastSpeech2 towards Emotional Text to Speech. 106-112
- Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova:
Controllable Emphasis with zero data for text-to-speech. 113-119
- Martin Lenglet, Olivier Perrotin, Gérard Bailly:
Local Style Tokens: Fine-Grained Prosodic Representations For TTS Expressive Control. 120-126
- Sofoklis Kakouros, Juraj Simko, Martti Vainio, Antti Suni:
Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody. 127-133
Orals 6: Long form, multimodal & multi-speaker TTS
- Adriana Stan, Johannah O'Mahony:
An analysis on the effects of speaker embedding choice in non auto-regressive TTS. 134-138
- Weicheng Zhang, Cheng-chieh Yeh, Will Beckman, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky:
Audiobook synthesis with long-form neural text-to-speech. 139-143
- Tuomo Raitio, Javier Latorre, Andrea Davis, Tuuli Morrill, Ladan Golipour:
Improving the quality of neural TTS using long-form content and multi-speaker multi-style modeling. 144-149
- Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter:
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis. 150-156
Posters SSW
- Haolin Chen, Philip N. Garner:
Diffusion Transformer for Adaptive Text-to-Speech. 157-162
- Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely:
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis. 163-169
- David Guennec, Lily Wadoux, Aghilas Sini, Nelly Barbot, Damien Lolive:
Voice Cloning: Training Speaker Selection with Limited Multi-Speaker Corpus. 170-176
- Ravi Shankar, Archana Venkataraman:
Adaptive Duration Modification of Speech using Masked Convolutional Networks and Open-Loop Time Warping. 177-183
- Jarod Duret, Yannick Estève, Titouan Parcollet:
Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data. 184-190
- Kishor Kayyar Lakshminarayana, Christian Dittmar, Nicola Pia, Emanuël A. P. Habets:
Subjective Evaluation of Text-to-Speech Models: Comparing Absolute Category Rating and Ranking by Elimination Tests. 191-196
- Sajad Shirali-Shahreza, Gerald Penn:
Better Replacement for TTS Naturalness Evaluation. 197-203
- Mikey Elmers, Éva Székely:
The Impact of Pause-Internal Phonetic Particles on Recall in Synthesized Lectures. 204-210
- Takenori Yoshimura, Takato Fujimoto, Keiichiro Oura, Keiichi Tokuda:
SPTK4: An Open-Source Software Toolkit for Speech Signal Processing. 211-217
- Lev Finkelstein, Chun-an Chan, Vincent Wan, Heiga Zen, Rob Clark:
FiPPiE: A Computationally Efficient Differentiable method for Estimating Fundamental Frequency From Spectrograms. 218-224
- Biel Tura Vecino, Adam Gabrys, Daniel Matwicki, Andrzej Pomirski, Tom Iddon, Marius Cotescu, Jaime Lorenzo-Trueba:
Lightweight End-to-end Text-to-speech Synthesis for low resource on-device applications. 225-229
- Ibrahim Ibrahimov, Gábor Gosztolya, Tamás Gábor Csapó:
Data Augmentation Methods on Ultrasound Tongue Images for Articulation-to-Speech Synthesis. 230-235
Late breaking reports (not peer reviewed)
- Shaimaa Alwaisi, Mohammed Salah Al-Radhi, Géza Németh:
Universal Approach to Multilingual Multispeaker Child Speech Synthesis. 236-237
- Seraphina Fong, Marco Matassoni, Gianluca Esposito, Alessio Brutti:
Towards Speaker-Independent Voice Conversion for Improving Dysarthric Speech Intelligibility. 238-239
- Maxime Jacquelin, Maeva Garnier, Laurent Girin, Rémy Vincent, Olivier Perrotin:
Exploring the multidimensional representation of individual speech acoustic parameters extracted by deep unsupervised models. 240-241
- Zhu Li, Xiyuan Gao, Shekhar Nayak, Matt Coler:
SarcasticSpeech: Speech Synthesis for Sarcasm in Low-Resource Scenarios. 242-243
- Nicholas Sanders, Korin Richmond:
Recovering Discrete Prosody Inputs via Invert-Classify. 244-245
- Atli Thor Sigurgeirsson, Simon King:
Using a Large Language Model to Control Speaking Style for Expressive TTS. 246-247
- Emmett Strickland, Dana Aubakirova, Dorin Doncenco, Diego Torres, Marc Evrard:
NaijaTTS: A pitch-controllable TTS model for Nigerian Pidgin. 248-249