SP, IPSJ-MUS, IPSJ-SLP 2022-06-17
Online Online Study of End-to-End Text-to-Speech that can seamlessly control speaker's individuality by Manipulating Speaker features
Naoki Aotani, Sunao Hara, Masanobu Abe (Okayama Univ.) SP2022-14
In this paper, we investigate an End-to-End speech synthesis scheme that enables to seamlessly control speaker individua...
SP, IPSJ-MUS, IPSJ-SLP 2022-06-18
Online Online [Invited Talk] Crazy vocoder is unbreakable -- But let's talk about an informal vision of the future --
Masanori Morise (Meiji Univ.) SP2022-15
When current speech synthesis researchers refer to Vocoder in their papers, they are most likely referring to Neural voc...
SP, IPSJ-MUS, IPSJ-SLP 2022-06-18
Online Online [Poster Presentation] Worker Filtering Criteria for Subjective Evaluation of Synthesized Voice Sound Quality Using Crowdsourcing
Moe Yaegashi (Waseda Univ.), Susumu Saito, Teppei Nakano (Waseda Univ./ifLab.), Tetsuji Ogawa (Waseda Univ.) SP2022-24
We investigate the effect of filtering criteria of crowdworkers on the subjective evaluation results of synthesized voi...
EA, SIP, SP, IPSJ-SLP 2022-03-02
(Primary: On-site, Secondary: Online)
Evaluation of sentence-level generation in Japanese dialect speech synthesis using accent latent variables
Kazuya Yufune, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari (UTokyo) EA2021-79 SIP2021-106 SP2021-64
Japanese dialect speech synthesis is useful for personalized speech synthesis systems. However, inability to prepare acc...
Kumamoto Sojo University [Poster Presentation] Improved voice quality due to multi-speaker learning with WaveNet vocoder
Satoshi Yoshida, Shingo Uenohara, Ken'ichi Furuya (Oita Univ.) EA2021-57
In recent years, speech synthesis and voice quality conversion techniques using neural networks have attracted much atte...
NLC, IPSJ-NL, SP, IPSJ-SLP 2021-12-02
Online Online Improvement of multilingual speech emotion recognition by normalizing features using CRNN
Jinhai Qi, Motoyuki Suzuki (OIT) NLC2021-22 SP2021-43
In this research, a new multilingual emotion recognition method by normalizing features using CRNN has been proposed. We...
NLC, IPSJ-NL, SP, IPSJ-SLP 2021-12-03
Online Online Multi-speaker Audiobook Speech Synthesis using Discrete Character Acting Styles Acquired by VQVAE
Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Yuki Saito (UT), Yusuke Ijima, Ryo Masumura (NTT), Hiroshi Saruwatari (UT) NLC2021-26 SP2021-47
In this paper, we propose a method of extracting discrete character acting styles using vector quantized variational aut...
SP, WIT, IPSJ-SLP, ASJ-H 2021-10-19
Online Online A study on model training for DNN-HSMM-based speech synthesis using a large-scale speech corpus
Nobuyuki Nishizawa, Gen Hattori (KDDI Research) SP2021-34 WIT2021-27
In this study, an investigation into model training for DNN-HSMM-based speech synthesis using a large speech corpus coll...
EA, ASJ-H 2021-07-16
Online Online A study on the online speech data collection for speech synthesis
Yuya Hoshiko, Naofumi Aoki, Kosei Ozeki, Yoshinori Dobashi (Hokkaido Univ.) EA2021-15
There are high expectations for text-to-speech systems for people who are unable to speak with their own voice, such as ...
EA, ASJ-H 2021-07-16
Online Online A study on the number of speech samples required for making acoustic models in tailor-made speech synthesis
Keigo Narita, Naofumi Aoki, Atsuhito Udo, Yoshinori Dobashi (Hokkaido Univ.) EA2021-16
In this study, we created speaker dependent acoustic models with varying numbers of samples, and confirmed differences i...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online [Invited Talk] Toward a Unification of Various Speech Processing Tasks Based on End-to-End Neural networks
Shinji Watanabe (CMU) SP2021-8
This presentation will introduce the recent progress of speech processing technologies based on end-to-end neural networ...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online Creating of Japanese Phoneme Balanced Sentences for Speech Synthesis
Yuko Takai, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.) SP2021-9
When the loss of voice is inevitable due to pharyngectomy or other reasons, it has become possible to realize speech synt...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online A Study on Error Correction for Improving the Accuracy of Acoustic Models
Saki Anazawa, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.) SP2021-12
People with ALS (amyotrophic lateral sclerosis) or dysarthria sometimes use their own voice for speech synthesis. In thi...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online Simulation of Body-conducted Speech and Synthesis of One's Own Voice with a Sound-proof Earmuff and Bone-conduction Microphones
Chen Ruiyan, Nishimura Tazuko, Minematsu Nobuaki, Saito Daisuke (UTokyo) SP2021-15
When one hears his/her recorded voices for the first time, s/he is probably surprised and not rarely disappointed at the...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online Dynamic Display of Guidelines in Interactive Speech Synthesizer
Daiki Goto (Hokkai Gakuen Univ.), Naofumi Aoki, Keisuke Ai (Hokkaido Univ.), Kunitoshi Motoki (Hokkai Gakuen Univ.) SP2021-18
We are developing a speech synthesis system that can play sounds by interactive control, just like playing a musical ins...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online Preliminary study on synthesizing relaxing voices -- from a perspective of recognized/evoked emotions and acoustic features --
Yuki Watanabe, Shuichi Sakamoto (Tohoku Univ.), Takayuki Hoshi, Yoshiki Nagatani, Manabu Nakano (Pixie Dust Technologies) SP2021-19
The goal of this study is to synthesize speech sound which induces relaxed emotion. As the preliminary study, we investi...
SP, IPSJ-SLP, IPSJ-MUS 2021-06-19
Online Online Neural speech synthesis using local phrase dependency structure information
Nobuyoshi Kaiki, Sakriani Sakti, Satoshi Nakamura (NAIST) SP2021-23
In order to synthesize Japanese speech with natural prosody, we introduce an end-to-end TTS with new prosodic symbol rep...
WIT 2021-06-01
Online Online The relationship between speech rate and environmental noise in synthesized speech for easy listening of movie audio description
Takeya Naono, Sawako Nakajima, Kazutaka Mitobe (Akita Univ) WIT2021-8
In recent years, speech synthesis has been used for audio description of movies and videos, and there is a need to impro...
SIS 2021-03-04
Online Online Optimization of source-filter based speech waveform generation using adversarial training
Hayato Mitsui, Yosuke Sugiura, Nozomiko Yasui, Tetsuya Shimamura (Saitama Univ.) SIS2020-35
This research aims to improve the accuracy of the source-filter based speech waveform generation model using deep learni...
EA, US, SP, SIP, IPSJ-SLP 2021-03-03
Online Online [Poster Presentation] A unified source-filter network for neural vocoder
Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda (Nagoya Univ.) EA2020-69 SIP2020-100 SP2020-34
In this paper, we propose a method to develop a neural vocoder using a single network based on the source-filter theory....
