NLC, IPSJ-NL, SP, IPSJ-SLP [detail] 2021-12-03
Online Online Multi-speaker Audiobook Speech Synthesis using Discrete Character Acting Styles Acquired by VQVAE
Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Yuki Saito (UT), Yusuke Ijima, Ryo Masumura (NTT), Hiroshi Saruwatari (UT) NLC2021-26 SP2021-47
In this paper, we propose a method of extracting discrete character acting styles using vector quantized variational aut... [more] NLC2021-26 SP2021-47
EA, US, SP, SIP, IPSJ-SLP [detail] 2021-03-03
Online Online [Poster Presentation] Investigation of DNN-based speech synthesis utilizing oral reading skills obtained from large scale subjective evaluation
Shun Akui (UTokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2020-71 SIP2020-102 SP2020-36
So far, we have suggested the value of 'oral reading skill' based on a listening evaluation experiment as a quantit... [more] EA2020-71 SIP2020-102 SP2020-36
EA, US, SP, SIP, IPSJ-SLP [detail] 2021-03-03
Online Online An investigation of rhythm-based speaker embeddings for phoneme duration modeling
Kenichi Fujita, Atsushi Ando, Yusuke Ijima (NTT) EA2020-77 SIP2020-108 SP2020-42
In this study, we propose a speaker embedding method suitable for modeling phoneme duration length for each individual i... [more] EA2020-77 SIP2020-108 SP2020-42
SP, EA, SIP 2020-03-02
Okinawa Okinawa Industry Support Center
(Cancelled but technical report was issued)
The Effectiveness of Additional Context in DNN-based Spontaneous Speech Synthesis
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi (UTokyo), Yusuke Ijima, Ryo Masumura (NTT), Hiroshi Saruwatari (UTokyo) EA2019-112 SIP2019-114 SP2019-61
In DNN-based speech synthesis, contexts, which are input features of DNN, can be used not only for the representation of... [more] EA2019-112 SIP2019-114 SP2019-61
SP, EA, SIP 2020-03-03
Okinawa Okinawa Industry Support Center
(Cancelled but technical report was issued)
[Poster Presentation] Initial analysis of oral reading skills obtained from large scale subjective evaluation
Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) EA2019-135 SIP2019-137 SP2019-84
The speech of professional newscasters readily suggests the speaker's occupation, that is, newscaster. So far, we have analyzed pr... [more] EA2019-135 SIP2019-137 SP2019-84
SP 2019-08-28
Kyoto Kyoto Univ. [Poster Presentation] Analysis of prosodic differences between a newscaster and amateur speakers using partial-substituted synthetic speech
Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) SP2019-11
This paper analyzes prosodic differences between a professional newscaster and amateur speakers which affect listeners’... [more] SP2019-11
WIT, SP 2018-10-27
Fukuoka Kyushu Institute of Technology(Kitakyushu) An investigation of multi-speaker modeling for DNN-based speech synthesis incorporating generative adversarial networks
Hiroki Kanagawa, Yusuke Ijima (NTT MD Lab.) SP2018-32 WIT2018-20
Shizuoka Sago-Royal-Hotel (Hamamatsu) [Invited Talk] Docomo AI Agent: Open Partner Initiative -- Project SEBASTIEN --
Takanobu Oba, Takashi Yoshikawa (Docomo), Takaaki Fukutomi, Kiyoaki Matsui, Yusuke Ijima (NTT) SP2018-17
(Joint) [detail]
Okinawa   Non-parallel and Many-to-Many Voice Conversion Using Variational Autoencoder Conditioned by Phonetic Posteriorgrams and d-vectors
Yuki Saito (NTT/Univ. of Tokyo), Yusuke Ijima, Kyosuke Nishida (NTT), Shinnosuke Takamichi (Univ. of Tokyo) EA2017-105 SIP2017-114 SP2017-88
This paper proposes novel frameworks for non-parallel and many-to-many voice conversion (VC) using variational autoencod... [more] EA2017-105 SIP2017-114 SP2017-88
PRMU, SP 2017-06-22
Miyagi   Comparisons on Transplant Emotional Expressions in DNN-based TTS Synthesis
Katsuki Inoue, Sunao Hara, Masanobu Abe (Okayama Univ.), Nobukatsu Hojo, Yusuke Ijima (NTT) PRMU2017-29 SP2017-5
Recent studies have shown that DNN-based speech synthesis can generate more natural synthesized speech than the conventi... [more] PRMU2017-29 SP2017-5
SP, SIP, EA 2017-03-01
Okinawa Okinawa Industry Support Center [Poster Presentation] An investigation of speaker adaptation method for DNN-based speech synthesis using speaker codes
Nobukatsu Hojo, Yusuke Ijima (NTT) EA2016-108 SIP2016-163 SP2016-103
In this work, we conducted objective evaluation experiments on the conventional speaker adaptation methods for DNN-based... [more] EA2016-108 SIP2016-163 SP2016-103
SP, SIP, EA 2017-03-01
Okinawa Okinawa Industry Support Center [Poster Presentation] Prosodic Word Embeddings for DNN-based speech synthesis
Yusuke Ijima, Nobukatsu Hojo, Ryo Masumura, Taichi Asami (NTT) EA2016-109 SIP2016-164 SP2016-104
This paper proposes novel word embeddings with prosodic information (prosodic word embeddings) for DNN-based speech sy... [more] EA2016-109 SIP2016-164 SP2016-104
(Joint) [detail]
Tokyo NTT Musashino R&D Generative Adversarial Network-based Postfiltering for Statistical Parametric Speech Synthesis
Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino (NTT) SP2016-61
In the field of speech synthesis, statistical parametric speech synthesis has been widely used due to the flexibility an... [more] SP2016-61
Yamagata Takinoyu Hotel On the Use of Speaker Codes for Multi-Speaker Modeling in DNN-based Speech Synthesis
Nobukatsu Hojo, Yusuke Ijima (NTT), Hideyuki Mizuno (Tokyo University of Science, Suwa) SP2016-22
Recent studies have shown that DNN-based speech synthesis can generate more natural synthesized speech than the conventi... [more] SP2016-22
SP 2016-01-14
Kanagawa Sunpian Kawasaki Objective evaluation of synthetic speech using association between dimensions within spectral features
Yusuke Ijima, Taichi Asami (NTT), Hideyuki Mizuno (TUSS) SP2015-90
This paper proposes a novel objective evaluation technique for statistical parametric speech synthesis. A novel point of... [more] SP2015-90
SP 2013-01-31
Kyoto Doshisha Univ. A Study on Multi-class Local Prosodic Context for Expressive Prosody Generation
Yu Maeno, Takashi Nose, Takao Kobayashi, Tomoki Koriyama (Tokyo Inst. of Tech.), Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka (NTT) SP2012-112
This paper describes a technique for reproducing local prosodic variability which appears in expressive speech including... [more] SP2012-112
SP 2012-06-14
Kanagawa NTT Atsugi R&D Center A Study on Automatic Prosodic Context Labeling for Emphatic Speech Synthesis
Yu Maeno, Takashi Nose, Takao Kobayashi (Tokyo Tech), Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka (NTT) SP2012-33
This paper describes automatic prosodic context labeling of training data for synthesizing expressive speech in HMM-base... [more] SP2012-33
SP 2012-06-15
Kanagawa NTT Atsugi R&D Center Analysis of the correlation between various acoustic features and the audibility of speech with noise
Hosana Kamiyama, Yusuke Ijima, Mitsuaki Isogai, Hideyuki Mizuno (NTT) SP2012-46
This paper addresses the correlation analysis of acoustic features with the audibility of naturally uttered speech with ... [more] SP2012-46
SP 2009-07-18
Fukushima   Speaking Style Classification of Spontaneous Speech Using Multiple-Regression HMM
Takashi Nose, Takeshi Matsubara, Yusuke Ijima, Takao Kobayashi (Tokyo Inst. of Tech.) SP2009-46
This paper describes speaking style classification and speech recognition for spontaneous speech based on multiple-regre... [more] SP2009-46
SP, NLC 2008-12-09
Tokyo Waseda Univ. Acoustic Model Training Technique for Speech Recognition using Style Estimation with Multiple-Regression HMM
Yusuke Ijima, Makoto Tachibana, Takashi Nose, Takao Kobayashi (Tokyo Tech) NLC2008-30 SP2008-85
We propose a technique for emotional speech recognition based on multiple-regression HMM (MRHMM). To achieve emotional s... [more] NLC2008-30 SP2008-85
