Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SP, IPSJ-SLP, EA, SIP |
2023-02-28 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Singing voice synthesis based on a frame-driven attention mechanism considering vocal timing deviation Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (NITech) EA2022-78 SIP2022-122 SP2022-42 |
This paper proposes singing voice synthesis (SVS) based on a frame-driven attention mechanism considering vocal timing d... |
EA2022-78 SIP2022-122 SP2022-42 pp.19-24 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2019-12-06 10:35 |
Tokyo |
NHK Science & Technology Research Labs. |
[Invited Talk]
Progress and prospects of statistical speech synthesis Keiichi Tokuda (Nagoya Inst. of Tech.) SP2019-35 |
The basic problem of statistical speech synthesis is quite simple: we have a speech database for training, i.e., a set o... |
SP2019-35 pp.11-12 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2019-12-06 13:55 |
Tokyo |
NHK Science & Technology Research Labs. |
[Poster Presentation]
Synthetic speech-based sound masking for privacy protection when speaking to smartphones in public space Takahiro Tsugui, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2019-38 |
In this paper, we propose a synthetic speech-based sound masking method that protects the privacy when speaking to smart... |
SP2019-38 pp.55-60 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2019-12-06 16:00 |
Tokyo |
NHK Science & Technology Research Labs. |
A comparison of neural vocoders in singing voice synthesis Sota Wada, Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2019-42 |
In this study, we compare five types of vocoders based on neural networks (neural vocoders) for singing voice synthesis.... |
SP2019-42 pp.85-90 |
PRMU, SP |
2018-06-29 11:00 |
Nagano |
|
Speaker adaptation in speech synthesis based on neural networks including temporal structure modeling Kento Nakao, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (NIT) PRMU2018-31 SP2018-11 |
This paper proposes a speaker adaptation technique for speech synthesis based on deep neural networks (DNNs) using a str... |
PRMU2018-31 SP2018-11 pp.53-58 |
SP, ASJ-H |
2018-01-20 14:55 |
Tokyo |
The University of Tokyo |
[Poster Presentation]
TRAJECTORY TRAINING CONSIDERING POWER FOR SPEECH SYNTHESIS BASED ON NEURAL NETWORKS Ryohei Funato, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2017-74 |
In statistical parametric speech synthesis, a relation between acoustic features and linguistic features is modeled by s... |
SP2017-74 pp.43-48 |
SP, ASJ-H |
2018-01-21 15:35 |
Tokyo |
The University of Tokyo |
Mel-cepstrum based quantization noise shaping applied to speech synthesis based on WaveNet Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2017-83 |
This paper proposes a mel-cepstrum based quantization noise shaping for improving the quality of synthetic speech genera... |
SP2017-83 pp.93-98 |
SP, ASJ-H |
2018-01-21 16:00 |
Tokyo |
The University of Tokyo |
A study on voice conversion based on WaveNet Jumpei Niwa, Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (NIT) SP2017-84 |
This paper proposes a voice conversion technique based on WaveNet to directly generate target audio waveforms from acous... |
SP2017-84 pp.99-104 |
SP |
2017-01-21 11:00 |
Tokyo |
The University of Tokyo |
[Poster Presentation]
Designing linguistic features for expressive speech synthesis using audiobooks Chiaki Asai, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2016-70 |
In order to synthesize expressive speech, various statistical parametric speech synthesis systems have been proposed. Sp... |
SP2016-70 pp.35-40 |
SP |
2017-01-21 16:35 |
Tokyo |
The University of Tokyo |
Simultaneous modeling of acoustic feature sequences and its temporal structures for DNN-based speech synthesis Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2016-76 |
In statistical parametric speech synthesis, a hidden Markov model (HMM) is widely used as an acoustic model. Recently, d... |
SP2016-76 pp.71-76 |
PRMU, SP, WIT, ASJ-H |
2016-06-13 09:30 |
Tokyo |
|
Image recognition based on discriminative models using features generated from separable lattice HMMs Yoshinari Tsuzuki, Kei Sawada, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) PRMU2016-36 SP2016-2 WIT2016-2 |
One of the major problems in image recognition is degradation in the recognition performance caused by geometric variati... |
PRMU2016-36 SP2016-2 WIT2016-2 pp.7-12 |
PRMU, CNR |
2016-02-21 14:00 |
Fukuoka |
|
Parameter sharing structures of separable lattice HMMs using mixture output distributions for image recognition Masato Sukegawa, Kei Sawada, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) PRMU2015-138 CNR2015-39 |
In image recognition systems, it is important to deal with geometrical variations such as size and location. Separable l... |
PRMU2015-138 CNR2015-39 pp.37-42 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2015-12-03 09:00 |
Aichi |
Nagoya Inst of Tech. |
Evaluation of text-to-speech system construction for unknown-pronunciation languages Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2015-80 |
This paper discusses a method for constructing text-to-speech (TTS) systems for unknown-pronunciation languages. There... |
SP2015-80 pp.93-98 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2015-12-03 10:20 |
Aichi |
Nagoya Inst of Tech. |
[Invited Talk]
Statistical speech synthesis: past, present and future Keiichi Tokuda (NITECH) |
|
|
NLC, IPSJ-NL, SP, IPSJ-SLP, JSAI-SLUD (Joint) |
2014-12-15 14:00 |
Kanagawa |
Tokyo Institute of Technology (Suzukakedai Campus) |
[Invited Talk]
Statistical approach to flexible speech synthesis -- towards human-like talking machines -- Keiichi Tokuda (NITech/Google) SP2014-109 |
This talk will give an overview of statistical approach to flexible speech synthesis. For constructing human-like tal... |
SP2014-109 p.31 |
SP |
2014-01-23 16:00 |
Aichi |
Meijo Univ. |
Speaker recognition based on log-linear models using feature generation by variational Bayesian method Akifumi Tsuge, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2013-98 |
This paper presents a speaker recognition technique based on log-linear models (LLMs) using Bayesian statistics. Since d... |
SP2013-98 pp.13-18 |
PRMU |
2013-02-22 09:30 |
Osaka |
|
Extended separable lattice HMMs based on state duration control for recognition of images with variations Takaya Makino, Shinji Takaki, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) PRMU2012-164 |
In this paper, an extension of separable lattice HMMs (SL-HMMs) is described that introduces state duration control for d... |
PRMU2012-164 pp.149-154 |
PRMU |
2013-02-22 10:00 |
Osaka |
|
Image recognition based on hidden Markov eigen-image models with the variational Bayesian method Kei Sawada, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) PRMU2012-165 |
This paper proposes an image recognition technique based on Hidden Markov Eigen-image Models (HMEMs) using the variation... |
PRMU2012-165 pp.155-160 |
SP |
2012-06-14 16:00 |
Kanagawa |
NTT Atsugi R&D Center |
Perceptual evaluation of synthesized speech reflecting "personalities" Minoru Tsuzaki (KCUA), Keiichi Tokuda (NITEC), Hisashi Kawai (KDDI R&D Labs), Yoshinori Shiga, Jinfu Ni (NICT), Keiichiro Oura, Sayaka Shiota (NITEC) SP2012-39 |
Perceptual evaluation tests were performed for talker selection methods in the application of the speaker adaptation fra... |
SP2012-39 pp.33-38 |
SP, NLC, IPSJ-SLP |
2011-12-20 13:00 |
Tokyo |
|
[Invited Talk]
Development of a framework for constructing spoken dialogue systems based on user-generated content Keiichi Tokuda (NITech) NLC2011-50 SP2011-95 |
This talk introduces a new JST CREST project "uDialogue," involving Nagoya Institute of Technology and the Universi... |
NLC2011-50 SP2011-95 pp.153-157 |