Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SIP, EA, SP, MI (Joint) [detail] |
2018-03-19 09:25 |
Okinawa |
|
Stable Estimation Method of Spatial Correlation Matrices for Multi-channel NMF Yuuki Tachioka (Denso IT Lab) EA2017-103 SIP2017-112 SP2017-86 |
Multi-channel non-negative matrix factorization (MNMF) achieves a high sound source separation performance but its initi... [more] |
EA2017-103 SIP2017-112 SP2017-86 pp.7-12 |
SIP, EA, SP, MI (Joint) [detail] |
2018-03-20 09:00 |
Okinawa |
|
[Poster Presentation]
Perceptual influence of spectral envelope and aperiodicity quantization for encoding high-quality speech Genta Miyashita, Masanori Morise (Univ. of Yamanashi) EA2017-145 SIP2017-154 SP2017-128 |
In this paper, we investigate the relationship between the degradation of sound quality and the parameter quantization i... [more] |
EA2017-145 SIP2017-154 SP2017-128 pp.241-244 |
EA |
2018-02-16 13:10 |
Hiroshima |
Pref. Univ. Hiroshima |
The effect of increasing the number of channels with multi-channel non-negative matrix factorization for noisy speech recognition Takanobu Uramoto (Oita Univ.), Youhei Okato, Toshiyuki Hanazawa (Mitsubishi Electric), Iori Miura, Shingo Uenohara, Ken'ichi Furuya (Oita Univ.) EA2017-99 |
Nonnegative Matrix Factorization (NMF) factorizes a non-negative matrix into two non-negative matrices. In the field of ... [more] |
EA2017-99 pp.33-38 |
SP, ASJ-H |
2018-01-20 13:25 |
Tokyo |
The University of Tokyo |
A study on statistical speech synthesis based on GP-DNN hybrid model Tomoki Koriyama, Takao Kobayashi (Tokyo Tech) SP2017-67 |
We propose a novel approach to Gaussian process regression (GPR)-based speech synthesis in this paper. Since the conve... [more] |
SP2017-67 pp.5-10 |
SP, ASJ-H |
2018-01-20 13:50 |
Tokyo |
The University of Tokyo |
DNN Based Voice Conversion Method Considering Outputs of Multiple Networks Takuya Fujioka, Sun Qinghua (Hitachi) SP2017-68 |
In many conventional statistical voice conversion methods, the relations of source and target speech on all frames are e... [more] |
SP2017-68 pp.11-15 |
SP, ASJ-H |
2018-01-20 14:55 |
Tokyo |
The University of Tokyo |
[Poster Presentation]
Influence of frame shift in speech parameters on sound quality by high-quality speech analysis/synthesis system Genta Miyashita, Masanori Morise (Yamanashi Univ.) SP2017-72 |
Sound quality deterioration occurs when analyzing and synthesizing high-quality speech by using a vocoder. We conduct ... [more] |
SP2017-72 pp.35-38 |
SP, SIP, EA |
2017-03-01 12:40 |
Okinawa |
Okinawa Industry Support Center |
[Poster Presentation]
An investigation of speaker adaptation method for DNN-based speech synthesis using speaker codes Nobukatsu Hojo, Yusuke Ijima (NTT) EA2016-108 SIP2016-163 SP2016-103 |
In this work, we conducted objective evaluation experiments on the conventional speaker adaptation methods for DNN-based... [more] |
EA2016-108 SIP2016-163 SP2016-103 pp.147-152 |
SP |
2017-01-21 16:35 |
Tokyo |
The University of Tokyo |
Simultaneous modeling of acoustic feature sequences and its temporal structures for DNN-based speech synthesis Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2016-76 |
In statistical parametric speech synthesis, a hidden Markov model (HMM) is widely used as an acoustic model. Recently, d... [more] |
SP2016-76 pp.71-76 |
EA, EMM |
2016-11-18 14:05 |
Oita |
Compal Hall (Oita) |
Basic Study of the Sound Quality Improvement using the Signal Enhancement of the Sound from inside of the Body Masatoshi Tsukiashi, Yoichi Midorikawa, Masanori Akita (Oita Univ.) EA2016-64 EMM2016-70 |
We have studied the detection of sleepiness or the detection of the change of feelings using NAM microphones. NAM microp... [more] |
EA2016-64 EMM2016-70 pp.95-100 |
WIT |
2016-10-16 15:35 |
Saga |
Karatsu Royal Hotel (Saga pref.) |
Study on subjective impressions and acoustic characteristics of elderly speech Shuhei Takemori, Mitsunori Mizumachi (KIT) WIT2016-38 |
There are a variety of works concerning subjective impressions of elderly speech, and hoarseness is commonly selected as... [more] |
WIT2016-38 pp.29-34 |
EA, ASJ-H, IPSJ-MUS [detail] |
2016-10-14 14:30 |
Ishikawa |
Noto Omakidai (Nanao) |
Study on speech transmission index using models of modulation transfer function Yuta Kashihara, Masashi Unoki (JAIST) EA2016-33 |
Room impulse response (RIR) is composed of three parts: direct sound, early reverberation, and late reverberation. It ... [more] |
EA2016-33 pp.13-18 |
EA, ASJ-H |
2016-08-09 15:25 |
Miyagi |
Tohoku Gakuin Univ., Tagajo Campus |
EA2016-24 |
This talk describes the research achievements of the late Prof. Ken’iti Kido in speech research. In order to apply a spe... [more] |
EA2016-24 pp.25-30 |
SP |
2016-01-14 13:00 |
Kanagawa |
Sunpian Kawasaki |
[Invited Talk]
Articulatory controllable statistical parametric speech synthesis using EMA data Junichi Yamagishi (NII/Univ. Edinburgh) SP2015-88 |
This paper describes speech processing work in which articulator movements are used in conjunction with the acoustic spe... [more] |
SP2015-88 pp.19-24 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2015-12-02 11:15 |
Aichi |
Nagoya Inst of Tech. |
Evaluation and Analysis of Duration Correction for Non-Native Speech Based on Waveform Modification Shinya Kura, Shinnosuke Takamichi (NAIST), Tomoki Toda (NAIST/Nagoya Univ.), Graham Neubig, Sakriani Sakti, Satoshi Nakamura (NAIST) SP2015-73 |
There are several attempts at correcting durational patterns of non-native speech towards language learning. One of the ... [more] |
SP2015-73 pp.19-24 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2015-12-02 13:55 |
Aichi |
Nagoya Inst of Tech. |
Automation of high performance system building for large vocabulary speech recognition using evolution strategy with pareto optimality Takafumi Moriya, Tomohiro Tanaka, Takahiro Shinozaki (Tokyo Tech), Shinji Watanabe (MERL), Kevin Duh (NAIST) SP2015-75 |
The performance of speech recognition tasks can be significantly improved by the use of deep neural networks (DNN). Howe... [more] |
SP2015-75 pp.31-36 |
SP, IPSJ-SLP (Joint) |
2015-07-16 16:20 |
Nagano |
Katakura Suwako Hotel |
Sequence Discriminative Training for Low-Rank Deep Neural Networks Yuuki Tachioka (Mitsubishi Electric), Shinji Watanabe, Jonathan Le Roux, John Hershey (MERL) SP2015-39 |
Deep neural network (DNN) acoustic models outperform conventional Gaussian mixture model (GMM) but the number of paramet... [more] |
SP2015-39 pp.19-24 |
SP, IPSJ-SLP (Joint) |
2015-07-16 17:50 |
Nagano |
Katakura Suwako Hotel |
Speaker Adaptation Technique for Speech Recognition using a Feature Augmentation Framework Hiroshi Fujimura, Takashi Masuko (TOSHIBA) SP2015-42 |
Deep Neural Networks (DNNs) are powerful machine learning models. Nevertheless, the performance degrades for out-of domai... [more] |
SP2015-42 pp.37-42 |
SIP, EA, SP |
2015-03-02 11:40 |
Okinawa |
|
Optimization of impulse responses for model training in reverberant speech recognition Takahiro Fukumori, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita (Ritsumeikan Univ.) EA2014-78 SIP2014-119 SP2014-141 |
The reverberant speech degrades the speech recognition performance in the field of distant-talking speech. As one of app... [more] |
EA2014-78 SIP2014-119 SP2014-141 pp.37-42 |
HCS |
2015-01-30 17:00 |
Kagawa |
Bay Resort Hotel Shodoshima (Shodoshima, Kagawa Pref.) |
[Poster Presentation]
Towards a Model that Represents Relationship Between Speech Parameters and Speakers' Emotional Impressions -- Improvement of Conveyed Impressions by Controlling Speech Rate and Voice Pitch -- Takahiro Ono, Hiroto Saito, Hiroshi Kaneko, Naoki Mukawa (Tokyo Denki Univ.) HCS2014-107 |
Speech-rate conversion (SRC) which slows speakers' utterances improves speech intelligibility. At the same time, converte... [more] |
HCS2014-107 pp.193-198 |
HCGSYMPO (2nd) |
2014-12-17 - 2014-12-19 |
Yamaguchi |
Kaikyo Messe Shimonoseki |
Development of a wearable device for supporting verbal communication base on the prosody Yutaro Kajiwara (Univ. of Tsukuba), Kenji Suzuki (Univ. of Tsukuba/JST) |
This study proposes a wearable device for supporting verbal communication based on the description of prosody. Verbal co... [more] |
|