Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-24 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Effect of pause length ratio in speech length on the perception of speech rate induced by speech length Maho Tamakawa, Shuichi Sakamoto (Tohoku Univ.) SP2023-23 |
The goal of this study is to investigate the mechanism of the perception of speech rate. In this preliminary study, we i... |
SP2023-23 pp.114-118 |
HIP, HCS, HI-SIGCE |
2023-05-15 10:20 |
Okinawa |
Okinawa Industry Support Center (Primary: On-site, Secondary: Online) |
Cognitive Load Estimation of Speech-in-Noise Recall Task with State-Space Models Mateusz Dubiel (uni.lu), Minoru Nakayama (Tokyo Tech.), Xin Wang (NII) HCS2023-7 HIP2023-7 |
Cognitive workload during a listening and recall task was estimated using a state-space model based on metrics of pupill... |
HCS2023-7 HIP2023-7 pp.29-32 |
SP, IPSJ-SLP, EA, SIP |
2023-02-28 09:50 |
Okinawa |
(Primary: On-site, Secondary: Online) |
End-to-End Speech Synthesis Based on Articulatory Movements Captured by Real-time MRI Yuto Otani, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Sci.) EA2022-77 SIP2022-121 SP2022-41 |
We propose an end-to-end deep learning model for speech synthesis based on articulatory movements captured by real-time ... |
EA2022-77 SIP2022-121 SP2022-41 pp.13-18 |
SP, WIT, IPSJ-SLP |
2022-10-22 15:40 |
Kyoto |
Kyoto University (Primary: On-site, Secondary: Online) |
Conformer based early fusion model for audio-visual speech recognition Nobukazu Aoki, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Sci.) SP2022-28 WIT2022-3 |
Previous studies of late fusion models with conformer encoders use independent encoders for both visual and audio inform... |
SP2022-28 WIT2022-3 pp.8-13 |
SP, IPSJ-MUS, IPSJ-SLP |
2022-06-18 10:50 |
Online |
Online |
[Invited Talk]
Crazy vocoder is unbreakable
-- But let's talk about an informal vision of the future -- Masanori Morise (Meiji Univ.) SP2022-15 |
When current speech synthesis researchers refer to Vocoder in their papers, they are most likely referring to Neural voc... |
SP2022-15 pp.61-66 |
SP, IPSJ-MUS, IPSJ-SLP |
2022-06-18 13:00 |
Online |
Online |
Speech intelligibility prediction of simulated hearing loss sounds using the Gammachirp Envelope Similarity Index (GESI)
-- Subjective data from laboratory and crowdsourced remote experiments -- Toshio Irino, Honoka Tamaru, Ayako Yamamoto (Wakayama Univ.) SP2022-17 |
We aim to develop an objective intelligibility measure (OIM) to predict speech intelligibility (SI) for individual el... |
SP2022-17 pp.71-76 |
SP, WIT, IPSJ-SLP, ASJ-H |
2021-10-19 15:10 |
Online |
Online |
A study on model training for DNN-HSMM-based speech synthesis using a large-scale speech corpus Nobuyuki Nishizawa, Gen Hattori (KDDI Research) SP2021-34 WIT2021-27 |
In this study, an investigation into model training for DNN-HSMM-based speech synthesis using a large speech corpus coll... |
SP2021-34 WIT2021-27 pp.52-57 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-18 13:00 |
Online |
Online |
F0 estimation of speech based on l2-norm regularized TV-CAR analysis Keiichi Funaki (Univ. of the Ryukyus) SP2021-2 |
Linear Prediction (LP) is the most successful speech analysis method in speech processing, including speech coding implemented ... |
SP2021-2 pp.7-12 |
EA, US, SP, SIP, IPSJ-SLP |
2021-03-03 13:05 |
Online |
Online |
[Invited Talk]
* Masahito Togami (LINE) EA2020-64 SIP2020-95 SP2020-29 |
Recently, deep-learning-based speech source separation has evolved rapidly. A neural network (NN) is usually learne... |
EA2020-64 SIP2020-95 SP2020-29 pp.27-32 |
EA, US, SP, SIP, IPSJ-SLP |
2021-03-03 14:05 |
Online |
Online |
[Poster Presentation]
A unified source-filter network for neural vocoder Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda (Nagoya Univ.) EA2020-69 SIP2020-100 SP2020-34 |
In this paper, we propose a method to develop a neural vocoder using a single network based on the source-filter theory.... |
EA2020-69 SIP2020-100 SP2020-34 pp.57-62 |
SIS |
2020-12-01 11:25 |
Online |
Online |
[Tutorial Lecture]
A Theory for Controlling Musical Noise Based on Higher-Order Statistics Ryoichi Miyazaki, Takuya Fujimura (NITTC) SIS2020-30 |
Although nonlinear speech enhancement methods can significantly reduce background noise, they are known to generate musi... |
SIS2020-30 pp.18-23 |
SP |
2020-01-29 11:30 |
Toyama |
|
Application of Deep Gaussian Process to Multi-Speaker Text-to-Speech Synthesis using Speaker Codes Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari (UTokyo) SP2019-49 |
Speaker codes are widely used to achieve multi-speaker text-to-speech synthesis. Conventionally, Deep Neural Network (D... |
SP2019-49 pp.31-36 |
HCS |
2020-01-25 14:00 |
Oita |
Room 407, J:COM HorutoHall OITA (Oita) |
The relationships between interpretation bias of indirect requests and learning rate parameters for social reward or punishment Makoto Hirakawa (Hiroshima Univ.) HCS2019-61 |
People sometimes do not make requests directly. Instead, for example, the utterance “This room is cold” may attempt to c... |
HCS2019-61 pp.41-45 |
EA |
2019-12-13 11:20 |
Fukuoka |
Kyushu Inst. Tech. |
Listening difficulty rating prediction model using STOI-type objective intelligibility index for outdoor public address speech Keita Noguchi, Yosuke Kobayashi, Jay Kishigami (Muroran-IT), Kiyohiro Kurisu (TOA) EA2019-76 |
An outdoor public address (PA) system is indispensable for emergency broadcasting during the occurrence of disasters, an... |
EA2019-76 pp.71-78 |
SP, IPSJ-SLP (Joint) |
2019-07-20 13:00 |
Niigata |
FURINYA (Tsukioka-Onsen, Niigata) |
Adaptive Beamformer for Extracting Speech in Desired Direction Using Neural Soft-Mask Yu Nakagome (Waseda Univ./LINE), Masahito Togami (LINE), Tetsunori Kobayashi (Waseda Univ.) SP2019-8 |
A multi-channel speech extraction guided by direction-of-arrival (DOA) estimation is addressed in this paper. A multi-ch... |
SP2019-8 pp.9-14 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
F0 estimation using TV-CAR speech analysis based on Regularized LP Keiichi Funaki (Univ. of the Ryukyus) EA2018-152 SIP2018-158 SP2018-114 |
Linear Prediction (LP) analysis is a speech analysis method that estimates AR (auto-regressive) coefficients to represent the all-pol... |
EA2018-152 SIP2018-158 SP2018-114 pp.311-316 |
EA |
2018-12-13 13:25 |
Fukuoka |
Kyushu Univ. |
Relationship between Internal Parameters and Sound Quality in Biased Harmonic Regeneration Technique Masakazu Une, Ryoichi Miyazaki (NITTC) EA2018-82 |
Harmonic Regeneration Noise Reduction (HRNR) has been proposed to reduce speech distortion. We introduced bias int... |
EA2018-82 pp.7-14 |
PRMU, SP |
2018-06-29 10:30 |
Nagano |
|
Analysis of speech-to-texture sentiment association characteristics Win Thuzar Kyaw, Yoshinori Sagisaka (Waseda Univ.) PRMU2018-30 SP2018-10 |
Aiming at speech visualization using textures or finding a texture generation scheme from sentiment information embedded i... |
PRMU2018-30 SP2018-10 pp.47-52 |
PRMU, SP |
2018-06-29 11:00 |
Nagano |
|
Speaker adaptation in speech synthesis based on neural networks including temporal structure modeling Kento Nakao, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (NIT) PRMU2018-31 SP2018-11 |
This paper proposes a speaker adaptation technique for speech synthesis based on deep neural networks (DNNs) using a str... |
PRMU2018-31 SP2018-11 pp.53-58 |
SIP, EA, SP, MI (Joint) |
2018-03-19 09:25 |
Okinawa |
|
Stable Estimation Method of Spatial Correlation Matrices for Multi-channel NMF Yuuki Tachioka (Denso IT Lab) EA2017-103 SIP2017-112 SP2017-86 |
Multi-channel non-negative matrix factorization (MNMF) achieves high sound source separation performance but its initi... |
EA2017-103 SIP2017-112 SP2017-86 pp.7-12 |