 Results 1 - 20 of 105  
SP, IPSJ-MUS, IPSJ-SLP [detail] 2023-06-24
(Primary: On-site, Secondary: Online)
Effect of pause length ratio in speech length on the perception of speech rate induced by speech length
Maho Tamakawa, Shuichi Sakamoto (Tohoku Univ.) SP2023-23
The goal of this study is to investigate the mechanism of the perception of speech rate. In this preliminary study, we i... [more] SP2023-23
HIP, HCS, HI-SIGCE [detail] 2023-05-15
Okinawa Okinawa Industry Support Center
(Primary: On-site, Secondary: Online)
Cognitive Load Estimation of Speech-in-Noise Recall Task with State-Space Models
Mateusz Dubiel (, Minoru Nakayama (Tokyo Tech.), Xin Wang (NII) HCS2023-7 HIP2023-7
Cognitive workload during a listening and recall task was estimated using a state-space model based on metrics of pupill... [more] HCS2023-7 HIP2023-7
SP, IPSJ-SLP, EA, SIP [detail] 2023-02-28
(Primary: On-site, Secondary: Online)
End-to-End Speech Synthesis Based on Articulatory Movements Captured by Real-time MRI
Yuto Otani, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. Sci.) EA2022-77 SIP2022-121 SP2022-41
We propose an end-to-end deep learning model for speech synthesis based on articulatory movements captured by real-time ... [more] EA2022-77 SIP2022-121 SP2022-41
SP, WIT, IPSJ-SLP [detail] 2022-10-22
Kyoto Kyoto University
(Primary: On-site, Secondary: Online)
Conformer based early fusion model for audio-visual speech recognition
Nobukazu Aoki, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Sci.) SP2022-28 WIT2022-3
Previous studies of late fusion models with conformer encoders use independent encoders for both visual and audio inform... [more] SP2022-28 WIT2022-3
SP, IPSJ-MUS, IPSJ-SLP [detail] 2022-06-18
Online Online [Invited Talk] Crazy vocoder is unbreakable -- But let's talk about an informal vision of the future --
Masanori Morise (Meiji Univ.) SP2022-15
When current speech synthesis researchers refer to Vocoder in their papers, they are most likely referring to Neural voc... [more] SP2022-15
SP, IPSJ-MUS, IPSJ-SLP [detail] 2022-06-18
Online Online Speech intelligibility prediction of simulated hearing loss sounds using the Gammachirp Envelope Similarity Index (GESI) -- Subjective data from laboratory and crowdsourced remote experiments --
Toshio Irino, Honoka Tamaru, Ayako Yamamoto (Wakayama Univ.) SP2022-17
We aim at developing an objective intelligibility measure (OIM) to predict speech intelligibility (SI) for individual el... [more] SP2022-17
SP, WIT, IPSJ-SLP, ASJ-H [detail] 2021-10-19
Online Online A study on model training for DNN-HSMM-based speech synthesis using a large-scale speech corpus
Nobuyuki Nishizawa, Gen Hattori (KDDI Research) SP2021-34 WIT2021-27
In this study, an investigation into model training for DNN-HSMM-based speech synthesis using a large speech corpus coll... [more] SP2021-34 WIT2021-27
SP, IPSJ-SLP, IPSJ-MUS 2021-06-18
Online Online F0 estimation of speech based on l2-norm regularized TV-CAR analysis
Keiichi Funaki (Univ. of the Ryukyus) SP2021-2
Linear Prediction (LP) is the most successful speech analysis in speech processing, including speech coding implemented... [more] SP2021-2
EA, US, SP, SIP, IPSJ-SLP [detail] 2021-03-03
Online Online [Invited Talk] *
Masahito Togami (LINE) EA2020-64 SIP2020-95 SP2020-29
Recently, deep learning based speech source separation has evolved rapidly. A neural network (NN) is usually learne... [more] EA2020-64 SIP2020-95 SP2020-29
EA, US, SP, SIP, IPSJ-SLP [detail] 2021-03-03
Online Online [Poster Presentation] A unified source-filter network for neural vocoder
Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda (Nagoya Univ.) EA2020-69 SIP2020-100 SP2020-34
In this paper, we propose a method to develop a neural vocoder using a single network based on the source-filter theory.... [more] EA2020-69 SIP2020-100 SP2020-34
SIS 2020-12-01
Online Online [Tutorial Lecture] A Theory for Controlling Musical Noise Based on Higher-Order Statistics
Ryoichi Miyazaki, Takuya Fujimura (NITTC) SIS2020-30
Although nonlinear speech enhancement methods can significantly eliminate background noise, it is known to generate musi... [more] SIS2020-30
SP 2020-01-29
Toyama   Application of Deep Gaussian Process to Multi-Speaker Text-to-Speech Synthesis using Speaker Codes
Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari (UTokyo) SP2019-49
Speaker codes are widely used to achieve multi-speaker text-to-speech synthesis. Conventionally, Deep Neural Network (D... [more] SP2019-49
HCS 2020-01-25
Oita Room407, J:COM HorutoHall OITA (Oita) The relationships between interpretation bias of indirect requests and learning rate parameters for social reward or punishment
Makoto Hirakawa (Hiroshima Univ.) HCS2019-61
People sometimes do not make requests directly. Instead, for example, the utterance “This room is cold” may attempt to c... [more] HCS2019-61
EA 2019-12-13
Fukuoka Kyushu Inst. Tech. Listening difficulty rating prediction model using STOI-type objective intelligibility index for outdoor public address speech
Keita Noguchi, Yosuke Kobayashi, Jay Kishigami (Muroran-IT), Kiyohiro Kurisu (TOA) EA2019-76
An outdoor public address (PA) system is indispensable for emergency broadcasting during the occurrence of disasters, an... [more] EA2019-76
Niigata FURINYA(Tsukioka-Onsen, Niigata) Adaptive Beamformer for Extracting Speech in Desired Direction Using Neural Soft-Mask
Yu Nakagome (Waseda Univ./LINE), Masahito Togami (LINE), Tetsunori Kobayashi (Waseda Univ.) SP2019-8
A multi-channel speech extraction guided by direction-of-arrival (DOA) estimation is addressed in this paper. A multi-ch... [more] SP2019-8
EA, SIP, SP 2019-03-15
Nagasaki i+Land nagasaki (Nagasaki-shi) [Poster Presentation] F0 estimation using TV-CAR speech analysis based on Regularized LP
Keiichi Funaki (Univ. of the Ryukyus) EA2018-152 SIP2018-158 SP2018-114
Linear Prediction (LP) analysis is speech analysis to estimate AR (Auto-Regressive) coefficients to represent the all-pol... [more] EA2018-152 SIP2018-158 SP2018-114
EA 2018-12-13
Fukuoka Kyushu Univ. Relationship between Internal Parameters and Sound Quality in Biased Harmonic Regeneration Technique
Masakazu Une, Ryoichi Miyazaki (NITTC) EA2018-82
Harmonic Regeneration Noise Reduction (HRNR) has been proposed to improve the speech distortion. We introduced bias int... [more] EA2018-82
PRMU, SP 2018-06-29
Nagano   Analysis of speech-to-texture sentiment association characteristics
Win Thuzar Kyaw, Yoshinori Sagisaka (Waseda Univ.) PRMU2018-30 SP2018-10
Aiming at speech visualization using textures or finding texture generation scheme from sentiment information embedded i... [more] PRMU2018-30 SP2018-10
PRMU, SP 2018-06-29
Nagano   Speaker adaptation in speech synthesis based on neural networks including temporal structure modeling
Kento Nakao, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (NIT) PRMU2018-31 SP2018-11
This paper proposes a speaker adaptation technique for speech synthesis based on deep neural networks (DNNs) using a str... [more] PRMU2018-31 SP2018-11
(Joint) [detail]
Okinawa   Stable Estimation Method of Spatial Correlation Matrices for Multi-channel NMF
Yuuki Tachioka (Denso IT Lab) EA2017-103 SIP2017-112 SP2017-86
Multi-channel non-negative matrix factorization (MNMF) achieves a high sound source separation performance but its initi... [more] EA2017-103 SIP2017-112 SP2017-86
