Committee |
Date Time |
Place |
Venue |
Paper Title / Authors |
Abstract |
Paper # |
SP, NLC, IPSJ-SLP, IPSJ-NL [detail] |
2023-12-03 11:05 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Self-supervised learning model based emotion transfer and intensity control technology for expressive speech synthesis Wei Li, Nobuaki Minematsu, Daisuke Saito (Univ. of Tokyo) NLC2023-21 SP2023-41 |
Emotion transfer techniques, which transfer the speaking style from the reference speech to the target speech, are wi... [more] |
NLC2023-21 SP2023-41 pp.43-48 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
[Poster Presentation]
MS-Harmonic-Net++ vs SiFi-GAN: Comparison of fundamental frequency controllable fast neural waveform generative models. Sota Shimizu (Kobe Univ./NICT), Takuma Okamoto (NICT), Ryoichi Takashima (Kobe Univ.), Yamato Ohtani (NICT), Tetsuya Takiguchi (Kobe Univ.), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) SP2023-5 |
Although Harmonic-Net+ has been proposed as a fundamental frequency (fo) and speech rate (SR) controllable fast neural v... [more] |
SP2023-5 pp.20-25 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Vocabulary-Set Decomposition and Multi-task Learning for Target Vocabulary Extraction in Japanese Speech Recognition Aoi Ito (LINE/Hosei Univ.), Tatsuya Komatsu, Yusuke Fujita (LINE) EA2022-102 SIP2022-146 SP2022-66 |
This paper proposes a target vocabulary extraction method for Japanese speech recognition models based on vocabulary set... [more] |
EA2022-102 SIP2022-146 SP2022-66 pp.159-164 |
CCS |
2022-11-18 09:00 |
Mie |
(Primary: On-site, Secondary: Online) |
Voice Quality Conversion by Two-Step Process of Speech Feature Extraction and Speaker-Controlled Speech Synthesis Taichi Fukawa, Kenya Jin'no (Tokyo City Univ.) CCS2022-52 |
Many methods have been proposed in the field of voice quality conversion that use a style-transforming autoencoder. Howe... [more] |
CCS2022-52 pp.47-52 |
SP, WIT, IPSJ-SLP [detail] |
2022-10-22 15:40 |
Kyoto |
Kyoto University (Primary: On-site, Secondary: Online) |
Conformer based early fusion model for audio-visual speech recognition Nobukazu Aoki, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Sci.) SP2022-28 WIT2022-3 |
Previous studies of late fusion models with conformer encoders use independent encoders for both visual and audio inform... [more] |
SP2022-28 WIT2022-3 pp.8-13 |
EA, SIP, SP, IPSJ-SLP [detail] |
2022-03-01 14:45 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Target speaker extraction based on conditional variational autoencoder and directional information in underdetermined condition Rui Wang, Li Li, Tomoki Toda (Nagoya Univ.) EA2021-76 SIP2021-103 SP2021-61 |
This paper deals with a dual-channel target speaker extraction problem in underdetermined conditions. A blind source sep... [more] |
EA2021-76 SIP2021-103 SP2021-61 pp.76-81 |
WIT, SP |
2019-10-27 09:00 |
Kagoshima |
Daiichi Institute of Technology |
Extraction of linguistic representation and syllable recognition from EEG signal of speech-imagery Kentaro Fukai, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Science), Satoka Hirata, Yurie Iribe (Aichi Prefectural Univ.), Mingchua Fu, Ryo Taguchi (Nagoya Inst. of Technology), Tsuneo Nitta (Waseda Univ./Toyohashi Univ. of Technology) SP2019-28 WIT2019-27 |
Speech imagery recognition from Electroencephalogram (EEG) is one of the challenging technologies for non-invasive brain... [more] |
SP2019-28 WIT2019-27 pp.63-68 |
WIT, SP |
2019-10-27 10:30 |
Kagoshima |
Daiichi Institute of Technology |
A Method to Reduce Ambiguity in Identifying the Muscle Activation Time of Each EMG Channel in Isolated Inaudible Single Syllable Recognition Hidetoshi Nagai (KIT) SP2019-32 WIT2019-31 |
In inaudible speech recognition using surface EMG, consonant recognition is one of the difficult problems. When phonemes... [more] |
SP2019-32 WIT2019-31 pp.87-92 |
SP, IPSJ-SLP (Joint) |
2019-07-20 13:00 |
Niigata |
FURINYA(Tsukioka-Onsen, Niigata) |
Adaptive Beamformer for Extracting Speech in Desired Direction Using Neural Soft-Mask Yu Nakagome (Waseda Univ./LINE), Masahito Togami (LINE), Tetsunori Kobayashi (Waseda Univ.) SP2019-8 |
A multi-channel speech extraction guided by direction-of-arrival (DOA) estimation is addressed in this paper. A multi-ch... [more] |
SP2019-8 pp.9-14 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
Design and Evaluation of Ladder Denoising Autoencoder for Auditory Speech Feature Extraction of Overlapped Speech Separation Hiroshi Sekiguchi, Yoshiaki Narusue, Hiroyuki Morikawa (Univ. of Tokyo) EA2018-155 SIP2018-161 SP2018-117 |
Primates and other mammals distinguish overlapped speech sounds from one another by recognizing a single sound source whethe... [more] |
EA2018-155 SIP2018-161 SP2018-117 pp.329-333 |
SP, IPSJ-SLP (Joint) |
2018-07-26 16:15 |
Shizuoka |
Sago-Royal-Hotel (Hamamatsu) |
Ladder Network Driven from Auditory Computational Model for Multi-talker Speech Separation Hiroshi Sekiguchi, Yoshiaki Narusue, Hiroyuki Morikawa (Univ. of Tokyo) SP2018-18 |
This paper introduces a ladder network implementation induced by an auditory computational model for multi-talker speech sepa... [more] |
SP2018-18 pp.9-13 |
SIP, EA, SP, MI (Joint) [detail] |
2018-03-19 13:00 |
Okinawa |
|
[Poster Presentation]
An Experimental Study on Segmental and Prosodic Comparison of Utterances for Automatic Assessment of Dubbing Speech Takuya Ozuru, Nobuaki Minematsu, Daisuke Saito (Univ. of Tokyo) EA2017-114 SIP2017-123 SP2017-97 |
In Japanese language education, especially in its speech training, dubbing-based training has gained huge popularity.... [more] |
EA2017-114 SIP2017-123 SP2017-97 pp.75-80 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2017-12-21 12:50 |
Tokyo |
Waseda Univ. Green Computing Systems Research Organization |
[Poster Presentation]
Realtime analysis and display of voice source periodicity Hideki Kawahara (Wakayama Univ.), Ken-Ichi Sakakibara (Health Sciences Univ. Hokkaido) SP2017-59 |
This article introduces a real-time procedure for extraction and display of deviations from pure periodicity in voice ex... [more] |
SP2017-59 pp.21-22 |
HCS, HIP, HI-SIGCOASTER [detail] |
2017-05-16 15:45 |
Okinawa |
Okinawa Industry Support Center |
Extraction of acoustic features of emotional speech and their characteristics Takashi Yamazaki, Minoru Nakayama (Tokyo Tech.) HCS2017-17 HIP2017-17 |
In this paper, we extracted the acoustic features of emotional speech and examined the effect of the feature on emotiona... [more] |
HCS2017-17 HIP2017-17 pp.127-130 |
SP, SIP, EA |
2017-03-02 09:00 |
Okinawa |
Okinawa Industry Support Center |
[Poster Presentation]
Hardware Speech Sensor Based on Deep Neural Network Feature Extractor and Template Matching Yi Liu, Boyu Qian, Jian Wang, Takahiro Shinozaki (Titech) EA2016-135 SIP2016-190 SP2016-130 |
We explore the possibility of combination of a DNN-based feature extractor and template based matching for keyword detec... [more] |
EA2016-135 SIP2016-190 SP2016-130 pp.297-300 |
EA, ASJ-H |
2016-08-09 13:30 |
Miyagi |
Tohoku Gakuin Univ., Tagajo Campus |
Improvement to an Objective Binaural Intelligibility Prediction Method Kazuya Taira, Kazuhiro Kondo (Yamagata Univ.) EA2016-21 |
We attempted to improve the binaural intelligibility estimation method proposed in our previous paper. The number of dat... [more] |
EA2016-21 pp.7-12 |
WIT |
2016-03-05 09:30 |
Ibaraki |
Tsukuba Univ. of Tech. (Tsukuba) |
Hearing Aid with Lip Reading
-- Speech Enhancement using Vowel Estimation -- Yuzuru Iinuma, Tetsuya Matsumoto (Nagoya Univ.), Yoshinori Takeuchi (Daido Univ.), Hiroaki Kudo, Noboru Ohnishi (Nagoya Univ.) WIT2015-98 |
Under highly noisy environments such as construction sites and cocktail parties, it is difficult for not only humans but... [more] |
WIT2015-98 pp.53-58 |
IN |
2016-01-21 14:25 |
Aichi |
Nagoya Kigyou Fukushi Kaikan |
Voice Actor Recognition Using Voice and Cast Information of Anime Video Motoki Eida, Shun Hattori (Muroran Inst. of Tech.) IN2015-96 |
When we hear a voice in amusement media such as anime, games, movies, and music, we sometimes feel that we have ... [more] |
IN2015-96 pp.7-12 |
HCGSYMPO (2nd) |
2015-12-16 - 2015-12-18 |
Toyama |
Toyama International Conference Center |
Silent Speech BCI
-- An investigation for practical problems -- Shun Hirose (KIT), Hiromi Yamaguchi (NEC), Takashi Ito, Toshimasa Yamazaki (KIT), Shinichi Fukuzumi (NEC), Takahiro Yamanoi (Hokkai Gakuen Univ.) |
We have developed a single-trial-EEG-based silent speech BCI (SSBCI) using speech signals. Our algorithm consisted of (1) ... [more] |
|
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2015-12-03 09:25 |
Aichi |
Nagoya Inst of Tech. |
Deep Auto-encoder based Low-dimensional Feature Extraction using FFT Spectral Envelopes in Statistical Parametric Speech Synthesis Shinji Takaki, Junichi Yamagishi (NII) SP2015-81 |
In the state-of-the-art statistical parametric speech synthesis system, a speech analysis module, e.g. STRAIGHT spectral... [more] |
SP2015-81 pp.99-104 |