Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SIP, SP, EA, IPSJ-SLP [detail] |
2024-02-29 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Noise-Robust Voice Conversion by Denoising Training Conditioned with Latent Variables of Speech Quality and Recording Environment Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi (UT), Ryuichi Yamamoto, Kentaro Tachibana (LY), Hiroshi Saruwatari (UT) EA2023-63 SIP2023-110 SP2023-45 |
In this paper, we propose noise-robust voice conversion by conditioning latent variables representing speech quality and... [more] |
EA2023-63 SIP2023-110 SP2023-45 pp.13-18 |
CNR, BioX |
2024-03-01 09:30 |
Tokyo |
NHK Science & Technology Research Laboratories (Primary: On-site, Secondary: Online) |
Respiration-enhanced Human-Robot Interaction Takao Obi, Kotaro Funakoshi (Tokyo Tech.) BioX2023-75 CNR2023-42 |
In the field of Human-Robot Interaction (HRI), enhancing a robot's impression, affinity, and interaction smoothness is c... [more] |
BioX2023-75 CNR2023-42 pp.30-34 |
WIT, SP, IPSJ-SLP [detail] |
2023-10-14 16:40 |
Fukuoka |
Kyushu Institute of Technology (Primary: On-site, Secondary: Online) |
Sequence-to-sequence Voice Conversion for Electrolaryngeal Speech Enhancement with Multi-stage Pretraining and Fine-tuning Techniques Ding Ma, Lester Phillip Violeta, Kazuhiro Kobayashi, Tomoki Toda (Nagoya Univ.) SP2023-32 WIT2023-23 |
Sequence-to-sequence (seq2seq) voice conversion (VC) models have great potential for electrolaryngeal (EL) speech to nor... [more] |
SP2023-32 WIT2023-23 pp.27-32 |
AI |
2023-09-12 15:55 |
Hokkaido |
|
Estimation of unmasked face images based on voice and 3DMM Tetsumaru Akatsuka, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga (UEC) AI2023-32 |
Facemasks have become common due to the COVID-19 pandemic. They have begun to affect security and identification systems... [more] |
AI2023-32 pp.187-193 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
[Poster Presentation]
Opera-singing voice synthesis using Diff-SVC Aoto Sugahara (Kobe Univ.), Soma Kishimoto, Yuji Adachi, Kiyoto Tai (MEC Company Ltd.), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.) SP2023-7 |
Singing voice synthesis technology is widely used in the entertainment field, and it has attracted attention as a method to ... [more] |
SP2023-7 pp.30-35 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
[Poster Presentation]
Parody Detection Based on Alignment Collapse Between Lyrics and Singing Voice Tomoki Ariga, Yosuke Higuchi (Waseda Univ.), Mitsunori Kanno, Rie Shigyo, Takato Mizuguchi, Naoki Okamoto (DAIICHIKOSHO), Tetsuji Ogawa (Waseda Univ.) SP2023-10 |
We propose a parody detection system for karaoke singing by evaluating alignment collapse between lyrics and singing voi... [more] |
SP2023-10 pp.48-53 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-02-28 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Singing voice synthesis based on a frame-driven attention mechanism considering vocal timing deviation Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda (NITech) EA2022-78 SIP2022-122 SP2022-42 |
This paper proposes singing voice synthesis (SVS) based on a frame-driven attention mechanism considering vocal timing d... [more] |
EA2022-78 SIP2022-122 SP2022-42 pp.19-24 |
CCS |
2022-11-18 09:00 |
Mie |
(Primary: On-site, Secondary: Online) |
Voice Quality Conversion by Two-Step Process of Speech Feature Extraction and Speaker-Controlled Speech Synthesis Taichi Fukawa, Kenya Jin'no (Tokyo City Univ.) CCS2022-52 |
Many methods have been proposed in the field of voice quality conversion that use a style-transforming autoencoder. Howe... [more] |
CCS2022-52 pp.47-52 |
NS, SR, RCS, SeMI, RCC (Joint) |
2022-07-13 14:50 |
Ishikawa |
The Kanazawa Theatre + Online (Primary: On-site, Secondary: Online) |
Investigation of noise removal using U-Net and voice recognition performance improvement
-- for train running noise -- Jian Lin, Shota Sano, Yuusuke Kawakita, Tsuyoshi Miyazaki, Hiroshi Tanaka (KAIT) SeMI2022-26 |
A method for converting noisy sound into images to remove the noise has been proposed. We are attempting to remove train... [more] |
SeMI2022-26 pp.34-39 |
ICM |
2022-07-07 13:25 |
Hokkaido |
Tokachi Plaza (Primary: On-site, Secondary: Online) |
A study on Voice Quality Deterioration Monitoring Using OSS for Service Monitoring Yasuhiro Onozuka, Akihiro Shibata, Yoshitaka Syuntou, Kozo Sakae (DOCOMO Technology) ICM2022-11 |
In recent years, stable voice calling services have been in demand due to the increased use of calls by corporate users. Ther... [more] |
ICM2022-11 pp.7-10 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2022-06-17 15:00 |
Online |
Online |
Study of End-to-End Text-to-Speech that can seamlessly control speaker's individuality by Manipulating Speaker features Naoki Aotani, Sunao Hara, Masanobu Abe (Okayama Univ.) SP2022-14 |
In this paper, we investigate an End-to-End speech synthesis scheme that makes it possible to seamlessly control speaker individua... [more] |
SP2022-14 pp.55-60 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2022-06-18 10:50 |
Online |
Online |
[Invited Talk]
Crazy vocoder is unbreakable
-- But let's talk about an informal vision of the future -- Masanori Morise (Meiji Univ.) SP2022-15 |
When current speech synthesis researchers refer to Vocoder in their papers, they are most likely referring to Neural voc... [more] |
SP2022-15 pp.61-66 |
CQ, CBE (Joint) |
2022-01-27 17:20 |
Ishikawa |
Kanazawa (Ishikawa Pref.) (Primary: On-site, Secondary: Online) |
Proposal and Validation of Packet Layer Quality Estimation Model for Voice Call Applications Itsuki Okada, Takanori Hayashi (HIT) CQ2021-86 |
Quality visualization and quality control during service are important to ensure that voice call applications are used w... [more] |
CQ2021-86 pp.56-61 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-19 15:00 |
Online |
Online |
Simulation of Body-conducted Speech and Synthesis of One's Own Voice with a Sound-proof Earmuff and Bone-conduction Microphones Chen Ruiyan, Nishimura Tazuko, Minematsu Nobuaki, Saito Daisuke (UTokyo) SP2021-15 |
When one hears one's own recorded voice for the first time, one is probably surprised and often disappointed at the... [more] |
SP2021-15 pp.63-68 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 14:05 |
Online |
Online |
[Poster Presentation]
Psychological evaluation of popping-out voice quality Takashi Nakao, Tatsuya Kitamura (Konan Univ.) EA2020-72 SIP2020-103 SP2020-37 |
The present study evaluated the "popping-out" voice quality through auditory tests using a semantic differential method.... [more] |
EA2020-72 SIP2020-103 SP2020-37 pp.74-78 |
MVE, IMQ, IE, CQ (Joint) [detail] |
2021-03-02 15:30 |
Online |
Online |
A Study on Packet Layer Quality Evaluation Model for Voice Call Application Itsuki Okada, Tomoya Seno, Takanori Hayashi (Hiroshima Institute of Technology) CQ2020-114 |
Quality visualization and quality control during service are important to ensure that voice call applications are used w... [more] |
CQ2020-114 pp.34-37 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2020-12-03 16:50 |
Online |
Online |
Stabilize Fundamental Frequency of StarGAN based Voice Conversion Masashi Kimura (ConLab.), Hideyuki Kasuga (ZIKU) NLC2020-20 SP2020-23 |
Virtual YouTubers and Virtual Influencers are getting attention; they are video streamers with avatar appearances created com... [more] |
NLC2020-20 SP2020-23 pp.34-37 |
WIT |
2020-06-12 13:30 |
Online |
Online |
Improving the pronounce clarity of dysarthric speech using CycleGAN Shuhei Imai, Takashi Nose, Aoi Kanagaki (Tohoku Univ.), Satoshi Watanabe (HTS), Akinori Ito (Tohoku Univ.) WIT2020-1 |
Several voice conversion systems have been developed that convert dysarthric speech into healthy speech. The convent... [more] |
WIT2020-1 pp.1-6 |
WIT, SP |
2019-10-26 16:20 |
Kagoshima |
Daiichi Institute of Technology |
Application of interactive and real-time tools to acoustic analysis for assessment of dysphonia Hideki Kawahara (Wakayama Univ.), Ken-Ichi Sakakibara (Health Science Univ. Hokkaido), Kenta Wakasa (ATLUS), Hiroko Terasawa (Univ. Tsukuba) SP2019-24 WIT2019-23 |
We introduce a real-time and interactive tool for visualizing voice-source attributes. The tool provides a simplified ca... [more] |
SP2019-24 WIT2019-23 pp.39-43 |
RCS, SAT (Joint) |
2019-08-23 09:30 |
Aichi |
Nagoya University |
Comparison of Speech Quality between Voice Communication System based on Communication Satellite Network and Satellite Mobile Phone Byeongyo Jeong, Ryouichi Nishimura, Hajime Susukita, Takashi Takahashi (NICT) SAT2019-32 |
Mobile networks become almost useless during large-scale disasters due to physical damage to mobile base stations, el... [more] |
SAT2019-32 pp.79-83 |