Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
SIP, SP, EA, IPSJ-SLP
2024-02-29 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Noise-Robust Voice Conversion by Denoising Training Conditioned with Latent Variables of Speech Quality and Recording Environment Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi (UT), Ryuichi Yamamoto, Kentaro Tachibana (LY), Hiroshi Saruwatari (UT) EA2023-63 SIP2023-110 SP2023-45 |
In this paper, we propose noise-robust voice conversion by conditioning latent variables representing speech quality and...
EA2023-63 SIP2023-110 SP2023-45 pp.13-18 |
WIT, SP, IPSJ-SLP
2023-10-14 16:40 |
Fukuoka |
Kyushu Institute of Technology (Primary: On-site, Secondary: Online) |
Sequence-to-sequence Voice Conversion for Electrolaryngeal Speech Enhancement with Multi-stage Pretraining and Fine-tuning Techniques Ding Ma, Lester Phillip Violeta, Kazuhiro Kobayashi, Tomoki Toda (Nagoya Univ.) SP2023-32 WIT2023-23 |
Sequence-to-sequence (seq2seq) voice conversion (VC) models have great potential for electrolaryngeal (EL) speech to nor...
SP2023-32 WIT2023-23 pp.27-32 |
SP, IPSJ-MUS, IPSJ-SLP
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Impression Conversion of Speech for Unknown Speakers Using FaderNet Saki Kugimoto, Toru Nakashika (UEC) SP2023-2 |
This paper proposes a model that can convert impressions of unknown speakers who do not have impression labels, based on...
SP2023-2 pp.4-7 |
SP, IPSJ-MUS, IPSJ-SLP
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
[Poster Presentation]
Opera-singing voice synthesis using Diff-SVC Aoto Sugahara (Kobe Univ.), Soma Kishimoto, Yuji Adachi, Kiyoto Tai (MEC Company Ltd.), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.) SP2023-7
Singing voice synthesis technology is widely used in the entertainment field, and it has attracted attention as a method to ...
SP2023-7 pp.30-35 |
PRMU, IBISML, IPSJ-CVIM
2023-03-03 16:50 |
Hokkaido |
Future University Hakodate (Primary: On-site, Secondary: Online) |
Parallel-Data-Free Japanese Singer Conversion using CycleGAN Considering Perceptual Loss in Singing Phoneme Sequences Kanade Gemmoto, Nobutaka Shimada, Tadashi Matsuo (Ritsumeikan Univ) PRMU2022-114 IBISML2022-121 |
This paper proposes a one-to-one Japanese Singing Voice Conversion (SVC) method without using parallel data.
Our method...
PRMU2022-114 IBISML2022-121 pp.293-298 |
SP, IPSJ-SLP, EA, SIP
2023-03-01 10:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Diffusion-based parallel voice conversion with source-feature condition Takuya Kishida, Toru Nakashika (UEC) EA2022-107 SIP2022-151 SP2022-71 |
We propose a voice conversion method based on a diffusion probabilistic model trained on a parallel dataset. Since the d...
EA2022-107 SIP2022-151 SP2022-71 pp.191-196 |
SP, IPSJ-SLP, EA, SIP
2023-03-01 11:20 |
Okinawa |
(Primary: On-site, Secondary: Online) |
An Investigation of Text-to-Speech Synthesis Using Voice Conversion and x-vector Embedding Sympathizing Emotion of Input Audio for Spoken Dialogue Systems Shunichi Kohara, Masanobu Abe, Sunao Hara (Okayama Univ.) EA2022-109 SIP2022-153 SP2022-73 |
In this paper, we propose a Text-to-Speech synthesis method to synthesize the same emotional expression as the input spe...
EA2022-109 SIP2022-153 SP2022-73 pp.203-208 |
EA, US (Joint) |
2022-12-22 16:50 |
Hiroshima |
Satellite Campus Hiroshima |
[Poster Presentation]
Data augmentation method for machine learning on speech data Tsubasa Maruyama (Tokyo Tech), Tsutomu Ikegami (AIST), Toshio Endo (Tokyo Tech), Takahiro Hirofuchi (AIST) EA2022-68 |
In machine learning, data augmentation is a method to enhance the number and diversity of data by adding transformations...
EA2022-68 pp.42-48 |
CCS |
2022-11-18 09:00 |
Mie |
(Primary: On-site, Secondary: Online) |
Voice Quality Conversion by Two-Step Process of Speech Feature Extraction and Speaker-Controlled Speech Synthesis Taichi Fukawa, Kenya Jin'no (Tokyo City Univ.) CCS2022-52 |
Many methods have been proposed in the field of voice quality conversion that use a style-transforming autoencoder. Howe...
CCS2022-52 pp.47-52 |
HCS |
2022-08-27 15:15 |
Hyogo |
(Primary: On-site, Secondary: Online) |
A Study of Feedback Methods for Speakers in Speech Rate Converted Conversation
-- Comparative evaluation for adaptive switching between audio feedback and visual feedback -- Kazuma Ban (Tokyo Denki Univ.), Hiroko Tokunaga (Tokyo Denki Univ./RIKEN), Naoki Mukawa, Hiroto Saito (Tokyo Denki Univ.) HCS2022-47 |
Speech rate conversion is a useful technique for people who need assistance with listening comprehension and non-native sp...
HCS2022-47 pp.61-66 |
NS, SR, RCS, SeMI, RCC (Joint) |
2022-07-13 14:50 |
Ishikawa |
The Kanazawa Theatre + Online (Primary: On-site, Secondary: Online) |
Investigation of noise removal using U-Net and voice recognition performance improvement
-- for train running noise -- Jian Lin, Shota Sano, Yuusuke Kawakita, Tsuyoshi Miyazaki, Hiroshi Tanaka (KAIT) SeMI2022-26 |
A method for converting noisy sound into images to remove the noise has been proposed. We are attempting to remove train...
SeMI2022-26 pp.34-39 |
SP, IPSJ-MUS, IPSJ-SLP
2022-06-17 15:00 |
Online |
Online |
Study of End-to-End Text-to-Speech that can seamlessly control speaker's individuality by Manipulating Speaker features Naoki Aotani, Sunao Hara, Masanobu Abe (Okayama Univ.) SP2022-14
In this paper, we investigate an End-to-End speech synthesis scheme that enables seamless control of speaker individua...
SP2022-14 pp.55-60 |
SP, IPSJ-MUS, IPSJ-SLP
2022-06-18 10:50 |
Online |
Online |
[Invited Talk]
Crazy vocoder is unbreakable
-- But let's talk about an informal vision of the future -- Masanori Morise (Meiji Univ.) SP2022-15 |
When current speech synthesis researchers refer to Vocoder in their papers, they are most likely referring to Neural voc...
SP2022-15 pp.61-66 |
HCS |
2022-03-12 10:10 |
Online |
Online |
Evaluation of Feedback Methods for Speakers in Speech Rate Converted Conversation Tamami Mizuta, Hiroko Tokunaga, Naoki Mukawa, Hiroto Saito (Tokyo Denki Univ.) HCS2021-70 |
This study clarifies the characteristics of voice feedback and visual feedback, which are support functions for speakers ...
HCS2021-70 pp.55-60 |
WIT, IPSJ-AAC |
2022-03-08 10:55 |
Online |
Online |
A study on high-intelligibility speech synthesis of dysarthric speakers using voice conversion from normal speech and multi-speaker vocoder Tetsuro Takano (HTS), Takashi Nose, Aoi Kanagaki (Tohoku Univ.), Satoshi Watanabe (HTS) WIT2021-46 |
In this study, we investigated the possibility of generating intelligible synthetic speech by converting the voice of a ...
WIT2021-46 pp.18-23 |
EA, SIP, SP, IPSJ-SLP
2022-03-02 11:35 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Study of Method for Improving Speech Intelligibility in Glossectomy Patients by Knowledge Distillation via Lip Features Kazushi Takashima, Masanobu Abe, Sunao Hara (Okayama Univ.) EA2021-81 SIP2021-108 SP2021-66 |
In this paper, we propose a voice conversion method for improving the intelligibility of speech uttered by glossectomy patients...
EA2021-81 SIP2021-108 SP2021-66 pp.108-113 |
NLC, IPSJ-NL, SP, IPSJ-SLP
2021-12-03 10:30 |
Online |
Online |
An approach to voice conversion for manipulating emotion dimensions Keita Mukada, Hiroki Mori (Utsunomiya Univ.) NLC2021-25 SP2021-46 |
We propose an emotional voice conversion method based on the emotion dimensions. Conventional emotional voice conversion...
NLC2021-25 SP2021-46 pp.39-41 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-19 15:00 |
Online |
Online |
Simulation of Body-conducted Speech and Synthesis of One's Own Voice with a Sound-proof Earmuff and Bone-conduction Microphones Chen Ruiyan, Nishimura Tazuko, Minematsu Nobuaki, Saito Daisuke (UTokyo) SP2021-15 |
When one hears his/her recorded voice for the first time, s/he is probably surprised and not rarely disappointed at the...
SP2021-15 pp.63-68 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-19 15:00 |
Online |
Online |
Preliminary study on synthesizing relaxing voices
-- from a perspective of recognized/evoked emotions and acoustic features -- Yuki Watanabe, Shuichi Sakamoto (Tohoku Univ.), Takayuki Hoshi, Yoshiki Nagatani, Manabu Nakano (Pixie Dust Technologies) SP2021-19 |
The goal of this study is to synthesize speech sound which induces relaxed emotion. As a preliminary study, we investi...
SP2021-19 pp.85-90 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-19 15:00 |
Online |
Online |
Unseen speaker's Voice Conversion by FaderNetVC with Speaker Feature Extractor Takumi Isako, Takuya Kishida, Toru Nakashika (UEC) SP2021-20 |
In recent years, many voice conversion models using Deep Neural Network (DNN) have been proposed, and FaderNetVC is one ...
SP2021-20 pp.91-96 |