Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
EA, SIP, SP, IPSJ-SLP |
2025-03-03 09:45 |
Okinawa |
(Okinawa) |
[Poster Presentation]
Decentralized Independent Vector Analysis Based on Majorization-Minimization Algorithm for Distributed Microphone Arrays Katsuhiro Morita, Kouei Yamaoka, Norihiro Takamune, Hiroshi Saruwatari (UTokyo) |
|
|
EA, SIP, SP, IPSJ-SLP |
2025-03-03 11:20 |
Okinawa |
(Okinawa) |
Proposal and Analysis of Metric for Evaluating Sampling Frequency Independence Based on Local Equivariance Error Kanami Imamura (UTokyo/AIST), Tomohiko Nakamura (AIST), Norihiro Takamune (UTokyo), Kouhei Yatabe (TUAT), Hiroshi Saruwatari (UTokyo) |
|
|
EA, SIP, SP, IPSJ-SLP |
2025-03-03 13:54 |
Okinawa |
(Okinawa) |
Investigation of human perception based CLAPScore Taisei Takano, Yuki Okamoto, Yusuke Kanamori, Yuki Saito (UTokyo), Ryotaro Nagase (Ritsumeikan Univ.), Hiroshi Saruwatari (UTokyo) |
|
|
EA, SIP, SP, IPSJ-SLP |
2025-03-03 14:08 |
Okinawa |
(Okinawa) |
Construction of subjective evaluation dataset for automatic evaluation of input-output relevance in text-to-audio Yusuke Kanamori, Yuki Okamoto, Taisei Takano (UTokyo), Shinnosuke Takamichi (Keio Univ./UTokyo), Yuki Saito, Hiroshi Saruwatari (UTokyo) |
|
|
EA, SIP, SP, IPSJ-SLP |
2025-03-03 14:22 |
Okinawa |
(Okinawa) |
Online Processing for Spatial Voice Conversion Using BSS, VC, and Remixing Kenta Takada, Kentaro Seki, Yuki Saito, Kouei Yamaoka, Yuto Ishikawa, Hiroshi Saruwatari (UTokyo) |
|
|
EA, SIP, SP, IPSJ-SLP |
2025-03-04 11:05 |
Okinawa |
(Okinawa) |
[Poster Presentation]
Noise Self-Supervised Rank-Constrained Spatial Covariance Matrix Estimation Using Independent Deeply Learned Matrix Analysis for Real-Time Multichannel Speech Extraction in Diffuse Noise Environment Yuki Nakanishi, Yuto Ishikawa, Norihiro Takamune, Hiroshi Saruwatari (The Univ. of Tokyo) |
|
|
EA, US (Joint) |
2024-12-20 13:45 |
Oita |
OITAUniv (Oita) |
[Poster Presentation]
Investigation of Projection Back in Spatially Regularized Independent Low-Rank Matrix Analysis Using Simultaneous Steering Vector Estimation Sota Hirata, Norihiro Takamune, Kouei Yamaoka (The University of Tokyo), Daichi Kitamura (NIT, Kagawa), Hiroshi Saruwatari (The University of Tokyo), Yu Takahashi, Kazunobu Kondo (Yamaha Corp.) EA2024-68 |
Blind source separation (BSS) is a technique to separate each source signal from observed mixtures without any prior inf... |
EA2024-68 pp.26-33 |
SIP, SP, EA, IPSJ-SLP |
2024-02-29 10:10 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Noise-Robust Voice Conversion by Denoising Training Conditioned with Latent Variables of Speech Quality and Recording Environment Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi (UT), Ryuichi Yamamoto, Kentaro Tachibana (LY), Hiroshi Saruwatari (UT) EA2023-63 SIP2023-110 SP2023-45 |
In this paper, we propose noise-robust voice conversion by conditioning latent variables representing speech quality and... |
EA2023-63 SIP2023-110 SP2023-45 pp.13-18 |
SIP, SP, EA, IPSJ-SLP |
2024-02-29 10:30 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Accelerating and stabilizing vectorwise coordinate descent for spatially regularized independent low-rank matrix analysis Yuto Ishikawa, Takuya Okubo, Norihiro Takamune (UTokyo), Tomohiko Nakamura (AIST), Daichi Kitamura (NIT Kagawa), Hiroshi Saruwatari (UTokyo), Yu Takahashi, Kazunobu Kondo (Yamaha) EA2023-68 SIP2023-115 SP2023-50 |
Spatially regularized independent low-rank matrix analysis (SR-ILRMA) is the method that introduces the spatial prior in... |
EA2023-68 SIP2023-115 SP2023-50 pp.43-50 |
SIP, SP, EA, IPSJ-SLP |
2024-02-29 15:30 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Adaptation of End-to-End Japanese Speech Synthesis Using Crowdsourced Dialect Accent Labels Yuki Oda, Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari (UTokyo) |
|
|
SIP, SP, EA, IPSJ-SLP |
2024-02-29 15:35 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
SRC4VC: Smartphone-Recorded Corpus for Benchmarking Multi-Speaker Voice Conversion Models Yuki Saito, Takuto Igarashi, Kentaro Seki, Shinnosuke Takamichi (UT), Ryuichi Yamamoto, Kentaro Tachibana (LY), Hiroshi Saruwatari (UT) |
|
|
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Multi-Dialect Speech Synthesis with Interpretable Accent Latent Variable based on VQ-VAE Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari (UTokyo) EA2023-98 SIP2023-145 SP2023-80 |
In this paper, we address two tasks: "Intra-dialect Text-to-Speech (TTS)," aiming to synthesize speech in the same diale... |
EA2023-98 SIP2023-145 SP2023-80 pp.220-225 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 16:05 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Evaluating speech generation based on objective measures for text generation Takaaki Saeki (UTokyo), Soumi Maiti (CMU), Shinnosuke Takamichi (UTokyo), Shinji Watanabe (CMU), Hiroshi Saruwatari (UTokyo) EA2023-133 SIP2023-180 SP2023-115 |
In the evaluation of speech generation, while subjective judgments have long been the gold standard, objective metrics s... |
EA2023-133 SIP2023-180 SP2023-115 pp.421-426 |
EA, US (Joint) |
2023-12-22 13:00 |
Fukuoka |
(Fukuoka) |
[Poster Presentation]
Multichannel Blind Source Separation Using Independent Low-Rank Matrix Analysis with Observed-Signal-Dependent Regularization Based on Spectrogram Consistency Takaaki Kojima, Norihiro Takamune, Sota Misawa (UTokyo), Daichi Kitamura (NIT,Kagawa), Hiroshi Saruwatari (UTokyo) EA2023-51 |
Independent low-rank matrix analysis (ILRMA) is the state-of-the-art technique for blind source separation under the ove... |
EA2023-51 pp.13-20 |
NLC, IPSJ-NL |
2023-03-18 16:40 |
Okinawa |
OIST (Okinawa, Online) (Primary: On-site, Secondary: Online) |
Collection of Textual Expressions in the Wild Toward Voice-quality Control from Free Description Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Hiroshi Saruwatari (UTokyo) NLC2022-29 |
|
NLC2022-29 pp.55-60 |
SP, IPSJ-SLP, EA, SIP |
2023-02-28 16:15 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images Hien Ohnaka (NITTC), Shinnosuke Takamichi (UT), Keisuke Imoto (DU), Yuki Okamoto (Rits), Kazuki Fujii, Hiroshi Saruwatari (UT) EA2022-90 SIP2022-134 SP2022-54 |
(To be available after the conference date) |
EA2022-90 SIP2022-134 SP2022-54 pp.83-88 |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 09:50 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Regularization Term Design Based on Spectrogram Consistency in Independent Low-Rank Matrix Analysis for Multichannel Audio Source Separation Sota Misawa, Norihiro Takamune (UTokyo), Kohei Yatabe (TUAT), Daichi Kitamura (NIT, Kagawa), Hiroshi Saruwatari (UTokyo) EA2022-105 SIP2022-149 SP2022-69 |
It is known that block permutation occurs in the separated signals obtained by independent low-rank matrix analysis. Rec... |
EA2022-105 SIP2022-149 SP2022-69 pp.177-184 |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 11:00 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Representation and Prediction of Accent Phrase Prosodic Features in Japanese Text-to-Speech Masaki Sato, Shinnosuke Takamichi, Hiroshi Saruwatari (The Univ. of Tokyo) EA2022-108 SIP2022-152 SP2022-72 |
In order to use speech synthesis in a variety of situations such as dialogue systems and emotional expression in audiobo... |
EA2022-108 SIP2022-152 SP2022-72 pp.197-202 |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 14:50 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
Corpus construction toward multi-domain empathetic dialogue speech synthesis Yuki Saito, Eiji Iimori, Shinnosuke Takamichi (UT), Kentaro Tachibana (LINE), Hiroshi Saruwatari (UT) |
(To be available after the conference date) |
|
SP, IPSJ-MUS, IPSJ-SLP |
2022-06-17 13:00 |
Online |
Online (Online) |
SP2022-6 |
Rank-constrained spatial covariance matrix estimation (RCSCME) is a method for blind speech extraction. In RCSCME, we de... |
SP2022-6 pp.18-23 |