Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SP, NLC, IPSJ-SLP, IPSJ-NL [detail] |
2023-12-03 10:00 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
Improvement of Tacotron2 text-to-speech model based on masking operation and positional attention mechanism Tong Ma, Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) NLC2023-17 SP2023-37 |
[more] |
NLC2023-17 SP2023-37 pp.19-24 |
SP, NLC, IPSJ-SLP, IPSJ-NL [detail] |
2023-12-03 11:05 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Integration of Throat Microphone Recording and Bandwidth Extension for Robust Assessment of L2 Listening Yu Xu, Nobuaki Minematsu, Daisuke Saito (Univ. of Tokyo) NLC2023-20 SP2023-40 |
In an active classroom, L2 assessment is often a challenging issue, since everyone in the crowded classroom can be a noi... [more] |
NLC2023-20 SP2023-40 pp.37-42 |
SP, NLC, IPSJ-SLP, IPSJ-NL [detail] |
2023-12-03 11:05 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Self-supervised learning model based emotion transfer and intensity control technology for expressive speech synthesis Wei Li, Nobuaki Minematsu, Daisuke Saito (Univ. of Tokyo) NLC2023-21 SP2023-41 |
Emotion transfer techniques, which transfer the speaking style from the reference speech to the target speech, are wi... [more] |
NLC2023-21 SP2023-41 pp.43-48 |
WIT, SP, IPSJ-SLP [detail] |
2023-10-14 16:15 |
Fukuoka |
Kyushu Institute of Technology (Primary: On-site, Secondary: Online) |
Comparative study on different speaker embedding spaces focusing on the relation to perceptual inter-speaker similarity Wakuto Morita, Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) SP2023-31 WIT2023-22 |
This study examines the correspondence between inter-speaker similarity based on speaker embeddings and perceptual speak... [more] |
SP2023-31 WIT2023-22 pp.21-26 |
PN |
2023-08-29 16:40 |
Hokkaido |
(Primary: On-site, Secondary: Online) |
Capacity Enhancement of Resilient Optical Networks with Multi-band Virtual Bypass Links Daisuke Saito, Yojiro Mori (Nagoya Univ.), Kohei Hosokawa, Shigeyuki Yanagimachi (NEC), Hiroshi Hasegawa (Nagoya Univ.) PN2023-26 |
We propose a cost-effective capacity enhancement for resilient networks that adopt dedicated path protection. This enhan... [more] |
PN2023-26 pp.55-58 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 11:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Choral Singing Voice Synthesis with Modulation Acoustic Features Sora Miyazawa, Anan Kikuchi, Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2022-110 SIP2022-154 SP2022-74 |
In this paper, we analyzed the sense of multiple singing focused on unison and implemented it for a singing voice synt... [more] |
EA2022-110 SIP2022-154 SP2022-74 pp.209-214 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 11:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Predominant Instrument Recognition in Polyphonic Music Based on Transfer Learning with Vanilla ResNet-50 Lifan Zhong, Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2022-114 SIP2022-158 SP2022-78 |
Instrument recognition is an active research field in MIR (Music Information Retrieval) and has great potential for real... [more] |
EA2022-114 SIP2022-158 SP2022-78 pp.232-237 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 16:50 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Effects of Voice Artificiality on the Degree of Compatibility between Voice and Appearance of Voice Agents Kota Iura, Naotake Masuda, Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2022-121 SIP2022-165 SP2022-85 |
For a spoken agent such as interactive robots, it is important to use a voice that fits the image of the agent in terms ... [more] |
EA2022-121 SIP2022-165 SP2022-85 pp.264-269 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 17:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Quantification of Voice Register Information including Mixed Voice based on Class Posterior Probabilities Yu Kitamura, Anan Kikuchi, Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2022-122 SIP2022-166 SP2022-86 |
Methods to distinguish between modal and falsetto have been proposed so far, but there are few studies analyzing mixed ... [more] |
EA2022-122 SIP2022-166 SP2022-86 pp.270-275 |
MWPTHz, PN, EMT, IEE-EMT [detail] |
2023-01-24 09:40 |
Osaka |
(Primary: On-site, Secondary: Online) |
Cost-effective Network Capacity Expansion by Supplemental Multi-band Transmission on Congested Links Daisuke Saito, Yojiro Mori (Nagoya Univ.), Kohei Hosokawa, Shigeyuki Yanagimachi (NEC), Hiroshi Hasegawa (Nagoya Univ.) PN2022-36 EMT2022-74 MWPTHz2022-62 |
A cost-effective capacity enhancement for photonic networks is proposed, which adopts multi-band transmission only on li... [more] |
PN2022-36 EMT2022-74 MWPTHz2022-62 pp.30-34 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2021-12-02 11:30 |
Online |
Online |
Multi-faceted assessment of language learners' ability of perception and production of English speech based on shadowing Takuya Kunihara, Chuanbo Zhu, Daisuke Saito, Nobuaki Minematsu (UTokyo), Noriko Nakanishi (KGU) NLC2021-19 SP2021-40 |
(To be available after the conference date) [more] |
NLC2021-19 SP2021-40 pp.7-12 |
SDM, ICD, ITE-IST [detail] |
2021-08-18 09:30 |
Online |
Online |
[Invited Talk]
Analog in-memory computing in FeFET based 1T1R array for low-power edge AI applications Daisuke Saito, Toshiyuki Kobayashi, Hiroki Koga (SONY), Yusuke Shuto, Jun Okuno, Kenta Konishi (SSS), Masanori Tsukamoto, Kazunobu Ohkuri (SONY), Taku Umebayashi (SSS), Takayuki Ezaki (SONY) SDM2021-36 ICD2021-7 |
Deep neural network (DNN) inference for edge AI requires low-power operation, which can be achieved by implementing mass... [more] |
SDM2021-36 ICD2021-7 pp.33-37 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 14:05 |
Online |
Online |
[Poster Presentation]
Investigation of DNN-based speech synthesis utilizing oral reading skills obtained from large scale subjective evaluation Shun Akui (UTokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2020-71 SIP2020-102 SP2020-36 |
So far, we have suggested the value of 'oral reading skill' based on a listening evaluation experiment as a quantit... [more] |
EA2020-71 SIP2020-102 SP2020-36 pp.68-73 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-04 10:15 |
Online |
Online |
A quantitative measure of discriminability between NMF dictionaries Eisuke Konno, Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2020-82 SIP2020-113 SP2020-47 |
Supervised nonnegative matrix factorization (NMF) is a popular approach for monaural audio source separation. It realize... [more] |
EA2020-82 SIP2020-113 SP2020-47 pp.134-139 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
Implementation of a high-accuracy method for automatic fluency scoring of spontaneous English utterances by Japanese learners Ayano Yasukagawa, Shintaro Ando, Eisuke Konno, Zhenchao Lin, Yusuke Inoue, Daisuke Saito, Nobuaki Minematsu (UTokyo), Kazuya Saito (UCL) EA2019-134 SIP2019-136 SP2019-83 |
These days, many teachers claim the importance of not native-likeness-based but intelligibility-based assessment of pronunci... [more] |
EA2019-134 SIP2019-136 SP2019-83 pp.189-194 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
Initial analysis of oral reading skills obtained from large scale subjective evaluation Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) EA2019-135 SIP2019-137 SP2019-84 |
The speech of professional newscasters easily suggests their occupation, that is, newscaster. So far, we have analyzed pr... [more] |
EA2019-135 SIP2019-137 SP2019-84 pp.195-200 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
Automatic estimation of prosodic control made in English utterances using DNN-based acoustic models trained with prosodic features and labels Yang Shen, Shintarou Ando, Nobuaki Minematsu, Daisuke Saito (UTokyo), Satoshi Kobashikawa (NTT) EA2019-136 SIP2019-138 SP2019-85 |
This paper investigates how to utilize DNN acoustic models trained with prosodic features and labels to detect prosodic e... [more] |
EA2019-136 SIP2019-138 SP2019-85 pp.201-206 |
SP |
2019-08-28 14:40 |
Kyoto |
Kyoto Univ. |
[Poster Presentation]
Analysis of prosodic differences between a newscaster and amateur speakers using partial-substituted synthetic speech Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) SP2019-11 |
This paper analyzes prosodic differences between a professional newscaster and amateur speakers which affect listeners’... [more] |
SP2019-11 pp.13-18 |
SP |
2019-06-13 14:20 |
Kanagawa |
Tokyo Institute of Technology |
A large collection of sentences read aloud by Vietnamese learners of Japanese and native speakers' reverse shadowings Shintaro Ando, Tasavat Trisitichoke, Yusuke Inoue, Fuki Yoshizawa, Daisuke Saito, Nobuaki Minematsu (UTokyo) SP2019-3 |
The main objective of language learning is to acquire good communication skills in the target language. From that viewp... [more] |
SP2019-3 pp.13-17 |
SP |
2019-06-13 14:45 |
Kanagawa |
Tokyo Institute of Technology |
Evaluation of Comprehensibility of L2 Speech Based on Native Listeners’ Reverse Shadowing and Their Facial Expressions Tasavat Trisitichoke, Shintaro Ando, Daisuke Saito, Nobuaki Minematsu (UTokyo) SP2019-4 |
Recently, researchers' attention has been paid to pronunciation assessment not based on comparison between L2 utterances... [more] |
SP2019-4 pp.19-24 |