Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
EA, SIP, SP, IPSJ-SLP |
2025-03-04 09:30 |
Okinawa |
|
[Poster Presentation]
An Analysis of Speaker Representation for Target-Speaker Speech Processing Takanori Ashihara, Takafumi Moriya, Shota Horiguchi (NTT), Junyi Peng (BUT), Tsubasa Ochiai, Marc Delcroix, Kohei Matsuura, Hiroshi Sato (NTT) EA2024-129 SIP2024-164 SP2024-70 |
Target-speaker (TS) speech processing tasks, including TS automatic speech recognition (TS-ASR), target speech extractio... |
EA2024-129 SIP2024-164 SP2024-70 pp.319-324 |
EA, SIP, SP, IPSJ-SLP |
2025-03-04 11:05 |
Okinawa |
|
[Poster Presentation]
Improvement of Speech Recognition Performance for Elderly Speech by Alternating Learning of Acoustic and Linguistic information Kaito Takahashi, Yukoh Wakabayashi (TUT), Kengo Ohta (NIT, Anan College), Norihide Kitaoka (TUT) EA2024-142 SIP2024-177 SP2024-83 |
In recent years, the accuracy of speech recognition technology has significantly improved, leading to its widespread use... |
EA2024-142 SIP2024-177 SP2024-83 pp.391-396 |
SP, NLC, IPSJ-SLP, IPSJ-NL |
2024-12-12 14:50 |
Aichi |
Nagoya Univ. (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Speaker-Discriminative CTC for Multi-Talker Speech Recognition Asahi Sakuma, Hiroaki Sato, Ryuga Sugano, Tadashi Kumano, Yoshihiko Kawai (NHK), Tetsuji Ogawa (Waseda Univ.) NLC2024-20 SP2024-11 |
In this paper, we propose a novel method Speaker-Discriminative CTC (SD-CTC) to improve speech recognition accuracy on c... |
NLC2024-20 SP2024-11 pp.6-11 |
TL |
2024-08-11 11:00 |
Hyogo |
Kwansei Gakuin Univ. |
An effect of pronunciation training using Automatic Speech Recognition on L2 learners’ listening comprehension Fumio Ozawa, Manabu Arai, Yumiko Mizusawa (Seijo Univ.) TL2024-12 |
The current study aims to examine an effect of pronunciation training using Automatic Speech Recognition (ASR) on L2 lea... |
TL2024-12 pp.24-29 |
EA |
2024-05-22 14:15 |
Online |
Online |
Why speech enhancement degrades speech recognition performance? -- Analysis of effect of speech enhancement errors on speech recognition performance -- Tsubasa Ochiai (NTT), Kazuma Iwamoto (Doshisha Univ.), Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki (NTT), Shigeru Katagiri (Doshisha Univ.) EA2024-4 |
Deep learning techniques have dramatically improved the speech enhancement (SE) performance of single-channel SE. Howeve... |
EA2024-4 pp.20-21 |
SIP, SP, EA, IPSJ-SLP |
2024-02-29 15:45 |
Okinawa |
(Primary: On-site, Secondary: Online) |
|
We have developed automatic speech recognition and dialect identification techniques by using COJADS, a corpus of Japane... |
|
SIP, SP, EA, IPSJ-SLP |
2024-03-01 10:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
An Investigation into Weighting Strategies for Model Averaging in Continual Learning for Automatic Speech Recognition Kentaro Shinayama, Hiroshi Sato, Tomoharu Iwata, Takeshi Mori, Taichi Asami (NTT) EA2023-105 SIP2023-152 SP2023-87 |
In recent years, the application scope of speech recognition AI has expanded, enabling the acquisition of diverse data d... |
EA2023-105 SIP2023-152 SP2023-87 pp.262-267 |
SP, NLC, IPSJ-SLP, IPSJ-NL |
2023-12-03 09:30 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
Enhancing Recognition of Rare Words in ASR through Error Detection and Context-Aware Error Correction Jiajun He, Zekun Yang, Tomoki Toda (Nagoya Univ.) NLC2023-16 SP2023-36 |
Automatic speech recognition (ASR) systems often suffer from errors, particularly when recognizing rare words. These err... |
NLC2023-16 SP2023-36 pp.13-18 |
SP, NLC, IPSJ-SLP, IPSJ-NL |
2023-12-03 11:05 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Enhancing Multi-Accent Automated Speech Recognition with Accent-Activated Adapters Yuqin Lin, Longbiao Wang, Jianwu Dang (Tianjin Univ. & Univ. of Tokyo), Nobuaki Minematsu (Univ. of Tokyo) NLC2023-18 SP2023-38 |
This paper proposes the Accent-Activated adapter (AccentAct) approach to address the challenge of speech variations in m... |
NLC2023-18 SP2023-38 pp.25-30 |
SP, NLC, IPSJ-SLP, IPSJ-NL |
2023-12-03 11:05 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Enhancing Dysarthric Speech Recognition with Auxiliary Feature Fusion Module: Exploring Articulatory-related Features from Foundation Models Yuqin Lin, Longbiao Wang, Jianwu Dang (Tianjin Univ. & Univ. of Tokyo), Nobuaki Minematsu (Univ. of Tokyo) NLC2023-19 SP2023-39 |
Addressing dysarthric speech variability in Automatic Speech Recognition (ASR) is crucial for improving human-computer i... |
NLC2023-19 SP2023-39 pp.31-36 |
ET |
2023-10-21 15:30 |
Nagano |
Shinshu University Faculty of Engineering |
"Listening" Performance of Generative AI and Elementary Foreign Language Learners in Code-Switching Discourse Sunaoka Kazuko (Waseda Univ.), Qin Xu (Kyoto Univ.) ET2023-23 |
We used the Whisper model to automatically recognize and process teachers' Japanese and Chinese code-switching (CS) in a... |
ET2023-23 pp.33-37 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
[Poster Presentation]
Generation of colored subtitle images based on emotional information of speech utterances Fumiya Nakamura (Kobe Univ.), Ryo Aihara (Mitsubishi Electric), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.), Yusuke Itani (Mitsubishi Electric) SP2023-11 |
Conventional automatic subtitle generation systems based on speech recognition do not take into account paralinguistic i... |
SP2023-11 pp.54-59 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-24 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Automatic speech recognition model simultaneously recognizes linguistic information and verbal/non-verbal phenomena Nagito Shione, Yukoh Wakabayashi, Norihide Kitaoka (TUT) SP2023-22 |
Although speech recognition technology has advanced in recent years, most of them recognize only linguistic information ... |
SP2023-22 pp.109-113 |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
A Study on Scheduled Sampling for Neural Transducer-based ASR Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura (NTT) EA2022-100 SIP2022-144 SP2022-64 |
In this paper, we propose scheduled sampling approaches suited for the recurrent neural network-transducer (RNNT) that i... |
EA2022-100 SIP2022-144 SP2022-64 pp.147-152 |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Vocabulary-Set Decomposition and Multi-task Learning for Target Vocabulary Extraction in Japanese Speech Recognition Aoi Ito (LINE/Hosei Univ.), Tatsuya Komatsu, Yusuke Fujita (LINE) EA2022-102 SIP2022-146 SP2022-66 |
This paper proposes a target vocabulary extraction method for Japanese speech recognition models based on vocabulary set... |
EA2022-102 SIP2022-146 SP2022-66 pp.159-164 |
NLC, IPSJ-NL, SP, IPSJ-SLP |
2022-11-30 15:30 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Semi-supervised joint training of text to speech and automatic speech recognition using unpaired text data Naoki Makishima, Satoshi Suzuki, Atsushi Ando, Ryo Masumura (NTT) NLC2022-14 SP2022-34 |
This paper presents a novel joint training of text to speech (TTS) and automatic speech recognition (ASR) with small amo... |
NLC2022-14 SP2022-34 pp.27-32 |
NLC, IPSJ-NL, SP, IPSJ-SLP |
2022-12-01 14:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
A Japanese Automatic Speech Recognition System on the Next-Gen Kaldi Framework Wen Shen Teo, Yasuhiro Minami (UEC) NLC2022-16 SP2022-36 |
2021 saw the introduction of the cutting-edge successor to the Kaldi speech processing toolkit, known as Next-Gen Kaldi.... |
NLC2022-16 SP2022-36 pp.39-44 |
SP, IPSJ-MUS, IPSJ-SLP |
2022-06-17 15:00 |
Online |
Online |
Representation and analytical normalization for vocal-tract-length transformation by group theory Atsushi Miyashita, Tomoki Toda (Nagoya Univ.) SP2022-11 |
In automatic speech recognition, a recognition result should be invariant with respect to acoustic changes caused by dif... |
SP2022-11 pp.41-46 |
SIP, BioX, IE, MI, ITE-IST, ITE-ME |
2022-05-20 11:30 |
Kumamoto |
Kumamoto University Kurokami Campus (Primary: On-site, Secondary: Online) |
Implementation of a Lightweight Automatic Speech Recognition System at the Edge Haotian Tan, Junichi Akita (Kanazawa Univ.) |
Automatic speech recognition (ASR) on the cloud has been widely adopted and has demonstrated satisfactory performance. W... |
|
SP, EA, SIP |
2020-03-02 13:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Data augmentation for ASR system by using locally time-reversed speech -- Temporal inversion of feature sequence -- Takanori Ashihara, Tomohiro Tanaka, Takafumi Moriya, Ryo Masumura, Yusuke Shinohara, Makio Kashino (NTT) EA2019-110 SIP2019-112 SP2019-59 |
Data augmentation is one of the techniques to mitigate overfitting and improve robustness against several acoustic varia... |
EA2019-110 SIP2019-112 SP2019-59 pp.53-58 |