Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
ICD |
2023-04-10 13:20 |
Kanagawa |
(Primary: On-site, Secondary: Online) |
[Invited Talk]
Novel scheme of HZO/Si FeFET reservoir computing for speech recognition Eishin Nako, Kasidit Toprasertpong, Ryosho Nakane, Mitsuru Takenaka, Shinichi Takagi (The Univ. of Tokyo) |
(To be available after the conference date) [more] |
|
ET |
2023-03-14 14:10 |
Tokushima |
Tokushima University (Primary: On-site, Secondary: Online) |
HMD-type customer service training support system using eye tracking Takeru Oue, Yukihiro Matsubara, Kousuke Mouri, Masaru Okamoto (Hiroshima City Univ.) ET2022-71 |
In this paper, a customer service training support system using an HMD and eye tracking is developed. By using this... [more] |
ET2022-71 pp.73-78 |
SIS |
2023-03-03 11:10 |
Chiba |
Chiba Institute of Technology (Primary: On-site, Secondary: Online) |
Investigation of introducing data augmentation methods to improve speech enhancement performance Reito Kasuga, Yosuke Sugiura, Nozomiko Yasui, Tetsuya Shimamura (Saitama Univ.) SIS2022-52 |
The field of speech enhancement has been extensively researched worldwide, and many speech enhancement methods have been... [more] |
SIS2022-52 pp.64-69 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
A Study on Scheduled Sampling for Neural Transducer-based ASR Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura (NTT) EA2022-100 SIP2022-144 SP2022-64 |
In this paper, we propose scheduled sampling approaches suited for the recurrent neural network-transducer (RNNT) that i... [more] |
EA2022-100 SIP2022-144 SP2022-64 pp.147-152 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 10:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Vocabulary-Set Decomposition and Multi-task Learning for Target Vocabulary Extraction in Japanese Speech Recognition Aoi Ito (LINE/Hosei Univ.), Tatsuya Komatsu, Yusuke Fujita (LINE) EA2022-102 SIP2022-146 SP2022-66 |
This paper proposes a target vocabulary extraction method for Japanese speech recognition models based on vocabulary set... [more] |
EA2022-102 SIP2022-146 SP2022-66 pp.159-164 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 13:45 |
Okinawa |
(Primary: On-site, Secondary: Online) |
[Invited Talk]
Speech and Language Research in the Google Tokyo Office Michiel Bacchiani (Google) EA2022-116 SIP2022-160 SP2022-80 |
This talk will consist of three parts. In the first part of the talk, I will reflect on some lessons learned from the ac... [more] |
EA2022-116 SIP2022-160 SP2022-80 pp.239-240 |
EA, US (Joint) |
2022-12-22 16:50 |
Hiroshima |
Satellite Campus Hiroshima |
[Poster Presentation]
Data augmentation method for machine learning on speech data Tsubasa Maruyama (Tokyo Tech), Tsutomu Ikegami (AIST), Toshio Endo (Tokyo Tech), Takahiro Hirofuchi (AIST) EA2022-68 |
In machine learning, data augmentation is a method to enhance the number and diversity of data by adding transformations... [more] |
EA2022-68 pp.42-48 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2022-11-30 15:30 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Semi-supervised joint training of text to speech and automatic speech recognition using unpaired text data Naoki Makishima, Satoshi Suzuki, Atsushi Ando, Ryo Masumura (NTT) NLC2022-14 SP2022-34 |
This paper presents a novel joint training of text to speech (TTS) and automatic speech recognition (ASR) with small amo... [more] |
NLC2022-14 SP2022-34 pp.27-32 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2022-12-01 14:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
A Japanese Automatic Speech Recognition System on the Next-Gen Kaldi Framework Wen Shen Teo, Yasuhiro Minami (UEC) NLC2022-16 SP2022-36 |
2021 saw the introduction of the cutting-edge successor to the Kaldi speech processing toolkit, known as Next-Gen Kaldi.... [more] |
NLC2022-16 SP2022-36 pp.39-44 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2022-12-01 15:20 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Domain and language adaptation of large-scale pretrained model for speech recognition of low-resource language Kak Soky (Kyoto University), Sheng Li (NICT), Chenhui Chu, Tatsuya Kawahara (Kyoto University) NLC2022-17 SP2022-37 |
Self-supervised learning (SSL) models are effective for automatic speech recognition (ASR). Due to the huge paramete... [more] |
NLC2022-17 SP2022-37 pp.45-49 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2022-12-01 15:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
ASR model adaptation to target domain with large-scale audio data without transcription Takahiro Kinouchi, Daiki Mori (TUT), Ogawa Atsunori (NTT), Norihide Kitaoka (TUT) NLC2022-18 SP2022-38 |
Nowadays, speech recognition is used in various services and businesses thanks to the advent of high-performance models ... [more] |
NLC2022-18 SP2022-38 pp.50-53 |
SP, WIT, IPSJ-SLP [detail] |
2022-10-22 15:40 |
Kyoto |
Kyoto University (Primary: On-site, Secondary: Online) |
Conformer based early fusion model for audio-visual speech recognition Nobukazu Aoki, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Sci.) SP2022-28 WIT2022-3 |
Previous studies of late fusion models with conformer encoders use independent encoders for both visual and audio inform... [more] |
SP2022-28 WIT2022-3 pp.8-13 |
SIS, ITE-BCT |
2022-10-13 14:15 |
Aomori |
Hachinohe Institute of Technology (Primary: On-site, Secondary: Online) |
Toward Improving Speech Naturalness Introducing a Capsule Structure for Speech Enhancement Networks Reito Kasuga, Tetsuya Shimamura, Yosuke Sugiura, Nozomiko Yasui (Saitama Univ.) SIS2022-12 |
Although the field of speech enhancement has been extensively studied around the world, phase tends to be neglected comp... [more] |
SIS2022-12 pp.7-12 |
SIP |
2022-08-26 14:08 |
Okinawa |
Nobumoto Ohama Memorial Hall (Ishigaki Island) (Primary: On-site, Secondary: Online) |
Study on Bone-conducted Speech Enhancement Using Vector-quantized Variational Autoencoder and Gammachirp Filterbank Cepstral Coefficients Quoc-Huy Nguyen, Masashi Unoki (JAIST) SIP2022-71 |
Bone-conducted (BC) speech potentially avoids the undesired effects on recorded speech due to background noise or reverb... [more] |
SIP2022-71 pp.109-114 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2022-06-17 15:00 |
Online |
Online |
Representation and analytical normalization for vocal-tract-length transformation by group theory Atsushi Miyashita, Tomoki Toda (Nagoya Univ.) SP2022-11 |
In automatic speech recognition, a recognition result should be invariant with respect to acoustic changes caused by dif... [more] |
SP2022-11 pp.41-46 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2022-06-18 13:00 |
Online |
Online |
[Poster Presentation]
Proposal of Speech Content Conversion and the Initial Trial: Conversion of Linguistic Information Depending on Situations Kohei Takita, Saizo Aoyagi, Tatsunori Hirai (Komazawa Univ.) SP2022-19 |
It is important to speak dialects, honorifics, and simple words for listeners and the environment in order to smooth com... [more] |
SP2022-19 pp.82-87 |
IMQ |
2022-05-27 13:35 |
Tokyo |
|
Implementation of subtitling system using AR and study of display position Suga Masaki, Tetsuya Matsumoto (Nagoya Univ.), Yoshinori Takeuchi (Daido Univ.), Hiroaki Kudo (Nagoya Univ.) IMQ2022-1 |
Sign language interpretation and captioning are used as substitute information for hearing impaired people. One of the p... [more] |
IMQ2022-1 pp.1-6 |
SIP, BioX, IE, MI, ITE-IST, ITE-ME [detail] |
2022-05-20 11:30 |
Kumamoto |
Kumamoto University Kurokami Campus (Primary: On-site, Secondary: Online) |
Implementation of a Lightweight Automatic Speech Recognition System at the Edge Haotian Tan, Junichi Akita (Kanazawa Univ.) |
Automatic speech recognition (ASR) on the cloud has been widely adopted and has demonstrated satisfactory performance. W... [more] |
|
EA, SIP, SP, IPSJ-SLP [detail] |
2022-03-01 12:45 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Incorporating Acoustic and Textual Information for Language Modeling in Code-switching Speech Recognition Roland Hartanto, Kuniaki Uto, Koichi Shinoda (Tokyo Tech) EA2021-73 SIP2021-100 SP2021-58 |
People who speak two or more languages tend to alternate between languages when they are speaking. This particular phenomenon... [more] |
EA2021-73 SIP2021-100 SP2021-58 pp.56-63 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2021-12-02 15:20 |
Online |
Online |
Improvement of multilingual speech emotion recognition by normalizing features using CRNN Jinhai Qi, Motoyuki Suzuki (OIT) NLC2021-22 SP2021-43 |
In this research, a new multilingual emotion recognition method by normalizing features using CRNN has been proposed. We... [more] |
NLC2021-22 SP2021-43 pp.22-26 |