Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
EA, SIP, SP, IPSJ-SLP |
2025-03-02 11:40 |
Okinawa |
(Okinawa) |
Speech-Activity-Guided Speaker Embedding Extraction Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix (NTT) |
(To be available after the conference date) |
|
EA, SIP, SP, IPSJ-SLP |
2025-03-03 10:25 |
Okinawa |
(Okinawa) |
Study on a Japanese Speech Understanding Model Robust to Multi-Item Questioning Yuki Takashima, Atsushi Ando, Taichi Asami (NTT) |
|
|
SP, IPSJ-SLP, EA, SIP |
2023-03-01 15:50 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
The linguistic influence on speaker verification based on Self-Supervised Learning Tomoka Wakamatsu (Tokyo Metropolitan Univ.), Atsushi Ando (NTT), Sayaka Shiota (Tokyo Metropolitan Univ.), Ryo Masumura (NTT), Hitoshi Kiya (Tokyo Metropolitan Univ.) EA2022-118 SIP2022-162 SP2022-82 |
In recent years, statistical models utilizing Self-Supervised Learning (SSL) have been employed in various fields.
It ha... |
EA2022-118 SIP2022-162 SP2022-82 pp.247-252 |
NLC, IPSJ-NL, SP, IPSJ-SLP |
2022-11-30 15:30 |
Tokyo |
(Tokyo, Online) (Primary: On-site, Secondary: Online) |
Semi-supervised joint training of text to speech and automatic speech recognition using unpaired text data Naoki Makishima, Satoshi Suzuki, Atsushi Ando, Ryo Masumura (NTT) NLC2022-14 SP2022-34 |
This paper presents a novel joint training of text to speech (TTS) and automatic speech recognition (ASR) with small amo... |
NLC2022-14 SP2022-34 pp.27-32 |
EA, SIP, SP, IPSJ-SLP |
2022-03-02 10:20 |
Okinawa |
(Okinawa, Online) (Primary: On-site, Secondary: Online) |
A Study on Hybrid RNN-T/Attention-based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration Takafumi Moriya, Takanori Ashihara, Atsushi Ando, Hiroshi Sato, Tomohiro Tanaka, Kohei Matsuura, Ryo Masumura, Marc Delcroix (NTT), Takahiro Shinozaki (Tokyo Tech) EA2021-78 SIP2021-105 SP2021-63 |
In this paper we propose improvements to our recently proposed hybrid RNN-T/Attention architecture that includes a share... |
EA2021-78 SIP2021-105 SP2021-63 pp.90-95 |
EA, US, SP, SIP, IPSJ-SLP |
2021-03-03 17:10 |
Online |
Online (Online) |
An investigation of rhythm-based speaker embeddings for phoneme duration modeling Kenichi Fujita, Atsushi Ando, Yusuke Ijima (NTT) EA2020-77 SIP2020-108 SP2020-42 |
In this study, we propose a speaker embedding method suitable for modeling phoneme duration length for each individual i... |
EA2020-77 SIP2020-108 SP2020-42 pp.103-108 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2019-12-06 16:25 |
Tokyo |
NHK Science & Technology Research Labs. (Tokyo) |
An evaluation of representation learning using phoneme posteriorgrams and data augmentation in speech emotion recognition Shintaro Okada (Nagoya Univ.), Atsushi Ando (Nagoya Univ./NTT), Tomoki Toda (Nagoya Univ.) SP2019-43 |
This paper presents a new speech emotion recognition method based on representation learning and data augmentation.
To ... |
SP2019-43 pp.91-96 |
SP |
2019-08-28 17:00 |
Kyoto |
Kyoto Univ. (Kyoto) |
Speech Emotion Classification based on Multi-Label Emotion Existence Estimation Atsushi Ando, Ryo Masumura, Hosana Kamiyama, Satoshi Kobashikawa, Yushi Aono (NTT) SP2019-16 |
This paper presents a novel speech emotion classification that addresses the ambiguous nature of emotions in speech. Mos... |
SP2019-16 pp.39-44 |
EA, SIP, SP |
2019-03-14 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) (Nagasaki) |
[Poster Presentation]
Initial analysis of emotional speech acted in noise Yi Zhao (NII), Atsushi Ando (NTT), Shinji Takaki, Junichi Yamagishi (NII), Satoshi Kobashikawa (NTT) EA2018-120 SIP2018-126 SP2018-82 |
Speakers usually adjust their way of talking in noisy environments involuntarily for effective communication; this adapt... |
EA2018-120 SIP2018-126 SP2018-82 pp.125-130 |
EA, SIP, SP |
2019-03-15 10:25 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) (Nagasaki) |
Neural Language Models based on Conditional Hierarchical Recurrent Encoder-Decoder for Multi-Party Conversational Speech Recognition Ryo Masumura, Tomohiro Tanaka, Atsushi Ando, Takanobu Oba, Yushi Aono (NTT) EA2018-131 SIP2018-137 SP2018-93 |
This paper presents fully neural network based language models (LMs) that can leverage long-range conversational context... |
EA2018-131 SIP2018-137 SP2018-93 pp.191-196 |
EA, SIP, SP |
2019-03-15 10:50 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) (Nagasaki) |
Likability Estimation Model Training of Call-center Agents Based on Annotators' Skills Hosana Kamiyama, Atsushi Ando, Ryo Masumura, Satoshi Kobashikawa, Yushi Aono (NTT) EA2018-132 SIP2018-138 SP2018-94 |
This paper proposes a new technique for estimating the likability of call-center agents.
Most techniques of likability ... |
EA2018-132 SIP2018-138 SP2018-94 pp.197-202 |
NLC, IPSJ-IFAT |
2019-02-07 14:45 |
Kyoto |
Ryukoku University Omiya Campus (Kyoto) |
Call Scene Segmentation based on Neural Networks with Conversational Contexts Ryo Masumura, Tomohiro Tanaka, Atsushi Ando, Hosana Kamiyama, Takanobu Oba, Yushi Aono (NTT) NLC2018-39 |
Call scene segmentation that automatically splits contact center dialogues into several call scenes is useful for constr... |
NLC2018-39 pp.21-26 |
SP, IPSJ-SLP (Joint) |
2017-07-28 11:15 |
Miyagi |
Akiu Resort Hotel Crescent (Miyagi) |
Speaker Diarization for Face-to-Face Dialog of Service Counters Based on Appearance Pattern of Speakers Mizuki Watabe (NTT DOCOMO), Atsushi Ando, Hosana Kamiyama, Satoshi Kobashikawa, Yushi Aono (NTT), Takanobu Oba, Yoshinori Isoda (NTT DOCOMO) SP2017-19 |
This paper proposes a speaker diarization method for face-to-face dialogue of service counters using appearance pattern ... |
SP2017-19 pp.21-26 |
SDM |
2015-06-19 17:10 |
Aichi |
VBL, Nagoya Univ. (Aichi) |
[Invited Lecture]
Fabrication and Characterization of MoS2 MOSFET with High-k/Metal Gate Takahiro Mori (AIST), Naruki Ninomiya (YNU), Noriyuki Uchida, Toshitaka Kubo (AIST), Eiichiro Watanabe, Daiju Tsuya, Satoshi Moriyama (NIMS), Masatoshi Tanaka (YNU), Atsushi Ando (AIST) SDM2015-56 |
We report the device fabrication and characterization of the high-k/metal gate MoS2 MOSFETs. To investigate the scatteri... |
SDM2015-56 pp.99-103 |
SP, IPSJ-SLP |
2012-12-21 14:40 |
Tokyo |
TITECH(Ookayama) (Tokyo) |
Reduction of cross spectrum for feature-domain sound source separation Atsushi Ando (Nagoya Univ.), Kenta Niwa (NTT), Norihide Kitaoka, Kazuya Takeda (Nagoya Univ.) SP2012-93 |
Speech source separation is utilized for recognition of simultaneous speech. Conventional source separation methods, esp... |
SP2012-93 pp.107-112 |
PRMU, SP |
2012-02-10 15:20 |
Miyagi |
(Miyagi) |
Multi-band speech recognition using confidence of blind source separation Atsushi Ando, Hiromasa Ohashi (Nagoya Univ.), Sunao Hara (NAIST), Norihide Kitaoka, Kazuya Takeda (Nagoya Univ.) PRMU2011-234 SP2011-149 |
One of the main applications of Blind Source Separation (BSS) is to improve performance of Automatic Speech Recognition ... |
PRMU2011-234 SP2011-149 pp.219-224 |