Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-02-28 09:50 |
Okinawa |
(Primary: On-site, Secondary: Online) |
End-to-End Speech Synthesis Based on Articulatory Movements Captured by Real-time MRI Yuto Otani, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. Sci.) EA2022-77 SIP2022-121 SP2022-41 |
We propose an end-to-end deep learning model for speech synthesis based on articulatory movements captured by real-time ... [more] |
EA2022-77 SIP2022-121 SP2022-41 pp.13-18 |
SP, IPSJ-SLP, EA, SIP [detail] |
2023-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
A Study on Scheduled Sampling for Neural Transducer-based ASR Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura (NTT) EA2022-100 SIP2022-144 SP2022-64 |
In this paper, we propose scheduled sampling approaches suited for the recurrent neural network-transducer (RNNT) that i... [more] |
EA2022-100 SIP2022-144 SP2022-64 pp.147-152 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2022-12-01 15:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
ASR model adaptation to target domain with large-scale audio data without transcription Takahiro Kinouchi, Daiki Mori (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka (TUT) NLC2022-18 SP2022-38 |
Nowadays, speech recognition is used in various services and businesses thanks to the advent of high-performance models ... [more] |
NLC2022-18 SP2022-38 pp.50-53 |
R |
2022-07-29 13:55 |
Hokkaido |
(Primary: On-site, Secondary: Online) |
A Comparison Study on Image Captioning by VGG and YOLO Yan LYU, Qiangfu Zhao, Yong Liu (UoA) R2022-10 |
Image captioning is a task for generating a descriptive statement automatically for a given image by combining image pro... [more] |
R2022-10 pp.7-12 |
EA, SIP, SP, IPSJ-SLP [detail] |
2022-03-02 10:20 |
Okinawa |
(Primary: On-site, Secondary: Online) |
A Study on Hybrid RNN-T/Attention-based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration Takafumi Moriya, Takanori Ashihara, Atsushi Ando, Hiroshi Sato, Tomohiro Tanaka, Kohei Matsuura, Ryo Masumura, Marc Delcroix (NTT), Takahiro Shinozaki (Tokyo Tech) EA2021-78 SIP2021-105 SP2021-63 |
In this paper we propose improvements to our recently proposed hybrid RNN-T/Attention architecture that includes a share... [more] |
EA2021-78 SIP2021-105 SP2021-63 pp.90-95 |
IN, IA (Joint) |
2021-12-17 17:40 |
Hiroshima |
Higashi-Senda campus, Hiroshima Univ. (Primary: On-site, Secondary: Online) |
[Short Paper]
On the Impact of Communication Link Heterogeneity on Content Delivery Delay in Information-Centric Delay/Disruption-Tolerant Networking Sagayama Hisashi, Ohnishi Michika, Matsuo Ryotaro, Ohsaki Hiroyuki (Kwansei Gakuin Univ.) IA2021-49 |
In recent years, it is expected that ICDTN (Information-Centric Delay/Disruption-Tolerant Networking) incorporating t... [more] |
IA2021-49 pp.93-96 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-19 15:00 |
Online |
Online |
Neural speech synthesis using local phrase dependency structure information Nobuyoshi Kaiki, Sakriani Sakti, Satoshi Nakamura (NAIST) SP2021-23 |
In order to synthesize Japanese speech with natural prosody, we introduce an end-to-end TTS with new prosodic symbol rep... [more] |
SP2021-23 pp.107-112 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 14:05 |
Online |
Online |
[Poster Presentation]
End-to-end incremental TTS with lookahead generation with large pretrained language model Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari (UTokyo) EA2020-74 SIP2020-105 SP2020-39 |
(To be available after the conference date) [more] |
EA2020-74 SIP2020-105 SP2020-39 pp.85-90 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 17:35 |
Online |
Online |
[Short Paper]
Comparison of End-to-End Models for Joint Speaker and Speech Recognition Kak Soky (Kyoto Univ.), Sheng Li (NICT), Masato Mimura, Chenhui Chu, Tatsuya Kawahara (Kyoto Univ.) EA2020-78 SIP2020-109 SP2020-43 |
In this paper, we investigate the effectiveness of using speaker information on the performance of speaker-imbalanced au... [more] |
EA2020-78 SIP2020-109 SP2020-43 pp.109-113 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2020-12-02 09:40 |
Online |
Online |
Fast End-to-End Speech Recognition with CTC and Mask Predict Yosuke Higuchi (Waseda Univ.), Hirofumi Inaguma (Kyoto Univ.), Shinji Watanabe (JHU), Tetsuji Ogawa, Tetsunori Kobayashi (Waseda Univ.) NLC2020-13 SP2020-16 |
We present a fast non-autoregressive (NAR) end-to-end automatic speech recognition (E2E-ASR) framework, which generates ... [more] |
NLC2020-13 SP2020-16 pp.1-6 |
WIT, SP, IPSJ-SLP [detail] |
2020-10-22 13:00 |
Online |
Online |
[Invited Talk]
NHK's activities on Japanese end-to-end speech synthesis Kiyoshi Kurihara (NHK) SP2020-11 WIT2020-12 |
The main business of NHK (Japan Broadcasting Corporation) is the production and broadcasting of programs. Many programs ... [more] |
SP2020-11 WIT2020-12 pp.19-20 |
SP, EA, SIP |
2020-03-02 13:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Data augmentation for ASR system by using locally time-reversed speech
-- Temporal inversion of feature sequence -- Takanori Ashihara, Tomohiro Tanaka, Takafumi Moriya, Ryo Masumura, Yusuke Shinohara, Makio Kashino (NTT) EA2019-110 SIP2019-112 SP2019-59 |
Data augmentation is one of the techniques to mitigate overfitting and improve robustness against several acoustic varia... [more] |
EA2019-110 SIP2019-112 SP2019-59 pp.53-58 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
An Educational Study on Prosodic Symbols and Their Acoustic Realization Using Japanese End-to-end Speech Synthesis Fuki Yoshizawa (UTokyo), Tadashi Kumano (NHK), Nobuaki Minematsu (UTokyo), Kiyoshi Kurihara (NHK) EA2019-137 SIP2019-139 SP2019-86 |
In order to examine the educational effect of presenting prosodic symbols to learners of Japanese, a method was proposed... [more] |
EA2019-137 SIP2019-139 SP2019-86 pp.207-212 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2019-12-06 13:55 |
Tokyo |
NHK Science & Technology Research Labs. |
[Poster Presentation]
Effectiveness of sequence-to-sequence acoustic modeling by using automatic generated labels Kiyoshi Kurihara, Nobumasa Seiyama, Tadashi Kumano (NHK) SP2019-37 |
We have proposed a method that uses yomigana (Japanese character readings) and prosodic symbols as input for sequence-to... [more] |
SP2019-37 pp.49-54 |
MVE, ITE-HI, ITE-SIP [detail] |
2019-06-10 10:30 |
Tokyo |
|
Impression Prediction of Oral Presentation Using LSTM with Dot-product Attention Mechanism Shengzhou Yi, Xueting Wang, Toshihiko Yamasaki (UTokyo) MVE2019-1 |
For automatically evaluating oral presentation, we propose an end-to-end system to predict audience’s impression on spee... [more] |
MVE2019-1 pp.1-6 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
Data augmentation using multiple databases for end-to-end dysarthric speech recognition Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) EA2018-156 SIP2018-162 SP2018-118 |
We present in this paper an end-to-end speech recognition system for a Japanese person with an articulation disorder res... [more] |
EA2018-156 SIP2018-162 SP2018-118 pp.335-340 |
SP |
2019-01-27 09:00 |
Ishikawa |
Kanazawa-Harmonie |
[Tutorial Invited Lecture]
Software components towards end-to-end speech synthesis at NII
-- Tutorial for Tacotron and WaveNet -- Yusuke Yasuda, Xin Wang (NII) SP2018-56 |
This presentation describes recent advances of end-to-end speech synthesis. We introduce major approaches and our method... [more] |
SP2018-56 p.21 |
SP |
2019-01-27 10:40 |
Ishikawa |
Kanazawa-Harmonie |
Evaluation of end-to-end speech synthesis method using speaking styles Kiyoshi Kurihara, Nobumasa Seiyama, Tadashi Kumano, Atsushi Imai (NHK) SP2018-58 |
The purpose of this study was to conduct end-to-end text-to-speech synthesis in Japanese; we developed a system that use... [more] |
SP2018-58 pp.29-34 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2018-12-10 13:15 |
Tokyo |
Waseda Univ. Nishiwaseda Campus |
[Invited Talk]
Review of Automatic Speech Recognition Methodology
-- Outlook of Acoustic-to-Word Model -- Tatsuya Kawahara (Kyoto Univ.) SP2018-48 |
The methodology of speech recognition has been changing due to the introduction of deep learning, in particular end-to-e... [more] |
SP2018-48 pp.25-30 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2018-12-10 16:30 |
Tokyo |
Waseda Univ. Nishiwaseda Campus |
Evaluation of Japanese end-to-end speech synthesis method inputting kana and prosodic symbols Kiyoshi Kurihara, Nobumasa Seiyama, Tadashi Kumano, Atsushi Imai (NHK) SP2018-49 |
The purpose of this study was to conduct end-to-end text-to-speech synthesis in Japanese; we developed a system that use... [more] |
SP2018-49 pp.89-94 |