Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
SIP, SP, EA, IPSJ-SLP |
2024-02-29 10:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Multi-task learning with age information model for highly accurate elderly speech recognition Takumi Shine, Takahiro Kinouchi, Yukoh Wakabayashi, Norihide Kitaoka (TUT) EA2023-64 SIP2023-111 SP2023-46 |
Speech recognition for the elderly is less accurate, especially in smart speaker applications, due to aging-rel... |
EA2023-64 SIP2023-111 SP2023-46 pp.19-24 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Constructing and Evaluating a Batch Voice Input System for Electronic Medical Records Using Large Language Models Ryo Maejima, Norihide Kitaoka (TUT) EA2023-99 SIP2023-146 SP2023-81 |
This study aims to develop an electronic medical record with a voice input interface that lets users input several items... |
EA2023-99 SIP2023-146 SP2023-81 pp.226-231 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Domain adaptation of speech recognition model based on multilingual SSL model with only nonparallel corpus Takahiro Kinouchi (TUT), Atsunori Ogawa (NTT), Yukoh Wakabayashi (TUT), Kengo Ohta (NITAC), Norihide Kitaoka (TUT) EA2023-100 SIP2023-147 SP2023-82 |
Automatic speech recognition (ASR) models are used in various services and businesses, and each domain’s recognition acc... |
EA2023-100 SIP2023-147 SP2023-82 pp.232-237 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Improving speech recognition system consisting of multiple speech recognition models Keigo Hojo, Yukoh Wakabayashi (TUT), Kengo Ohta (NITAC), Atsunori Ogawa (NTT), Norihide Kitaoka (TUT) EA2023-101 SIP2023-148 SP2023-83 |
EA2023-101 SIP2023-148 SP2023-83 pp.238-243 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Evaluation of Automatic Speech Recognition for Deaf and Hard-of-Hearing People by Speaker Adaptation Kaito Takahashi, Takahiro Kinouchi, Yukoh Wakabayashi (TUT), Kengo Ohta (NITAC), Akio Kobayashi (Yamato Univ.), Norihide Kitaoka (TUT) EA2023-102 SIP2023-149 SP2023-84 |
Communication between normal-hearing people and deaf people generally uses sign language, written communication, and spe... |
EA2023-102 SIP2023-149 SP2023-84 pp.244-249 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 10:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Intermediate speaker speech synthesis between two speakers using x-vector speaker space Sota Hosoi, Takahiro Kinouchi, Yukoh Wakabayashi, Norihide Kitaoka (TUT) EA2023-103 SIP2023-150 SP2023-85 |
Recent advancements in speech synthesis technology have enabled the synthesis of speech from speakers not in the train... |
EA2023-103 SIP2023-150 SP2023-85 pp.250-255 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 10:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Substitution of Implicit Linguistic Information in Beam Search Decoding Using CTC-based Speech Recognition Models Tatsunari Takagi, Yukoh Wakabayashi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka (TUT) EA2023-106 SIP2023-153 SP2023-88 |
The rise of neural networks in the field of automatic speech recognition has notably improved the accuracy of speech rec... |
EA2023-106 SIP2023-153 SP2023-88 pp.268-273 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Streaming End-to-End speech recognition using a CTC decoder with substituted linguistic information Tatsunari Takagi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka, Yukoh Wakabayashi (TUT) SP2023-12 |
Speech recognition technology has been employed in various fields due to the enhancement of speech recognition model acc... |
SP2023-12 pp.60-64 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-24 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Domain adaptation of speech recognition models based on self-supervised learning using target domain speech Takahiro Kinouchi (TUT), Atsunori Ogawa (NTT), Yukoh Wakabayashi, Norihide Kitaoka (TUT) SP2023-19 |
In this study, we propose a domain adaptation method using only speech data in the target domain without using transcrib... |
SP2023-19 pp.91-96 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-24 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Automatic speech recognition model simultaneously recognizes linguistic information and verbal/non-verbal phenomena Nagito Shione, Yukoh Wakabayashi, Norihide Kitaoka (TUT) SP2023-22 |
Although speech recognition technology has advanced in recent years, most systems recognize only linguistic information ... |
SP2023-22 pp.109-113 |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 15:05 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Construction of Language Model for Low-resource Domain Speech Recognition Based on Sentence Generation Ryo Maejima, Daiki Mori, Yukoh Wakabayashi, Norihide Kitaoka (TUT) |
SP, IPSJ-SLP, EA, SIP |
2023-03-01 15:10 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Automatic Speech Recognition model using data with verbal and non-verbal information tag Nagito Shione, Yukoh Wakabayashi, Norihide Kitaoka (TUT) |
NLC, IPSJ-NL, SP, IPSJ-SLP |
2022-11-29 14:35 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Density Ratio Approach-based multiple Encoder-Decoder ASR model integration Keigo Hojo, Daiki Mori, Yukoh Wakabayashi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka (TUT) NLC2022-10 SP2022-30 |
One of the methods to improve the performance of Encoder-Decoder speech recognition is the integration of ASR models... |
NLC2022-10 SP2022-30 pp.5-9 |
NLC, IPSJ-NL, SP, IPSJ-SLP |
2022-12-01 15:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
ASR model adaptation to target domain with large-scale audio data without transcription Takahiro Kinouchi, Daiki Mori (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka (TUT) NLC2022-18 SP2022-38 |
Nowadays, speech recognition is used in various services and businesses thanks to the advent of high-performance models ... |
NLC2022-18 SP2022-38 pp.50-53 |
WIT, SP, IPSJ-SLP |
2020-10-22 14:10 |
Online |
Online |
Early Dementia Detection based on Speech and Language Information Maina Umezawa, Yurie Iribe (Aichi Prefectural Univ.), Norihide Kitaoka (Toyohashi Tech) SP2020-12 WIT2020-13 |
In recent years, research has been conducted to detect people with mild dementia from the dialogue speech of elderly people. Bu... |
SP2020-12 WIT2020-13 pp.21-26 |
PRMU, SP |
2018-06-29 11:30 |
Nagano |
|
Mapping Acoustic Vector Sequence to Document Vector Based on RNN Ryota Nishimura, Miho Higaki, Norihide Kitaoka (Tokushima Univ.) PRMU2018-32 SP2018-12 |
In this research, we propose a method of searching between different media (cross-media mapping) using deep learning (Ma... |
PRMU2018-32 SP2018-12 pp.59-64 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2017-12-21 12:50 |
Tokyo |
Waseda Univ. Green Computing Systems Research Organization |
[Poster Presentation] Selecting Response from Conversational Spoken Dialogue System Based on Distributed Representation of User Utterances Kengo Ohta (NIT, Anan College), Ryota Nishimura, Norihide Kitaoka (Tokushima Univ.) SP2017-55 |
SP2017-55 pp.1-5 |
WIT, SP |
2017-10-19 14:20 |
Fukuoka |
Tobata Library of Kyutech (Kitakyushu) |
User adaptation of examples for example-based reminiscence therapy spoken dialog system using word embedding Eichi Seto, Ryota Nishimura, Norihide Kitaoka (Tokushima Univ.) SP2017-38 WIT2017-34 |
We are developing a spoken dialog system for reminiscence therapy. We propose an example-based dialog system featuring a... |
SP2017-38 WIT2017-34 pp.23-28 |
SP |
2016-08-24 13:00 |
Kyoto |
ACCMS, Kyoto Univ. |
Adaptation Methods for Daily Activity Recognition Based on Deep Neural Network Tomoki Hayashi (Nagoya Univ.), Norihide Kitaoka (Tokushima Univ.), Tomoki Toda, Kazuya Takeda (Nagoya Univ.) SP2016-27 |
Our objective is to build a monitoring system which enables elderly people to live actively, and the key technology to a... |
SP2016-27 pp.1-6 |
SP |
2015-10-16 11:15 |
Hyogo |
Kobe Univ. |
Multi-modal speech recognition using deep bottleneck features Satoshi Tamura (Gifu Univ), Hiroshi Ninomiya (Nagoya Univ), Norihide Kitaoka (Tokushima Univ), Shin Osuga (Aisin Seiki), Yurie Iribe (Aichi Prefectural Univ), Kazuya Takeda (Nagoya Univ), Satoru Hayamizu (Gifu Univ) SP2015-69 |
In this paper, we propose a novel multi-modal speech recognition method which uses speech and lip images, employing Deep... |
SP2015-69 pp.57-62 |