Streaming End-to-End speech recognition using a CTC decoder with substituted linguistic information

Takagi, Tatsunari; Ogawa, Atsunori; Kitaoka, Norihide; Wakabayashi, Yukoh

IEICE Technical Committee Submission System
Conference Paper's Information

Paper Abstract and Keywords
Presentation		2023-06-23 13:50 Streaming End-to-End speech recognition using a CTC decoder with substituted linguistic information Tatsunari Takagi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka, Yukoh Wakabayashi (TUT) SP2023-12
Abstract	(in English)	Speech recognition technology has been employed in various fields due to the enhancement of speech recognition model accuracy. However, when the domain of the data used for training differs from that of the data to be recognized, recognition accuracy declines. To address this issue, several approaches utilizing language models trained on extensive text data have been proposed. Recently, the Density Ratio Approach (DRA), an extension of Shallow Fusion, has been introduced as a method for integrating speech recognition models with language models. Nevertheless, the application of DRA to streaming speech recognition models using a CTC decoder for Japanese speech has not been investigated. In this study, we conducted domain adaptation using DRA in streaming speech recognition with a CTC decoder. To facilitate streaming processing, the decoder successively replaces linguistic information on a frame-by-frame basis, obtaining recognition results through greedy search. Furthermore, we selected the linguistic information to be replaced, considering the assumption of conditional independence of CTC. Experimental results demonstrate that the proposed method enhances recognition accuracy.
Keyword	(in English)	End-to-End Speech Recognition / Streaming Speech Recognition / Language Model / CTC / DRA
Reference Info.		IEICE Tech. Rep., vol. 123, no. 88, SP2023-12, pp. 60-64, June 2023.
Paper #		SP2023-12
Date of Issue		2023-06-16 (SP)
ISSN		Online edition: ISSN 2432-6380
Copyright and reproduction		All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034)
Download PDF		SP2023-12
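
The abstract above describes integrating a CTC-based recognizer with language models through the Density Ratio Approach (DRA) and decoding frame by frame with greedy search. As a rough illustration only, and not the authors' implementation, the Python sketch below assumes the commonly cited density-ratio scoring form (acoustic log-probability, plus a weighted target-domain LM log-probability, minus a weighted source-domain LM log-probability). The function names, the LM callable interface, the blank index, the weights, and the toy unigram LMs are all hypothetical. Leaving blank and repeated symbols unscored by the LMs is a simplification chosen here, not something the abstract states.

```python
import math
from typing import Callable, Dict, List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol

# Hypothetical LM interface: (prefix tokens, next token) -> log-probability.
LanguageModel = Callable[[Sequence[int], int], float]


def density_ratio_score(log_p_ctc: float, log_p_tgt: float, log_p_src: float,
                        lam_tgt: float, lam_src: float) -> float:
    """Add the target-domain LM score and subtract the source-domain LM score."""
    return log_p_ctc + lam_tgt * log_p_tgt - lam_src * log_p_src


def greedy_decode_with_dra(frames: Sequence[Sequence[float]],
                           tgt_lm: LanguageModel, src_lm: LanguageModel,
                           lam_tgt: float = 0.5, lam_src: float = 0.5) -> List[int]:
    """Pick the best symbol per frame, then apply the CTC collapse rule on the fly."""
    prefix: List[int] = []  # tokens emitted so far
    prev = BLANK            # symbol chosen at the previous frame
    for log_probs in frames:
        best_tok, best_score = BLANK, -math.inf
        for tok, log_p in enumerate(log_probs):
            if tok == BLANK or tok == prev:
                # Blank and repeats do not extend the output, so no LM term (assumption).
                score = log_p
            else:
                score = density_ratio_score(
                    log_p, tgt_lm(prefix, tok), src_lm(prefix, tok), lam_tgt, lam_src)
            if score > best_score:
                best_tok, best_score = tok, score
        if best_tok != BLANK and best_tok != prev:
            prefix.append(best_tok)
        prev = best_tok
    return prefix


def unigram_lm(probs: Dict[int, float]) -> LanguageModel:
    """Toy context-independent LM used only to exercise the decoder."""
    return lambda prefix, tok: math.log(probs[tok])


# Two frames over the vocabulary {blank, 1, 2}; values are CTC posteriors.
frames = [[math.log(p) for p in f] for f in ([0.1, 0.5, 0.4], [0.8, 0.1, 0.1])]
src_lm = unigram_lm({1: 0.9, 2: 0.1})  # source domain favours token 1
tgt_lm = unigram_lm({1: 0.1, 2: 0.9})  # target domain favours token 2

print(greedy_decode_with_dra(frames, tgt_lm, src_lm, 0.0, 0.0))  # [1]
print(greedy_decode_with_dra(frames, tgt_lm, src_lm, 1.0, 1.0))  # [2]
```

With both weights at zero the decoder follows the CTC posteriors alone; with non-zero weights the target-domain LM can overturn a close acoustic decision, which is the kind of domain adaptation the abstract refers to.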

Conference Information
Committee	SP IPSJ-MUS IPSJ-SLP
Conference Date	2023-06-23 - 2023-06-24
Paper Information
Registration To	SP
Conference Code	2023-06-SP-MUS-SLP
Language	Japanese
Title (in English)	Streaming End-to-End speech recognition using a CTC decoder with substituted linguistic information
Keyword(1)	End-to-End Speech Recognition
Keyword(2)	Streaming Speech Recognition
Keyword(3)	Language Model
Keyword(4)	CTC
Keyword(5)	DRA
1st Author's Name	Tatsunari Takagi
1st Author's Affiliation	Toyohashi University of Technology (TUT)
2nd Author's Name	Atsunori Ogawa
2nd Author's Affiliation	NIPPON TELEGRAPH AND TELEPHONE CORPORATION (NTT)
3rd Author's Name	Norihide Kitaoka
3rd Author's Affiliation	Toyohashi University of Technology (TUT)
4th Author's Name	Yukoh Wakabayashi
4th Author's Affiliation	Toyohashi University of Technology (TUT)
Speaker	Author-1
Date Time	2023-06-23 13:50:00
Presentation Time	140 minutes
Registration for	SP
Paper #	SP2023-12
Volume (vol)	vol.123
Number (no)	no.88
Page	pp.60-64
#Pages	5
Date of Issue	2023-06-16 (SP)

The Institute of Electronics, Information and Communication Engineers (IEICE), Japan