Paper Abstract and Keywords |
Presentation |
2019-03-15 13:30
[Poster Presentation]
Data augmentation using multiple databases for end-to-end dysarthric speech recognition
Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.)
EA2018-156 SIP2018-162 SP2018-118 |
Abstract |
(in English) |
This paper presents an end-to-end speech recognition system for a Japanese person with an articulation disorder resulting from athetoid cerebral palsy. The movements of such speakers are limited by their athetoid symptoms, and their utterances are often unstable or unclear, which makes it difficult for them to communicate. As a result, the performance of automatic speech recognition (ASR) systems degrades significantly for people with an articulation disorder. Deep learning approaches to speech recognition have recently seen much progress. These techniques require a large amount of training data; however, the amount of data available from people with an articulation disorder is limited because of their athetoid symptoms. This paper proposes a data augmentation method that uses not only the speech data of a Japanese person with an articulation disorder but also the speech data of a physically unimpaired Japanese person and of a non-Japanese person with an articulation disorder. We employ an end-to-end ASR model based on the listen, attend and spell (LAS) model, which has an acoustic module and a language module. In the proposed model, the acoustic module is shared between people with dysarthria, and a language module is assigned to each language regardless of dysarthria. The effectiveness of this approach was confirmed through word-recognition experiments, in which the proposed method outperformed a method based on the conventional LAS model. |
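The module-sharing scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names (`Module`, `Utterance`, `SharedModuleRouter`), the language codes `"ja"` and `"non-ja"`, and the separate acoustic module for the unimpaired speaker are all assumptions made for illustration, and the stand-in `Module` only counts updates instead of training a network. The sketch shows only which utterances would contribute to which module.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Stand-in for a trainable network; it only counts training updates."""
    name: str
    updates: int = 0


@dataclass(frozen=True)
class Utterance:
    speaker: str
    language: str     # hypothetical codes: "ja" or "non-ja"
    dysarthric: bool


class SharedModuleRouter:
    """Routes each utterance to one acoustic module and one language module.

    Assumption: acoustic modules are keyed by dysarthric/unimpaired, so all
    dysarthric speakers share one; language modules are keyed by language.
    """

    def __init__(self):
        self.acoustic = {}
        self.language = {}

    def modules_for(self, utt):
        acoustic_key = "dysarthric" if utt.dysarthric else "unimpaired"
        if acoustic_key not in self.acoustic:
            self.acoustic[acoustic_key] = Module("acoustic/" + acoustic_key)
        if utt.language not in self.language:
            self.language[utt.language] = Module("language/" + utt.language)
        return self.acoustic[acoustic_key], self.language[utt.language]

    def train_step(self, utt):
        # A real system would backpropagate a loss through both modules here.
        acoustic, language = self.modules_for(utt)
        acoustic.updates += 1
        language.updates += 1


# One utterance from each of the three databases named in the abstract.
router = SharedModuleRouter()
for utt in (
    Utterance("target speaker", "ja", True),
    Utterance("unimpaired speaker", "ja", False),
    Utterance("non-Japanese speaker", "non-ja", True),
):
    router.train_step(utt)
```

Under these assumptions, the target speaker's acoustic module also receives updates from the non-Japanese dysarthric data, and the Japanese language module also receives updates from the unimpaired Japanese data, which is how the additional databases would augment the limited target data.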
Keyword |
(in English) |
Speech recognition / Multilingual / Assistive technology / End-to-end model / Dysarthria |
Reference Info. |
IEICE Tech. Rep., vol. 118, no. 497, SP2018-118, pp. 335-340, March 2019. |
Paper # |
SP2018-118 |
Date of Issue |
2019-03-07 (EA, SIP, SP) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
EA SIP SP |
Conference Date |
2019-03-14 - 2019-03-15 |
Place (in English) |
i+Land nagasaki (Nagasaki-shi) |
Topics (in English) |
Engineering/Electro Acoustics, Signal Processing, Speech, and Related Topics |
Paper Information |
Registration To |
SP |
Conference Code |
2019-03-EA-SIP-SP |
Language |
Japanese |
Title (in English) |
Data augmentation using multiple databases for end-to-end dysarthric speech recognition |
Keyword(1) |
Speech recognition |
Keyword(2) |
Multilingual |
Keyword(3) |
Assistive technology |
Keyword(4) |
End-to-end model |
Keyword(5) |
Dysarthria |
1st Author's Name |
Yuki Takashima |
1st Author's Affiliation |
Kobe University (Kobe Univ.) |
2nd Author's Name |
Tetsuya Takiguchi |
2nd Author's Affiliation |
Kobe University (Kobe Univ.) |
3rd Author's Name |
Yasuo Ariki |
3rd Author's Affiliation |
Kobe University (Kobe Univ.) |
Speaker |
Author-1 |
Date Time |
2019-03-15 13:30:00 |
Presentation Time |
90 minutes |
Registration for |
SP |
Paper # |
EA2018-156, SIP2018-162, SP2018-118 |
Volume (vol) |
vol.118 |
Number (no) |
no.495(EA), no.496(SIP), no.497(SP) |
Page |
pp.335-340 |
#Pages |
6 |