Paper Abstract and Keywords |
Presentation |
2007-05-31 11:30
A study on multimodal speech recognition for spoken dialogue systems
Shunsuke Takayama, Toshihide Matsuo, Koji Iwano, Sadaoki Furui (Tokyo Tech)
SP2007-4 |
Abstract |
(in English) |
This paper describes speaker-independent multimodal speech recognition as a step toward building multimodal spoken dialogue systems. To build the recognition system, we first collected an audio-visual speech database from 25 male speakers. The system uses a multi-stream HMM to integrate audio and visual information. We propose a multi-stream HMM construction method in which audio-only and visual-only models are trained separately and then integrated at the state level. In this framework, the target audio-visual model inherits its state-tying structure from the audio-only triphone HMM. Experimental results show that the proposed method is effective under various noise conditions. We also compared two visual features within this framework: optical-flow-based features and PCA (principal component analysis)-based features. The optical-flow-based features yielded better performance than the PCA-based features. |
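The state-level integration mentioned in the abstract can be illustrated with the formulation commonly used for multi-stream HMMs, in which a state's log-likelihood is a weighted sum of the per-stream log-likelihoods. The sketch below is only an illustration under stated assumptions: the abstract does not give the stream weights, the feature dimensions, or the output distributions, so the single diagonal-covariance Gaussian per stream, the weight constraint (audio weight plus visual weight equals 1), and all names such as `multistream_state_log_likelihood` are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def multistream_state_log_likelihood(audio_obs, visual_obs, state, audio_weight):
    """Combine audio and visual stream log-likelihoods for one HMM state.

    `state` is a hypothetical dict holding (mean, var) pairs under the keys
    "audio" and "visual". The visual weight is assumed to be 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    visual_weight = 1.0 - audio_weight
    log_b_audio = gaussian_log_pdf(audio_obs, *state["audio"])
    log_b_visual = gaussian_log_pdf(visual_obs, *state["visual"])
    # Weighted sum in the log domain (a weighted product of stream likelihoods).
    return audio_weight * log_b_audio + visual_weight * log_b_visual


# Toy state: 2-dimensional audio stream, 1-dimensional visual stream.
toy_state = {
    "audio": ([0.0, 0.0], [1.0, 1.0]),
    "visual": ([0.0], [1.0]),
}
score = multistream_state_log_likelihood([0.0, 0.0], [0.0], toy_state, 0.7)
```

Under this formulation, lowering `audio_weight` shifts the decision toward the visual stream, which is the usual motivation for stream weighting when the audio is noisy.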
Keyword |
(in English) |
multimodal speech recognition / dialogue speech / speaker independent / multi-stream HMM / model construction method |
Reference Info. |
IEICE Tech. Rep., vol. 107, no. 77, SP2007-4, pp. 19-24, May 2007. |
Paper # |
SP2007-4 |
Date of Issue |
2007-05-24 (SP) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
SP |
Conference Date |
2007-05-31 - 2007-05-31 |
Place (in English) |
ATR |
Topics (in English) |
etc. |
Paper Information |
Registration To |
SP |
Conference Code |
2007-05-SP |
Language |
Japanese |
Title (in English) |
A study on multimodal speech recognition for spoken dialogue systems |
Keyword(1) |
multimodal speech recognition |
Keyword(2) |
dialogue speech |
Keyword(3) |
speaker independent |
Keyword(4) |
multi-stream HMM |
Keyword(5) |
model construction method |
1st Author's Name |
Shunsuke Takayama |
1st Author's Affiliation |
Tokyo Institute of Technology (Tokyo Tech) |
2nd Author's Name |
Toshihide Matsuo |
2nd Author's Affiliation |
Tokyo Institute of Technology (Tokyo Tech) |
3rd Author's Name |
Koji Iwano |
3rd Author's Affiliation |
Tokyo Institute of Technology (Tokyo Tech) |
4th Author's Name |
Sadaoki Furui |
4th Author's Affiliation |
Tokyo Institute of Technology (Tokyo Tech) |
Speaker |
Author-3 |
Date Time |
2007-05-31 11:30:00 |
Presentation Time |
30 minutes |
Registration for |
SP |
Paper # |
SP2007-4 |
Volume (vol) |
vol.107 |
Number (no) |
no.77 |
Page |
pp.19-24 |
#Pages |
6 |
Date of Issue |
2007-05-24 (SP) |
|