Paper Abstract and Keywords |
Presentation |
2010-12-21 11:05
Evaluation of Successive Rapid Hypothesis Determination Algorithm for Continuous Word Recognition
Hiroyuki Ohno (Nagoya Inst. of Tech.), Hiroshi Kojima (Nagoya Inst. of Tech./Hitachi Solutions, Ltd.), Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda (Nagoya Inst. of Tech.)
NLC2010-21 SP2010-94 |
Abstract |
(in English) |
Minimizing the response delay of a speech recognition system and giving rapid feedback are important for an intuitive, easy-to-use speech interface. Many studies have been conducted to reduce the response delay, for example by producing progressive output during the recognition process "after" the words are half-determined in the context. To achieve faster input responses, we have proposed an algorithm that determines the most likely hypothesis "before" the utterance ends. The method was previously examined for isolated word recognition, and this paper extends it to continuous word recognition. Experimental evaluations were performed on tasks with various vocabulary sizes. On a small-vocabulary task with 14 words, the proposed algorithm determined each word about 0.053 seconds before the actual end of speech on average, without any degradation of recognition accuracy. On a station-name recognition task with a vocabulary of 8738 words, the proposed algorithm determined each word about 0.48 seconds after the actual end of speech on average. Comparison results for various acoustic models are also reported. |
Keyword |
(in English) |
Speech recognition / Search algorithm / Progressive output / Tree lexicon / Confidence measure |
Reference Info. |
IEICE Tech. Rep., vol. 110, no. 357, SP2010-94, pp. 77-82, Dec. 2010. |
Paper # |
SP2010-94 |
Date of Issue |
2010-12-13 (NLC, SP) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
|