Committee | Date Time | Place
Paper Title / Authors
Abstract
Paper #
SP, IPSJ-SLP | 2012-12-20 10:20 | TITECH (Ookayama), Tokyo
Automatic Vocabulary Adaptation for Speech Recognition based on Semantic Similarity and Confidence Measure
Shoko Yamahata, Yoshikazu Yamaguchi, Atsunori Ogawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi (NTT)
Abstract: Out-of-vocabulary utterances are an unavoidable problem in speech recognition systems, and therefore automatic vocabula... [more]
SP2012-85, pp. 1-6
SP, IPSJ-SLP | 2012-12-20 13:29 | TITECH (Ookayama), Tokyo
[Invited Talk] Towards Integrated Processing of Speech and Image Information
Yasuo Ariki (Kobe University)
Abstract: In this paper, multimodal processing done by the author, using and integrating vision and speech, is described, as well as... [more]
SP2012-86, pp. 27-32
SP, IPSJ-SLP | 2012-12-20 14:45 | TITECH (Ookayama), Tokyo
[Invited Talk] Making A Technology Seem Natural
Eric Chang (Microsoft Research Asia)
Abstract: Reading science fiction over the past one hundred years, one sees many seemingly impossible machines and services which... [more]
SP2012-87, pp. 33-34
SP, IPSJ-SLP | 2012-12-20 16:25 | TITECH (Ookayama), Tokyo
Recent efforts for high-performance multi-modal speech recognition
Satoshi Tamura, Peng Shen, Hiroya Okuda, Naoya Ukai, Takuya Kawasaki, Takumi Seko, Satoru Hayamizu (Gifu Univ.)
Abstract: Regarding multi-modal automatic speech recognition (MMASR), which uses acoustic and lip/mouth information, this paper des... [more]
SP2012-88, pp. 41-46
SP, IPSJ-SLP | 2012-12-21 09:00 | TITECH (Ookayama), Tokyo
Normalization of EMA data -- tongue movement during articulation of consonant clusters --
Seiya Funatsu (Prefectural Univ. of Hiroshima), Masako Fujimoto (NINJAL)
Abstract: Tongue tip movement during articulation of non-native consonant clusters was investigated using an electromagnetic artic... [more]
SP2012-89, pp. 59-64
SP, IPSJ-SLP | 2012-12-21 09:25 | TITECH (Ookayama), Tokyo
Fundamental frequency estimation combining air conducted speech with bone conducted speech
Kosuke Osa, Tetsuya Shimamura (Saitama Univ.)
SP2012-90, pp. 65-70
SP, IPSJ-SLP | 2012-12-21 09:50 | TITECH (Ookayama), Tokyo
Relative amplitude between consonant and vowel of bone conducted speech
Tatsuya Kato, Tetsuya Shimamura (Saitama Univ.)
Abstract: In highly noisy environments, bone conducted (BC) speech is utilized as a tool for speech communication. This is becaus... [more]
SP2012-91, pp. 71-74
SP, IPSJ-SLP | 2012-12-21 10:15 | TITECH (Ookayama), Tokyo
Interpolation of unlearned position based on local regression for single-channel talker localization using acoustic transfer function
Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.)
Abstract: This paper presents a sound source (talker) localization method using only a single microphone. In our previous work, we... [more]
SP2012-92, pp. 75-80
SP, IPSJ-SLP | 2012-12-21 14:40 | TITECH (Ookayama), Tokyo
Reduction of cross spectrum for feature-domain sound source separation
Atsushi Ando (Nagoya Univ.), Kenta Niwa (NTT), Norihide Kitaoka, Kazuya Takeda (Nagoya Univ.)
Abstract: Speech source separation is utilized for recognition of simultaneous speech. Conventional source separation methods, esp... [more]
SP2012-93, pp. 107-112
SP, IPSJ-SLP | 2012-12-21 15:25 | TITECH (Ookayama), Tokyo
Syllable nucleus detection using waveform envelopes and modeling of the word acquisition process using word structures and syllable nuclei
Yousuke Ozaki, Nobuaki Minematsu, Keikichi Hirose (The Univ. of Tokyo), Donna Erickson (Showa Univ. of Music)
Abstract: Simulation of language acquisition processes is an active research area in speech and computer science. Here, models and... [more]
SP2012-94, pp. 113-118
SP, IPSJ-SLP | 2012-12-21 15:25 | TITECH (Ookayama), Tokyo
Sparse Coding-Based Voice Conversion from Lip Information
Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.)
Abstract: A technology to recognize speech content from lip motion is called visual speech recognition (VSR). VSR is an important c... [more]
SP2012-95, pp. 119-124
SP, IPSJ-SLP | 2012-12-21 15:25 | TITECH (Ookayama), Tokyo
Two-step Correction of the Speech Recognition Result based on Syntax and Semantics
Ryohei Nakatani, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.)
Abstract: This paper presents a new method for correcting speech recognition errors based on long-distance context. As in the past, ... [more]
SP2012-96, pp. 149-154
SP, IPSJ-SLP | 2012-12-21 15:25 | TITECH (Ookayama), Tokyo
Automatic Speech Translation System Selecting Target Language by Direction of Arrival Information
Masanori Tsujikawa, Koji Okabe, Ken Hanazawa (NEC)
Abstract: An automatic speech translation system selecting target language by direction of arrival information is proposed. The pr... [more]
SP2012-97, pp. 161-165