Thu, Dec 20 AM 10:20 - 12:00 |
(1) SP |
10:20-10:45 |
Automatic Vocabulary Adaptation for Speech Recognition based on Semantic Similarity and Confidence Measure |
Shoko Yamahata, Yoshikazu Yamaguchi, Atsunori Ogawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi (NTT) |
(2) |
10:45-11:10 |
|
(3) |
11:10-11:35 |
|
(4) |
11:35-12:00 |
|
Thu, Dec 20 PM 13:29 - 14:29 |
(5) SP |
13:29-14:29 |
[Invited Talk] Towards Integrated Processing of Speech and Image Information |
Yasuo Ariki (Kobe University) |
|
14:29-14:44 |
Break ( 15 min. ) |
Thu, Dec 20 PM 14:45 - 15:45 |
(6) SP |
14:45-15:45 |
[Invited Talk] Making A Technology Seem Natural |
Eric Chang (Microsoft Research Asia) |
|
15:45-16:00 |
Break ( 15 min. ) |
Thu, Dec 20 PM 16:00 - 17:40 |
(7) |
16:00-16:25 |
|
(8) SP |
16:25-16:50 |
Recent efforts for high-performance multi-modal speech recognition |
Satoshi Tamura, Peng Shen, Hiroya Okuda, Naoya Ukai, Takuya Kawasaki, Takumi Seko, Satoru Hayamizu (Gifu Univ.) |
(9) |
16:50-17:15 |
|
(10) |
17:15-17:40 |
|
Fri, Dec 21 AM 09:00 - 10:40 |
(11) SP |
09:00-09:25 |
Normalization of EMA data -- tongue movement during articulation of consonant clusters -- |
Seiya Funatsu (Prefectural Univ. of Hiroshima), Masako Fujimoto (NINJAL) |
(12) SP |
09:25-09:50 |
Fundamental frequency estimation combining air conducted speech with bone conducted speech |
Kosuke Osa, Tetsuya Shimamura (Saitama Univ.) |
(13) SP |
09:50-10:15 |
Relative amplitude between consonant and vowel of Bone Conducted speech |
Tatsuya Kato, Tetsuya Shimamura (Saitama Univ.) |
(14) SP |
10:15-10:40 |
Interpolation of unlearned position based on local regression for single-channel talker localization using acoustic transfer function |
Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) |
|
10:40-10:55 |
Break ( 15 min. ) |
Fri, Dec 21 AM 10:55 - 11:45 |
(15) |
10:55-11:20 |
|
(16) |
11:20-11:45 |
|
Fri, Dec 21 PM 13:00 - 14:00 |
(17) |
13:00-14:00 |
|
|
- |
Break |
Fri, Dec 21 PM 14:15 - 15:05 |
(18) |
14:15-14:40 |
|
(19) SP |
14:40-15:05 |
Reduction of cross spectrum for feature-domain sound source separation |
Atsushi Ando (Nagoya Univ.), Kenta Niwa (NTT), Norihide Kitaoka, Kazuya Takeda (Nagoya Univ.) |
Fri, Dec 21 PM 15:05 - 15:10 |
|
- |
|
|
15:10-15:25 |
Break ( 15 min. ) |
Fri, Dec 21 PM 15:25 - 16:55 |
(20) SP |
15:25-16:55 |
Syllable nucleus detection using waveform envelopes and modeling of the word acquisition process using word structures and syllable nuclei |
Yousuke Ozaki, Nobuaki Minematsu, Keikichi Hirose (The Univ. of Tokyo), Donna Erickson (Showa Univ. of Music) |
(21) SP |
15:25-16:55 |
Sparse Coding-Based Voice Conversion from Lip Information |
Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) |
(22) |
15:25-16:55 |
|
(23) |
15:25-16:55 |
|
(24) |
15:25-16:55 |
|
(25) |
15:25-16:55 |
|
(26) SP |
15:25-16:55 |
Two-step Correction of the Speech Recognition Result based on Syntax and Semantics |
Ryohei Nakatani, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) |
(27) |
15:25-16:55 |
|
(28) SP |
15:25-16:55 |
Automatic Speech Translation System Selecting Target Language by Direction of Arrival Information |
Masanori Tsujikawa, Koji Okabe, Ken Hanazawa (NEC) |
(29) |
15:25-16:55 |
|
(30) |
15:25-16:55 |
|