IEICE Technical Report

Print edition: ISSN 0913-5685      Online edition: ISSN 2432-6380

Volume 114, Number 52

Speech

Workshop Date : 2014-05-24 - 2014-05-25 / Issue Date : 2014-05-17


Table of contents

SP2014-1
"Ongaku" Symposium 2014: The 2nd Symposium on Any Topics Related to Acoustics, Audition and Natural Language
Hirokazu Kameoka (Univ. of Tokyo/NTT), Eriko Aiba (UEC), Yasunori Ohishi (NTT), Tetsuro Kitahara (Nihon Univ.), Tatsuya Kitamura (Konan Univ.), Shoei Sato (NHK), Masahito Togami (Hitachi), Tomoki Toda (NAIST), Kazuyoshi Yoshii (Kyoto Univ.)
pp. 1 - 3

SP2014-2
[Invited Talk] Speaker adaptation technologies for speech synthesis and its application to assistive technology
Junichi Yamagishi (NII)
pp. 5 - 6

SP2014-3
[Invited Talk] Infinite data analysis and Bayesian nonparametrics for audio signal processing
Masahiro Nakano (NTT)
pp. 13 - 18

SP2014-4
[Invited Talk] From multimodal spatial hearing to engineering applications to cope with severe disasters -- Our recent research results on spatial acoustic information sciences --
Yo-iti Suzuki, Shuichi Sakamoto (Tohoku Univ.)
pp. 19 - 20

SP2014-5
[Invited Talk] Behavioral neurosciences of vocal control and learning -- using the songbird as a model system --
Ryosuke O. Tachibana (Univ. of Tokyo)
pp. 29 - 34

SP2014-6
[Invited Talk] Machine Translation -- Why couldn't we do it? Why are we starting to be able to now? --
Graham Neubig (NAIST)
pp. 35 - 38

SP2014-7
[Invited Talk] Applications and Advances of Deep Learning for Automatic Speech Recognition
Yotaro Kubo (Amazon)
pp. 39 - 44

SP2014-8
[Invited Talk] R&D of Music Information Retrieval Technology and Issues for its Deployment to Practical Applications
Keiichiro Hoashi (KDDI Labs)
p. 45

SP2014-9
[Invited Talk] What Higher-Order Statistics Tell Us? -- Acoustic Signal Processing Based on Unsupervised Learning --
Hiroshi Saruwatari (Univ. of Tokyo)
pp. 47 - 52

SP2014-10
A Consideration of Evaluation Measurements in Spoken Term Detection
Satoshi Oshima, Yoshiaki Itoh (Iwate Prefectural Univ.)
pp. 117 - 121

SP2014-11
Robustness of Speaker Identification Using Pseudo Pitch Synchronized Phase Information
Yuta Kawakami, Longbiao Wang (Nagaoka Univ. of Tech.), Atsuhiko Kai (Shizuoka Univ.), Seiichi Nakagawa (Toyohashi Univ. of Tech.)
pp. 123 - 126

SP2014-12
Visualization of World Englishes pronunciations from a speaker's self-centered viewpoint using attributes of accent, gender, and age
Yuji Kawase, Nobuaki Minematsu, Daisuke Saito, Keikichi Hirose (UTokyo), Han-Ping Shen (NCKU)
pp. 127 - 132

SP2014-13
Native language recognition using machine learning
Ryota Sakagami, Kouki Takeshita, Longbiao Wang, Masahiro Iwahashi (Nagaoka Univ. of Tech.)
pp. 139 - 141

SP2014-14
Language recognition in reverberant environments
Kouki Takeshita, Ryota Sakagami, Longbiao Wang, Masahiro Iwahashi (Nagaoka Univ. of Tech.)
pp. 143 - 145

SP2014-15
Discriminative training of acoustic models for system combination
Yuuki Tachioka (Mitsubishi Electric), Shinji Watanabe, Jonathan Le Roux, John R. Hershey (MERL)
pp. 147 - 152

SP2014-16
Distant-talking Speech Recognition with Asynchronous Speech Recording
Shunta Teraoka, Yuma Ueda (Shizuoka Univ.), Longbiao Wang (Nagaoka Univ. of Tech.), Atsuhiko Kai, Taku Fukushima (Shizuoka Univ.)
pp. 153 - 157

SP2014-17
[Research Introduction] A spectrogram-patch-input DNN model for detection and classification of acoustic events robust to speech overlapping scenarios
Miquel Espi, Masakiyo Fujimoto, Yotaro Kubo, Tomohiro Nakatani (NTT)
pp. 171 - 176

SP2014-18
Development of environmental sound collection system using smart devices based on crowd-sourcing approach
Sunao Hara, Akinori Kasai, Masanobu Abe (Okayama Univ.), Noboru Sonehara (NII)
pp. 177 - 180

SP2014-19
ROCKON: Environmental sound collection and recognition system using smartphones
Minori Matsuyama, Takahiko Tsuda, Ryuichi Nisimura, Hideki Kawahara (Wakayama Univ.), Junnosuke Yamada (NTT), Toshio Irino (Wakayama Univ.)
pp. 181 - 186

SP2014-20
Underdetermined Blind Separation of Moving Sources Based on Probabilistic Modeling
Takuya Higuchi, Norihiro Takamune, Tomohiko Nakamura (Univ. of Tokyo), Hirokazu Kameoka (Univ. of Tokyo/NTT)
pp. 211 - 216

SP2014-21
Psychometric functions for across-frequency gap detection
Yousuke Kikuchi, Takako Mitsudo, Nobuyuki Hirose, Shuji Mori (Kyushu Univ.)
pp. 217 - 221

SP2014-22
Deriving the Salience Level of a Target Sound using a Tapping Technique Method
Shunsuke Kidani, Hsin-I Liao, Makoto Yoneya, Makio Kashino, Shigeto Furukawa (NTT)
pp. 223 - 226

SP2014-23
Perception of stop consonants at the beginning of binaurally fused words
Hitomi Kondo, Yousuke Kikuchi, Takako Mitsudo, Nobuyuki Hirose, Shuji Mori (Kyushu Univ.)
pp. 227 - 232

SP2014-24
Effect of interaural time difference for localization of spatially segregated sound
Daisuke Morikawa (JAIST)
pp. 233 - 235

SP2014-25
Acquisition and retention of perceptual cue for size judgment using whispered speech
Koudai Yamamoto, Toshio Irino, Ryuichi Nisimura, Hideki Kawahara (Wakayama Univ.)
pp. 237 - 242

SP2014-26
Analysis of the Relationship between Pitch and Formant Frequencies in Voice Register Transition
Yasufumi Uezu, Takahiro Furukawa, Tokihiko Kaburagi (Kyushu Univ.)
pp. 297 - 302

SP2014-27
Statistical bandwidth extension using sub-band basis spectrum model
Yamato Ohtani, Masatsune Tamura, Masahiro Morita, Masami Akamine (Toshiba)
pp. 303 - 308

SP2014-28
Text-to-speech prosody synthesis based on probabilistic model for F0 contour
Kento Kadowaki, Tatsuma Ishihara, Nobukatsu Hojo (Univ. of Tokyo), Hirokazu Kameoka (Univ. of Tokyo/NTT)
pp. 309 - 314

SP2014-29
Evaluation of singing voice similarity based on "acoustic singing-structure"
Shun Kojima, Takeshi Saitou, Masato Miyoshi (Kanazawa Univ.)
pp. 315 - 319

SP2014-30
Statistical approach to perceived age control of singing voice
Kazuhiro Kobayashi, Tomoki Toda (NAIST), Tomoyasu Nakano, Masataka Goto (AIST), Graham Neubig, Sakriani Sakti, Satoshi Nakamura (NAIST)
pp. 321 - 326

SP2014-31
A portable application for assistance of vocal sound training by overtone analysis
Iori Sugahara, Takayuki Itoh (Ochanomizu Univ.)
pp. 327 - 329

SP2014-32
An Evaluation of a Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Prediction
Kou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura (NAIST)
pp. 331 - 336

SP2014-33
Design of voice-enabled web test system for eliminating users' impatience
Chihiro Tafuji, Ryuichi Nisimura, Hideki Kawahara, Toshio Irino (Wakayama Univ.)
pp. 337 - 342

SP2014-34
A joint restricted Boltzmann machine for dictionary learning in sparse-representation-based voice conversion
Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.)
pp. 343 - 348

SP2014-35
Speech waveform generation on subband domain
Nobuyuki Nishizawa, Tsuneo Kato (KDDI R&D Labs)
pp. 349 - 354

SP2014-36
A Kana Protocol Recommendation Method for Switch Input Speech Synthesis Systems
Fuming Fang, Takahiro Shinozaki, Takao Kobayashi (Tokyo Tech)
pp. 355 - 360

SP2014-37
Current situations and issues of open-source high-quality speech synthesis system WORLD
Masanori Morise (Univ. of Yamanashi)
pp. 361 - 366

SP2014-38
The Acoustic Feature of the Loudspeaker which used the Reinforced Corrugated Fibreboard for the Enclosure Material
Takuto Isoyama, Yukio Mori (Salesian Polytechnic), Yoshiaki Kiyama
pp. 367 - 370

SP2014-39
Spot-forming method by using two shotgun microphones
Motoyuki Suzuki, Takeshi Honjo (Osaka Inst. of Tech.)
pp. 371 - 376

SP2014-40
Signal processing of ultrasound for osteoporosis diagnosis -- Modeling, time domain analysis, and frequency domain analysis --
Yoshiki Nagatani (KCCT), Ryosuke O. Tachibana (Univ. of Tokyo)
pp. 377 - 382

SP2014-41
Modulation transfer function based robust method of voice activity detection for noisy reverberant environments -- Utilization of subband SNR estimation --
Shota Morita, Masashi Unoki (JAIST), Xugang Lu (NICT), Masato Akagi (JAIST)
pp. 383 - 388

SP2014-42
Systematic study on kawaii products (The seventeenth report) -- Basic study for Kawaii sound --
Michiko Ohkura, Ryo Kanno (Shibaura Inst. Tech.)
pp. 389 - 392

SP2014-43
The basic mechanisms for perception of simultaneity, stream segregation, and temporal order for auditory stimuli
Satoshi Okazaki, Makoto Ichikawa (Chiba Univ.)
pp. 393 - 395

SP2014-44
[Research Introduction] Adaptive adjustment of local temporal structure in song of Bengalese finches
Ryosuke O. Tachibana, Neal A. Hessler, Kazuo Okanoya (Univ. of Tokyo)
pp. 407 - 410

SP2014-45
Modulation of the Temporal Dynamics of Microsaccades with the Presentation of Salient Sounds
Makoto Yoneya, Hsin-I Liao, Shunsuke Kidani, Shigeto Furukawa (NTT), Makio Kashino (NTT/Tokyo Tech)
pp. 411 - 414

Note: Each article is a technical report without peer review, and its polished version will be published elsewhere.


The Institute of Electronics, Information and Communication Engineers (IEICE), Japan