Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380
Acoustic data-driven pronunciation lexicon for non-native speech recognition
Satoshi Tsujioka (NAIST), Liang Lu (University of Edinburgh), Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura (NAIST)
pp. 1 - 6
A Spoken term detection method matching at a frame level
Ryota Konno, Kazunori Kojima (IPU), Shi-wook Lee (AIST), Kazuyo Tanaka (Univ. of Tsukuba), Yoshiaki Itoh (IPU)
pp. 7 - 12
A study on discriminative approach for estimation of the divergence between distributions and its application to language identification
Yosuke Kashiwagi, Congying Zhang, Daisuke Saito, Nobuaki Minematsu (Tokyo Univ.)
pp. 13 - 18
Sequence Discriminative Training for Low-Rank Deep Neural Networks
Yuuki Tachioka (Mitsubishi Electric), Shinji Watanabe, Jonathan Le Roux, John Hershey (MERL)
pp. 19 - 24
A Feature-Space Adaptation Technique using Regression Tree-based Multiple Transformation Matrices
Hiroki Kanagawa, Yuuki Tachioka (Mitsubishi Electric Corp.), Shinji Watanabe (MERL), Jun Ishii (Mitsubishi Electric Corp.)
pp. 25 - 30
Experimental evaluation of network size effect in speaker adaptive trained DNNs embedding linear transformation networks
Tsubasa Ochiai (Doshisha Univ./NICT), Shigeki Matsuda (Doshisha Univ.), Hideyuki Watanabe, Xugang Lu, Hisashi Kawai (NICT), Shigeru Katagiri (Doshisha Univ.)
pp. 31 - 36
Speaker Adaptation Technique for Speech Recognition using a Feature Augmentation Framework
Hiroshi Fujimura, Takashi Masuko (TOSHIBA)
pp. 37 - 42
Spoken Language Identification based on Language Modeling of Tandem-MLP Features
Ryo Masumura, Taichi Asami, Hirokazu Masataki, Sumitaka Sakauchi (NTT)
pp. 43 - 48
Multiple Feed-forward Deep Neural Networks for Statistical Parametric Speech Synthesis
Shinji Takaki (NII), SangJin Kim (Naver Labs), Junichi Yamagishi (NII), JongJin Kim (Naver Labs)
pp. 49 - 54
[Invited Talk]
Image feature extraction and transfer learning using deep convolutional neural networks
Hideki Nakayama (Univ. of Tokyo)
pp. 55 - 59
[Invited Talk]
Aspects of feature extraction in DNN acoustic models
Takuya Yoshioka, Marc Delcroix, Masakiyo Fujimoto, Tomohiro Nakatani (NTT)
pp. 61 - 65
A study on effectiveness of pop noise for speaker verification
Shiori Nakano, Ryosuke Nakanishi, Sayaka Shiota, Hitoshi Kiya (Tokyo Metro. Univ.)
pp. 67 - 72
Voice liveness detection based on frequency characteristics for speaker verification
Sayaka Shiota (Tokyo Metro. Univ.), Fernando Villavicencio, Junichi Yamagishi, Nobutaka Ono, Isao Echizen (NII), Tomoko Matsui (ISM)
pp. 73 - 78
Investigation of privacy-preserving sounds to degrade automatic speaker verification performance
Kei Hashimoto (NITECH), Junichi Yamagishi, Isao Echizen (NII)
pp. 79 - 84
Note: Each article is a technical report without peer review, and its polished version will be published elsewhere.