IEICE Technical Report

Print edition: ISSN 0913-5685      Online edition: ISSN 2432-6380

Volume 119, Number 440

Signal Processing

Workshop Date: 2020-03-02 - 2020-03-03 / Issue Date: 2020-02-24


Table of contents

SIP2019-103
Investigation of neural speech rate conversion with multi-speaker WaveNet vocoder
Takuma Okamoto (NICT), Keisuke Matsubara (Kobe Univ./NICT), Tomoki Toda (Nagoya Univ./NICT), Yoshinori Shiga, Hisashi Kawai (NICT)
pp. 1 - 6

SIP2019-104
(See Japanese page.)
pp. 7 - 12

SIP2019-105
Multichannel NMF with Joint-Diagonalizable Constraint Based on Generalized Gaussian Distribution for Blind Source Separation
Keigo Kamo, Yuki Kubo, Norihiro Takamune (UTokyo), Daichi Kitamura (NIT Kagawa), Hiroshi Saruwatari (UTokyo), Yu Takahashi, Kazunobu Kondo (Yamaha)
pp. 13 - 19

SIP2019-106
Dimension reduction without multiplication in machine learning
Nobutaka Ono (TMU)
pp. 21 - 26

SIP2019-107
[Invited Talk] Target speech extraction in speech mixtures with SpeakerBeam
Marc Delcroix (NTT), Katerina Zmolikova (BUT), Keisuke Kinoshita, Tsubasa Ochiai, Tomohiro Nakatani, Shoko Araki (NTT)
pp. 27 - 28

SIP2019-108
Vulnerability investigation of speaker verification against black-box adversarial attacks
Hiroto Kai, Sayaka Shiota, Hitoshi Kiya (TMU)
pp. 29 - 33

SIP2019-109
Learning of Classification Models using Emotion-specific Soft Labels for Speech Emotion Recognition
Mayuko Ozawa, Keisuke Imoto, Ryosuke Yamanishi, Yoichi Yamashita (Ritsumeikan Univ.)
pp. 35 - 40

SIP2019-110
Japanese dialect speech classification using sequence-to-one neural networks
Ryo Imaizumi (TMU), Ryo Masumura (NTT), Sayaka Shiota, Hitoshi Kiya (TMU)
pp. 41 - 46

SIP2019-111
[Poster Presentation] Neural Voice Activity Detection using Multiple Auxiliary Networks
Ryo Masumura, Kiyoaki Matsui, Yuma Koizumi, Takanobu Oba (NTT)
pp. 47 - 52

SIP2019-112
Data augmentation for ASR system by using locally time-reversed speech -- Temporal inversion of feature sequence --
Takanori Ashihara, Tomohiro Tanaka, Takafumi Moriya, Ryo Masumura, Yusuke Shinohara, Makio Kashino (NTT)
pp. 53 - 58

SIP2019-113
Adaptation to Meeting Speech and Mitigation of Wraparound Speech for End-to-end Speech Recognition
Kazua Ouchi, Atsuhiko Kai (Shizuoka Univ.)
pp. 59 - 64

SIP2019-114
The Effectiveness of Additional Context in DNN-based Spontaneous Speech Synthesis
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi (UTokyo), Yusuke Ijima, Ryo Masumura (NTT), Hiroshi Saruwatari (UTokyo)
pp. 65 - 70

SIP2019-115
Production and auditory features of speech sounds before and after vocal training using hearing impairment simulator, WHIS
Soichi Higashiyama, Hanako Yoshigi, Hideki Kawahara, Toshio Irino (Wakayama Univ.)
pp. 71 - 76

SIP2019-116
[Poster Presentation] A study on loudspeaker measurement with reflection cancellation for each frequency in a reflective environment
Akiyuki Moritani, Yutaka Kaneda (Tokyo Denki Univ.)
pp. 77 - 82

SIP2019-117
[Poster Presentation] Study on Error factors and improvement method in low frequency band on MUSIC sound source direction estimation
Kazuki Yuasa, Yutaka Kaneda (Tokyo Denki Univ.)
pp. 83 - 88

SIP2019-118
[Poster Presentation] Development of Distributed Wireless Synchronous Recording System for Event Detection
Taku Kitajima, Kan Okubo, Norio Tagawa (TMU)
pp. 89 - 93

SIP2019-119
[Poster Presentation] Evaluation of spatial impression of stereo sound source conversion methods for headphone reproduction
Yui Ueno, Mitsunori Mizumachi (Kyutech), Toshiharu Horiuchi (KDDI Research, Inc.)
pp. 95 - 100

SIP2019-120
[Poster Presentation] Feature Analysis of Accuracy and Direction of Sound Image Localization Using Narrow-band Signal
Michika Yamada, Fumikazu Saze (TMU), Toshiharu Horiuchi (KDDI Research), Kan Okubo (TMU)
pp. 101 - 106

SIP2019-121
[Poster Presentation] Selective synthesis of sound field using directivity control and stereo width control
Toshiharu Horiuchi, Sumaru Niida (KDDI Research)
pp. 107 - 110

SIP2019-122
[Poster Presentation] Sway Angle Estimation for Jib Cranes with three microphones
Naoki Horie, Masayoshi Nakamoto, Toru Yamamoto (Hiroshima Univ.)
pp. 111 - 116

SIP2019-123
[Poster Presentation] Shadow Detection and Removal with CNN using Generative Adversarial Networks
Takahiro Nagae, Ryo Abiko, Takuro Yamaguchi, Masaaki Ikehara (Keio Univ.)
pp. 117 - 122

SIP2019-124
[Poster Presentation] A Robust Approach to Jointly-Sparse Signal Recovery Based on Minimax Concave Loss Function
Kyohei Suzuki, Masahiro Yukawa (Keio Univ.)
pp. 123 - 128

SIP2019-125
[Poster Presentation] A Study on application of Nonlinear IIR Filter to Nonlinear Acoustic Echo Canceller
Kenta Iwai (Ritsumeikan Univ.), Yoshinobu Kajikawa (Kansai Univ.)
pp. 129 - 134

SIP2019-126
[Poster Presentation] High-precision modeling of distortion stomp box by deep learning using spectral features
Kento Yoshimoto, Daichi Kitahara, Akira Hirabayashi (Ritsumeikan Univ.)
pp. 135 - 140

SIP2019-127
[Poster Presentation] Sensor placement allowing independent setting of estimation and candidate regions for field estimation based on Gaussian process
Tomoya Nishida, Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari (Univ. Tokyo)
pp. 141 - 146

SIP2019-128
[Poster Presentation] Restoration of clipped signal using oversampling based on differentiable and convex loss function
Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari (Univ. Tokyo)
pp. 147 - 152

SIP2019-129
[Poster Presentation] Beam steering of portable parametric array loudspeaker using phased array technique with weighting function
Kyosuke Nakagawa, Yoshinobu Kajikawa (Kansai Univ.)
pp. 153 - 158

SIP2019-130
[Poster Presentation] Effective Sound Source Arrangement for Three Sound Source Localization Using Two Microphones
Yoshiki Kikuchi, Tomoyuki Ishiguro, Kenji Suyama (Tokyo Denki Univ.)
pp. 159 - 161

SIP2019-131
Iterative phase reconstruction to embed image into speech spectrogram
Arata Kawamura (Kyoto Sangyo Univ.)
pp. 163 - 168

SIP2019-132
A Pattern Recognition Method Using Secure Sparse Representations in L0 Norm Minimization
Takayuki Nakachi, Yitu Wang (NTT), Hitoshi Kiya (Tokyo Metro. Univ.)
pp. 169 - 174

SIP2019-133
Performance evaluation of distilling knowledge using encoder-decoder for CTC-based automatic speech recognition systems
Takafumi Moriya, Hiroshi Sato, Tomohiro Tanaka, Takanori Ashihara, Ryo Masumura, Yusuke Shinohara (NTT)
pp. 175 - 180

SIP2019-134
Dysarthric Speech Recognition Based on Deep Metric Learning
Yuki Takashima, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.)
pp. 181 - 186

SIP2019-135
[Fellow Memorial Lecture] Building Dictionary for Media Information
Kunio Kashino (NTT)
p. 187

SIP2019-136
[Poster Presentation] Implementation of a high-accuracy method for automatic fluency scoring of spontaneous English utterances by Japanese learners
Ayano Yasukagawa, Shintaro Ando, Eisuke Konno, Zhenchao Lin, Yusuke Inoue, Daisuke Saito, Nobuaki Minematsu (UTokyo), Kazuya Saito (UCL)
pp. 189 - 194

SIP2019-137
[Poster Presentation] Initial analysis of oral reading skills obtained from large scale subjective evaluation
Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo)
pp. 195 - 200

SIP2019-138
[Poster Presentation] Automatic estimation of prosodic control made in English utterances using DNN-based acoustic models trained with prosodic features and labels
Yang Shen, Shintarou Ando, Nobuaki Minematsu, Daisuke Saito (UTokyo), Satoshi Kobashikawa (NTT)
pp. 201 - 206

SIP2019-139
[Poster Presentation] An Educational Study on Prosodic Symbols and Their Acoustic Realization Using Japanese End-to-end Speech Synthesis
Fuki Yoshizawa (UTokyo), Tadashi Kumano (NHK), Nobuaki Minematsu (UTokyo), Kiyoshi Kurihara (NHK)
pp. 207 - 212

SIP2019-140
Evaluation of vocal personality and expression for speech synthesized by non-parallel voice conversion with narrative speech
Ryotaro Nagase, Keisuke Imoto, Ryosuke Yamanishi, Yoichi Yamashita (Ritsumeikan Univ.)
pp. 213 - 218

SIP2019-141
Cross-Lingual Voice Conversion using Cyclic Variational Auto-encoder
Hikaru Nakatani, Patrick Lumban Tobing, Kazuya Takeda, Tomoki Toda (Nagoya Univ.)
pp. 219 - 224

SIP2019-142
Semi-supervised Self-produced Speech Enhancement and Suppression Based on Joint Source Modeling of Air- and Body-conducted Signals Using Variational Autoencoder
Shogo Seki, Moe Takada, Kazuya Takeda, Tomoki Toda (Nagoya Univ.)
pp. 225 - 230

SIP2019-143
A Study for HMM-based embedded speech synthesis using a large-scale speech corpus
Nobuyuki Nishizawa, Tomohiro Obara, Hiromi Ishizaki (KDDI Research, Inc.)
pp. 231 - 236

SIP2019-144
Large-Context Pointer-Generator Networks for Spoken-to-Written Style Conversion
Mana Ihori, Akihiko Takashima, Ryo Masumura (NTT)
pp. 237 - 242

SIP2019-145
A study on step size control method for pre-estimating adaptive filter in feedback type active noise control systems
Kensaku Fujii (Kodaway Lab.), Mitsuji Muneyasu (Kansai Univ.), Yoshifumi Chisaki (CIT)
pp. 243 - 250

SIP2019-146
[Poster Presentation] Design of automatic soundscape generation based on image object detection
Yoshifumi Chisaki (CIT), Toshiharu Horiuchi (KDDI Research, Inc.)
pp. 251 - 254

SIP2019-147
[Poster Presentation] A study on reverberation time estimation based on regression error
Yohei Iiyama, Yutaka Kaneda (Tokyo Denki Univ.)
pp. 255 - 260

SIP2019-148
[Poster Presentation] High Resolution Acoustic Analysis for Classification of Bell Crickets
Hideto Otsuka, Fumikazu Saze, Kan Okubo (TMU)
pp. 261 - 265

SIP2019-149
[Poster Presentation] Basic Examination on Omni-Directional Sound Source Using Facing Ultrasonic Sensor Arrays
Kyoka Okamoto, Kan Okubo (TMU)
pp. 267 - 271

SIP2019-150
[Poster Presentation] Study on method for calculating loudness of stationary sound using auditory filterbank
Takuto Isoyama, Shunsuke Kidani, Masashi Unoki (JAIST)
pp. 273 - 278

SIP2019-151
[Poster Presentation] Time-domain audio source separation using multiresolution deep layered analysis based on simultaneous learning of neural networks and wavelet basis functions
Shihori Kozuka, Tomohiko Nakamura, Hiroshi Saruwatari (UTokyo)
pp. 279 - 284

SIP2019-152
[Poster Presentation] Bed ANC System with AF-VS for Reducing the Noise in ICU
Reo Maeda, Yoshinobu Kajikawa (Kansai Univ.), Liu Lichuan, Bi Congzhi (NIU)
pp. 285 - 288

SIP2019-153
[Poster Presentation] Multi-scale graph construction method for graph signal coding with SPIHT algorithm
Kosuke Abe, Yuichi Tanaka (TUAT)
pp. 289 - 294

SIP2019-154
[Poster Presentation] A Comparison of Language Models for a Design of Reduced Phoneme Set
Shuji Komeiji, Toshihisa Tanaka (TUAT), Koichi Shinoda (Tokyo Tech)
pp. 295 - 300

SIP2019-155
[Poster Presentation] Decoding of Non-Isochronous Rhythms Imagery from EEG Using Convolutional Neural Network
Naoki Yoshimura, Toshihisa Tanaka (TUAT)
pp. 301 - 306

SIP2019-156
[Poster Presentation] Performance Evaluation of Convolutional-Sparse-Coded Dynamic Mode Decomposition in River Groynes Model Experiment
Yusuke Arai, Yuhei Kaneko, Shogo Muramatsu, Hiroyasu Yasuda, Kiyoshi Hayasaka, Yu Otake (Niigata Univ.)
pp. 307 - 312

SIP2019-157
[Poster Presentation] EEG-Based Estimation of Attentional Direction while Simultaneously Listening to Music and Speech
Ryosuke Matsui, Toshihisa Tanaka (TUAT)
pp. 313 - 318

SIP2019-158
[Poster Presentation] Comparison of Neural Network Models for Detection of Spatiotemporal Abnormal Intervals in Epileptic EEG
Kosuke Fukumori (TUAT), Noboru Yoshida (Juntendo Univ.), Toshihisa Tanaka (TUAT)
pp. 319 - 323

SIP2019-159
[Poster Presentation] EpiNet: Convolutional Neural Network for Epileptic Seizure Localization from Interictal Intracranial EEG
Kosuke Mori, Kosuke Fukumori, Toshihisa Tanaka (TUAT), Yasushi Iimura, Takumi Mitsuhashi, Hidenori Sugano (Juntendo Univ.)
pp. 325 - 330

SIP2019-160
[Invited Talk] How to incorporate spatial model in deep learning based speech source separation?
Masahito Togami (LINE)
pp. 331 - 336

SIP2019-161
Real-time visualization of reverberation time using frequency domain variants of velvet noise
Hideki Kawahara (Wakayama Univ.), Ken-Ichi Sakakibara (Health Science Univ. Hokkaido), Mitsunori Mizumachi (Kyushu Inst. Tech.), Masanori Morise (Meiji Univ.), Hideki Banno (Meijo Univ.)
pp. 337 - 342

SIP2019-162
An objective value for dialogue level auto adjustment on the production of second audio program
Hiroki Kubo, Satoshi Oode (NHK)
pp. 343 - 348

SIP2019-163
Application of frequency-domain variant of velvet noise to the measurement of auditory effects on the fundamental frequency of sustained voicing
Hideki Kawahara (Wakayama Univ.), Ken-Ichi Sakakibara (Health Science Univ. of Hokkaido), Minoru Tsuzaki (KCU), Toshie Matsui (TIT), Masanori Morise (Meiji Univ.), Toshio Irino (Wakayama Univ.)
pp. 349 - 354

SIP2019-164
Comparison of feature parameters from original speech, LPC-based estimated speech and residual speech for speaker identification
Seiichi Nakagawa, Kohto Hanai, Kazumasa Yamamoto
pp. 355 - 360

SIP2019-165
(See Japanese page.)
pp. 361 - 366

SIP2019-166
Multiresolutional graph learning
Koki Yamada, Yuichi Tanaka (TUAT)
pp. 367 - 372

SIP2019-167
Mixed norm minimization based on epigraphical projection
Seisuke Kyochi (The Univ. of Kitakyushu), Shunsuke Ono (Tokyo Tech)
pp. 373 - 378

SIP2019-168
Rice Field Heat Map based on The Gaze of Experts in Growth Diagnosis
Hidenori Watanabe (Kumamoto IRI)
pp. 379 - 384

SIP2019-169
A Portscan Detection Based on Low-rankness of Destination Port Matrices
Hiroki Nousou, Masao Yamagishi, Isao Yamada (Tokyo Tech)
pp. 385 - 390

Note: Each article is a technical report without peer review, and its polished version will be published elsewhere.


The Institute of Electronics, Information and Communication Engineers (IEICE), Japan