|
Chair |
|
Tomoki Toda (Nagoya Univ.) |
Secretary |
|
Atsushi Ando (NTT), Kei Hashimoto (Nagoya Inst. of Tech.) |
Assistant |
|
Ryo Aihara (Mitsubishi Electric), Daisuke Saito (Univ. of Tokyo) |
|
Conference Date |
Fri, Jun 23, 2023 09:20 - 17:40
Sat, Jun 24, 2023 09:30 - 18:00 |
Topics |
|
Conference Place |
|
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Registration Fee |
This workshop will be held as an IEICE workshop with fully electronic publishing. A registration fee is required, except for speakers and for participants other than those attending workshop(s) with non-electronic publishing. See the registration fee page. Participants who attend the workshop(s) on SP are asked to pay the registration fee or presentation fee. |
Fri, Jun 23 AM 09:20 - 09:30 |
(1) |
09:20-09:30 |
|
Fri, Jun 23 AM 09:30 - 12:00 |
(2) |
09:30-10:40 |
|
|
10:40-10:50 |
Break ( 10 min. ) |
(3) |
10:50-12:00 |
|
|
12:00-12:10 |
Break ( 10 min. ) |
Fri, Jun 23 PM 12:10 - 12:50 |
|
- |
|
|
12:50-13:50 |
Break ( 60 min. ) |
Fri, Jun 23 PM 13:50 - 16:10 |
(4) |
13:50-16:10 |
|
(5) |
13:50-16:10 |
|
(6) |
13:50-16:10 |
|
(7) |
13:50-16:10 |
|
(8) |
13:50-16:10 |
|
(9) |
13:50-16:10 |
|
(10) |
13:50-16:10 |
|
(11) |
13:50-16:10 |
|
(12) |
13:50-16:10 |
|
(13) |
13:50-16:10 |
|
(14) |
13:50-16:10 |
|
(15) |
13:50-16:10 |
|
(16) |
13:50-16:10 |
|
(17) |
13:50-16:10 |
|
(18) |
13:50-16:10 |
|
(19) |
13:50-16:10 |
|
(20) SP |
13:50-16:10 |
[Poster Presentation]
Research on Fundamental Frequency Estimation of Instrument Sounds using Asynchronous Detection Technique SP2023-1 |
Nichika Mitsubori, Takuma Yamakawa, Kenichiro Miwa (Salesian Polytechnic) |
(21) SP |
13:50-16:10 |
Impression Conversion of Speech for Unknown Speakers Using FaderNet SP2023-2 |
Saki Kugimoto, Toru Nakashika (UEC) |
(22) SP |
13:50-16:10 |
Feature Representation of Japanese Pitch Accent and its Perceptual Adequacy
-- Fundamental Study for Application to Japanese Speech Education -- SP2023-3 |
Ikuyo Masuda-Katsuse (Kindai Univ.) |
(23) SP |
13:50-16:10 |
Data Augmentation by Synthesised Voice for Deep Learning-based A Cappella Separation SP2023-4 |
Kyoka Kazama (TMU), Yuma Kinoshita (Tokai Univ.), Natsuki Ueno, Nobutaka Ono (TMU) |
(24) SP |
13:50-16:10 |
[Poster Presentation]
MS-Harmonic-Net++ vs SiFi-GAN: Comparison of fundamental frequency controllable fast neural waveform generative models. SP2023-5 |
Sota Shimizu (Kobe Univ./NICT), Takuma Okamoto (NICT), Ryoichi Takashima (Kobe Univ.), Yamato Ohtani (NICT), Tetsuya Takiguchi (Kobe Univ.), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) |
(25) SP |
13:50-16:10 |
[Poster Presentation]
Examination of the vocal tract control for pitch changes in opera singing using real-time MRI SP2023-6 |
Natsuki Toda, Hironori Takemoto (CIT), Jun Takahashi (OUA) |
(26) SP |
13:50-16:10 |
[Poster Presentation]
Opera-singing voice synthesis using Diff-SVC SP2023-7 |
Aoto Sugahara (Kobe Univ.), Soma Kishimoto, Yuji Adachi, Kiyoto Tai (MEC Company Ltd.), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.) |
(27) SP |
13:50-16:10 |
On phase function design of extended time-stretched pulse based on cascaded all-pass filters SP2023-8 |
Hideki Kawahara (Wakayama Univ.), Kohei Yatabe (Tokyo Univ. Agri. Tech.) |
(28) SP |
13:50-16:10 |
Speech Emotion Recognition based on Emotional Label Sequence Estimation Considering Phoneme Class Attribute SP2023-9 |
Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita (Ritsumeikan Univ.) |
(29) SP |
13:50-16:10 |
[Poster Presentation]
Parody Detection Based on Alignment Collapse Between Lyrics and Singing Voice SP2023-10 |
Tomoki Ariga, Yosuke Higuchi (Waseda Univ.), Mitsunori Kanno, Rie Shigyo, Takato Mizuguchi, Naoki Okamoto (DAIICHIKOSHO), Tetsuji Ogawa (Waseda Univ.) |
(30) SP |
13:50-16:10 |
[Poster Presentation]
Generation of colored subtitle images based on emotional information of speech utterances SP2023-11 |
Fumiya Nakamura (Kobe Univ.), Ryo Aihara (Mitsubishi Electric), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.), Yusuke Itani (Mitsubishi Electric) |
(31) SP |
13:50-16:10 |
Streaming End-to-End speech recognition using a CTC decoder with substituted linguistic information SP2023-12 |
Tatsunari Takagi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka, Yukoh Wakabayashi (TUT) |
(32) SP |
13:50-16:10 |
[Poster Presentation]
The effect of acoustic and linguistic information on the evaluation of one's own recorded speech SP2023-13 |
Hidekazu Nagamura, Seita Tomioka, Taichirou Tanaka, Kohta I. Kobayasi (Doshisha Univ.) |
|
16:10-16:30 |
Break ( 20 min. ) |
Fri, Jun 23 PM 16:30 - 17:40 |
(33) |
16:30-16:40 |
|
(34) |
16:40-17:40 |
|
Sat, Jun 24 AM 09:30 - 12:00 |
(35) |
09:30-10:40 |
|
|
10:40-10:50 |
Break ( 10 min. ) |
(36) |
10:50-12:00 |
|
|
12:00-12:10 |
Break ( 10 min. ) |
Sat, Jun 24 PM 12:10 - 12:50 |
|
- |
|
|
12:50-13:50 |
Break ( 60 min. ) |
Sat, Jun 24 PM 13:50 - 16:10 |
(37) |
13:50-16:10 |
|
(38) |
13:50-16:10 |
|
(39) |
13:50-16:10 |
|
(40) |
13:50-16:10 |
|
(41) |
13:50-16:10 |
|
(42) |
13:50-16:10 |
|
(43) |
13:50-16:10 |
|
(44) |
13:50-16:10 |
|
(45) |
13:50-16:10 |
|
(46) |
13:50-16:10 |
|
(47) |
13:50-16:10 |
|
(48) |
13:50-16:10 |
|
(49) |
13:50-16:10 |
|
(50) |
13:50-16:10 |
|
(51) |
13:50-16:10 |
|
(52) |
13:50-16:10 |
|
(53) SP |
13:50-16:10 |
[Poster Presentation]
Study on a Fundamental Frequency Estimation Method Robust to Noise or Reverberation SP2023-14 |
Takuma Yamakawa, Kenichiro Miwa (Salesian Polytechnic) |
(54) SP |
13:50-16:10 |
Fast Neural Waveform Generation Model With Fully Connected Upsampling SP2023-15 |
Haruki Yamashita (Kobe Univ./NICT), Takuma Okamoto (NICT), Ryoichi Takashima (Kobe Univ.), Yamato Ohtani (NICT), Tetsuya Takiguchi (Kobe Univ.), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) |
(55) SP |
13:50-16:10 |
Dilation of Time-Frequency Mask and Phase Restoration with Phase Difference Constraint for Dichotic Pitch Improvement SP2023-16 |
Daiki Sugawara, Taishi Nakashima, Natsuki Ueno, Nobutaka Ono (TMU) |
(56) SP |
13:50-16:10 |
[Poster Presentation]
Development of gamification based vocal therapy support system. SP2023-17 |
Taketo Murai, Tatsuya Kitamura (Konan Univ.), Naoko Kawamura (Himeji Dokkyo Univ.) |
(57) SP |
13:50-16:10 |
[Short Paper]
SBERT-based Musical Components Estimation from Lyrics Trained with Imbalanced "Orpheus" Data SP2023-18 |
Mastuti Puspitasari, Takuya Takahashi (UEC), Gen Hori (AU), Shigeki Sagayama, Toru Nakashika (UEC) |
(58) SP |
13:50-16:10 |
Domain adaptation of speech recognition models based on self-supervised learning using target domain speech SP2023-19 |
Takahiro Kinouchi (TUT), Atsunori Ogawa (NTT), Yukoh Wakabayashi, Norihide Kitaoka (TUT) |
(59) SP |
13:50-16:10 |
Non-chord Tone Data Collection for Music Analysis and Generation SP2023-20 |
Takuya Takahashi, Toru Nakashika, Shigeki Sagayama (UEC) |
(60) SP |
13:50-16:10 |
[Poster Presentation]
Freezing response to distress calls and heart rate variability analysis in Japanese house bats SP2023-21 |
Kazuki Yoshino-Hashizawa (Doshisha Univ./JSPS), Yuna Nishiuchi, Midori Hiragochi, Motoki Kihara, Kohta I. Kobayasi, Shizuko Hiryu (Doshisha Univ.) |
(61) SP |
13:50-16:10 |
Automatic speech recognition model simultaneously recognizes linguistic information and verbal/non-verbal phenomena SP2023-22 |
Nagito Shione, Yukoh Wakabayashi, Norihide Kitaoka (TUT) |
(62) SP |
13:50-16:10 |
Effect of pause length ratio in speech length on the perception of speech rate induced by speech length SP2023-23 |
Maho Tamakawa, Shuichi Sakamoto (Tohoku Univ.) |
(63) SP |
13:50-16:10 |
Environmental Sound Separation Considering Separation Distortion and Remixing Error SP2023-24 |
Kanta Shimonishi, Takahiro Fukumori, Yoichi Yamashita (Ritsumeikan Univ.) |
(64) SP |
13:50-16:10 |
Evaluation of multi-speaker text-to-speech synthesis using a corpus for speech recognition with x-vectors for various speech styles SP2023-25 |
Koki Hida (Wakayama Univ./NICT), Takuma Okamoto (NICT), Ryuichi Nisimura (Wakayama Univ.), Yamato Ohtani (NICT), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) |
(65) SP |
13:50-16:10 |
SP2023-26 |
Kazuki Tokeshi, Toshie Matsui (Toyohashi UT) |
|
16:10-16:30 |
Break ( 20 min. ) |
Sat, Jun 24 PM 16:30 - 17:40 |
(66) |
16:30-17:40 |
|
Sat, Jun 24 PM 17:40 - 18:00 |
|
- |
|
Contact Address and Latest Schedule Information |
SP |
Technical Committee on Speech (SP) [Latest Schedule]
|
Contact Address |
|
IPSJ-MUS |
Special Interest Group on Music and Computer (IPSJ-MUS) [Latest Schedule]
|
Contact Address |
|
IPSJ-SLP |
Special Interest Group on Spoken Language Processing (IPSJ-SLP) [Latest Schedule]
|
Contact Address |
|
Last modified: 2023-06-13 17:46:38
|