Paper Abstract and Keywords |
Presentation |
2023-06-23 13:50
[Poster Presentation]
MS-Harmonic-Net++ vs SiFi-GAN: Comparison of fundamental frequency controllable fast neural waveform generative models. Sota Shimizu (Kobe Univ./NICT), Takuma Okamoto (NICT), Ryoichi Takashima (Kobe Univ.), Yamato Ohtani (NICT), Tetsuya Takiguchi (Kobe Univ.), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) SP2023-5 |
Abstract |
(in English) |
Harmonic-Net+ has been proposed as a fast neural vocoder with controllable fundamental frequency (fo) and speech rate (SR) that uses WORLD features. However, WORLD feature extraction is itself slow, so Harmonic-Net+ cannot achieve real-time inference when feature extraction is included. To realize an fo- and SR-controllable fast neural vocoder that includes feature extraction, Harmonic-Net++ has been proposed, which adds a network that predicts WORLD features from mel-spectrogram input. Furthermore, to accelerate the inference of Harmonic-Net++, MS-Harmonic-Net++ has been proposed by introducing multi-stream-based trainable fast upsampling. In this study, we compare MS-Harmonic-Net++ with SiFi-GAN, which was proposed as a high-quality model capable of real-time inference on a CPU by improving on HiFi-GAN, as well as with Harmonic-Net+. These are all fast neural waveform generation models with fo control. |
Keyword |
(in English) |
speech synthesis / neural vocoder / fundamental frequency control / speech rate control / real-time inference |
Reference Info. |
IEICE Tech. Rep., vol. 123, no. 88, SP2023-5, pp. 20-25, June 2023. |
Paper # |
SP2023-5 |
Date of Issue |
2023-06-16 (SP) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |