Paper Abstract and Keywords |
Presentation |
2019-12-06 16:00
A comparison of neural vocoders in singing voice synthesis
Sota Wada, Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.)
SP2019-42 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
In this study, we compare five types of vocoders based on neural networks (neural vocoders) for singing voice synthesis. In recent years, the WaveNet vocoder has been proposed as a neural vocoder. The WaveNet vocoder can model speech waveforms with high accuracy and generate natural-sounding speech. However, it cannot synthesize speech in real time because of its autoregressive structure. Two approaches have been proposed to address this problem. The first is to simplify the structure of the autoregressive model, which makes sampling from the model more efficient and allows synthesis faster than real time. The second is to synthesize multiple samples simultaneously by using flow-based generative models. The performance of these methods has been investigated on ordinary speech, but not yet on singing voices. Therefore, in this paper, we compare the performance of five types of neural vocoders for singing voice synthesis. The results of subjective and objective evaluation experiments show that WaveRNN is an appropriate neural vocoder when naturalness is emphasized, and WaveNet is appropriate when reproducibility of pitch and vibrato is emphasized. |
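The contrast the abstract draws between autoregressive and flow-based synthesis can be illustrated with a toy sketch. This is not the WaveNet, WaveRNN, or flow-based models evaluated in the report; the function names, coefficients, and noise parameters below are hypothetical and chosen only to show why one scheme needs a sequential loop while the other can compute every sample independently.

```python
import random


def autoregressive_synthesis(n_samples, coeff=0.9, seed=0):
    """Toy autoregressive generator (hypothetical): sample t depends on sample t-1."""
    rng = random.Random(seed)
    samples = []
    prev = 0.0
    # One loop iteration per output sample; step t cannot start before
    # step t-1 has finished, so the loop cannot be parallelized over time.
    for _ in range(n_samples):
        prev = coeff * prev + rng.gauss(0.0, 0.1)
        samples.append(prev)
    return samples


def parallel_synthesis(n_samples, scale=0.1, shift=0.0, seed=0):
    """Toy flow-style generator (hypothetical): each sample is an invertible
    affine map of its own noise value, so no output depends on another."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Every element could be computed at the same time on parallel hardware.
    return [scale * z + shift for z in noise]


if __name__ == "__main__":
    print(autoregressive_synthesis(5))
    print(parallel_synthesis(5))
```

In the autoregressive function, the number of sequential steps grows with the number of output samples, which is the bottleneck the abstract attributes to WaveNet. In the parallel function, the per-sample computations are independent, which is the property that flow-based models exploit to synthesize multiple samples simultaneously.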
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
DNN / Singing voice synthesis / Neural vocoder / WaveNet |
Reference Info. |
IEICE Tech. Rep., vol. 119, no. 321, SP2019-42, pp. 85-90, Dec. 2019. |
Paper # |
SP2019-42 |
Date of Issue |
2019-11-29 (SP) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Notes on Review |
This article is a technical report without peer review, and its polished version will be published elsewhere. |