IEICE Technical Committee Submission System
Conference Paper's Information

Paper Abstract and Keywords
Presentation 2016-12-20 16:40
Generative Adversarial Network-based Postfiltering for Statistical Parametric Speech Synthesis
Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino (NTT) SP2016-61
Abstract (in English) In speech synthesis, statistical parametric speech synthesis is widely used because of its flexibility and compactness. However, over-smoothing degrades the quality of the synthesized speech, leaving a large quality gap between natural and synthesized speech. To fill this gap, we propose a novel postfilter based on a generative adversarial network (GAN). Several earlier methods also attempt to alleviate over-smoothing; however, they rely on empirical findings about acoustic differences between natural and synthesized speech and therefore cannot cover all the factors that cause those differences. In contrast, we examine a learning-based postfilter that learns how to compensate for the differences directly from data. Specifically, we use a GAN and optimize a generator (i.e., the postfilter) and a discriminator in an adversarial process, which lets the postfilter fit the true data distribution. Experimental results show that speech generated by the proposed method is comparable to analyzed-and-synthesized speech.
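For orientation, the adversarial process mentioned in the abstract can be illustrated with the standard GAN minimax objective of Goodfellow et al. (2014). This is only a sketch: the abstract does not state the exact loss used in the report, and the symbols below are assumptions introduced here. G denotes the postfilter (generator), D the discriminator, x natural speech features, and y synthesized speech features; the generator is written as conditioned on synthesized features rather than on random noise.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\mathrm{nat}}} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{y \sim p_{\mathrm{syn}}} \bigl[ \log \bigl( 1 - D(G(y)) \bigr) \bigr]
```

Under this formulation, D is trained to tell natural features from postfiltered ones, while G is trained to make its output indistinguishable from natural features.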
Keyword (in English) statistical parametric speech synthesis / postfilter / deep neural network / generative adversarial network
Reference Info. IEICE Tech. Rep., vol. 116, no. 378, SP2016-61, pp. 89-94, Dec. 2016.
Paper # SP2016-61 
Date of Issue 2016-12-13 (SP) 
ISSN Print edition: ISSN 0913-5685    Online edition: ISSN 2432-6380
Copyright and reproduction All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034)
Notes on Review This article is a technical report without peer review, and its polished version will be published elsewhere.

Conference Information
Committee SP IPSJ-SLP NLC IPSJ-NL  
Conference Date 2016-12-20 - 2016-12-22 
Place (in English) NTT Musashino R&D 
Topics (in English) The 18th Spoken Language Symposium & The Third Natural Language Processing Symposium 
Paper Information
Registration To SP 
Conference Code 2016-12-SP-SLP-NLC-NL 
Language Japanese 
Title (in English) Generative Adversarial Network-based Postfiltering for Statistical Parametric Speech Synthesis 
Keyword(1) statistical parametric speech synthesis  
Keyword(2) postfilter  
Keyword(3) deep neural network  
Keyword(4) generative adversarial network  
1st Author's Name Takuhiro Kaneko  
1st Author's Affiliation Nippon Telegraph and Telephone Corporation (NTT)
2nd Author's Name Hirokazu Kameoka  
2nd Author's Affiliation Nippon Telegraph and Telephone Corporation (NTT)
3rd Author's Name Nobukatsu Hojo  
3rd Author's Affiliation Nippon Telegraph and Telephone Corporation (NTT)
4th Author's Name Yusuke Ijima  
4th Author's Affiliation Nippon Telegraph and Telephone Corporation (NTT)
5th Author's Name Kaoru Hiramatsu  
5th Author's Affiliation Nippon Telegraph and Telephone Corporation (NTT)
6th Author's Name Kunio Kashino  
6th Author's Affiliation Nippon Telegraph and Telephone Corporation (NTT)
Speaker Author-1 
Date Time 2016-12-20 16:40:00 
Presentation Time 25 minutes 
Registration for SP 
Paper # SP2016-61 
Volume (vol) vol.116 
Number (no) no.378 
Page pp.89-94 
Date of Issue 2016-12-13 (SP) 




The Institute of Electronics, Information and Communication Engineers (IEICE), Japan