Paper Abstract and Keywords |
Presentation |
2010-01-22 11:00
Statistical sequence-to-frame mapping techniques for voice conversion Yu Qiao, Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) CQ2009-98 PRMU2009-197 SP2009-138 MVE2009-120 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
Voice conversion, the task of transforming one speaker’s voice into another’s, can be regarded as the problem of finding a mapping function between the voice spaces of two speakers. GMM-based statistical mapping methods [1], [2] have been widely used for voice conversion. However, the classical GMM-based techniques use a frame-to-frame mapping function, which largely ignores the contextual information present across a speech sequence and usually causes over-smoothing of the converted speech. HMMs are well known to provide an efficient way to model the density of a whole speech sequence and have been successful in speech recognition and synthesis. Inspired by this fact, this paper studies how to use HMMs for voice conversion. We derive an HMM-based sequence-to-frame mapping function through statistical analysis. Unlike previous HMM-based voice conversion methods [3]~[5], which use forced alignment for segmentation and transform the frames aligned to a state with that state’s associated linear transformation, our method uses a soft mapping function: a weighted sum of linear transformations, with the weights given by the HMM posterior probabilities of the frames. We also propose and compare two methods for learning the parameters of our mapping functions, namely least-squares error estimation and maximum likelihood estimation. We carried out experiments to examine the proposed HMM-based method for voice conversion. |
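The soft mapping described in the abstract (a posterior-weighted sum of per-state linear transformations) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes scalar (1-D) features, Gaussian state emissions, and already-known parameters. The names `state_posteriors`, `convert`, `slopes`, and `offsets` are hypothetical, and parameter learning (least squares or maximum likelihood) is not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def state_posteriors(xs, init, trans, means, variances):
    """Scaled forward-backward pass.

    Returns gamma, where gamma[t][k] = P(state_t = k | x_1..x_T).
    Assumption: each state emits a 1-D Gaussian.
    """
    T, K = len(xs), len(init)
    emit = [[gaussian_pdf(x, means[k], variances[k]) for k in range(K)] for x in xs]

    # Forward pass, normalised per frame to avoid numerical underflow.
    alpha, scales = [], []
    for t in range(T):
        if t == 0:
            row = [init[k] * emit[0][k] for k in range(K)]
        else:
            row = [
                emit[t][k] * sum(alpha[t - 1][j] * trans[j][k] for j in range(K))
                for k in range(K)
            ]
        c = sum(row)
        scales.append(c)
        alpha.append([v / c for v in row])

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for j in range(K):
            beta[t][j] = sum(
                trans[j][k] * emit[t + 1][k] * beta[t + 1][k] for k in range(K)
            ) / scales[t + 1]

    # Combine and renormalise so each frame's posteriors sum to one.
    gamma = []
    for t in range(T):
        row = [alpha[t][k] * beta[t][k] for k in range(K)]
        s = sum(row)
        gamma.append([v / s for v in row])
    return gamma


def convert(xs, gamma, slopes, offsets):
    """Soft mapping: y_t = sum_k gamma[t][k] * (slopes[k] * x_t + offsets[k])."""
    return [
        sum(g * (a * x + b) for g, a, b in zip(row, slopes, offsets))
        for x, row in zip(xs, gamma)
    ]


# Toy usage with made-up numbers: two states, four source frames.
xs = [0.1, 0.2, 2.9, 3.1]
gamma = state_posteriors(
    xs,
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    means=[0.0, 3.0],
    variances=[1.0, 1.0],
)
ys = convert(xs, gamma, slopes=[1.0, 0.5], offsets=[0.0, 2.0])
```

Because each posterior depends on the whole input sequence while the output is produced one frame at a time, this is a sequence-to-frame mapping. A hard-alignment variant would instead apply only the single transformation of the state each frame is aligned to.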
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
Voice conversion / linear regression / sequence-to-frame mapping / HMM |
Reference Info. |
IEICE Tech. Rep., vol. 109, no. 375, SP2009-138, pp. 285-290, Jan. 2010. |
Paper # |
SP2009-138 |
Date of Issue |
2010-01-14 (CQ, PRMU, SP, MVE) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
PRMU SP MVE CQ |
Conference Date |
2010-01-21 - 2010-01-22 |
Place (in Japanese) |
(See Japanese page) |
Place (in English) |
Kyoto Univ. |
Topics (in Japanese) |
(See Japanese page) |
Paper Information |
Registration To |
SP |
Conference Code |
2010-01-PRMU-SP-MVE-CQ |
Language |
English (Japanese title is available) |
Title (in Japanese) |
(See Japanese page) |
Sub Title (in Japanese) |
(See Japanese page) |
Title (in English) |
Statistical sequence-to-frame mapping techniques for voice conversion |
Keyword(1) |
Voice conversion |
Keyword(2) |
linear regression |
Keyword(3) |
sequence-to-frame mapping |
Keyword(4) |
HMM |
1st Author's Name |
Yu Qiao |
1st Author's Affiliation |
The University of Tokyo (Univ. of Tokyo) |
2nd Author's Name |
Daisuke Saito |
2nd Author's Affiliation |
The University of Tokyo (Univ. of Tokyo) |
3rd Author's Name |
Nobuaki Minematsu |
3rd Author's Affiliation |
The University of Tokyo (Univ. of Tokyo) |
Speaker |
Author-1 |
Date Time |
2010-01-22 11:00:00 |
Presentation Time |
30 minutes |
Registration for |
SP |
Paper # |
CQ2009-98, PRMU2009-197, SP2009-138, MVE2009-120 |
Volume (vol) |
vol.109 |
Number (no) |
no.373(CQ), no.374(PRMU), no.375(SP), no.376(MVE) |
Page |
pp.285-290 |
#Pages |
6 |