Paper Abstract and Keywords |
Presentation |
2023-03-03 09:10
Study on Analysis of Amplitude and Frequency Perturbation in the Voice for Fake Audio Detection
Kai Li, Yao Wang, Minh Le Nguyen, Masato Akagi, Masashi Unoki (JAIST)
EMM2022-88 |
Abstract |
(in English) |
Fake audio detection (FAD) aims to detect fake speech generated by advanced voice conversion and text-to-speech technologies. Recently, the quality of synthesized speech has improved significantly owing to the remarkable development of deep neural networks. However, it is still easy for humans to identify fake speech by perceiving pathological prosody in a voice. Pathological prosody is closely related to the amplitude and frequency perturbation (AFP) in the voice and provides essential cues for identifying fake speech. This paper proposes analyzing AFP differences in the voice using jitter and shimmer features. According to a statistical analysis of the AFP features, the continuous-shimmer feature (CS3) can effectively separate genuine and fake speech signals. Moreover, static and dynamic CS3 features were combined with a FAD system based on a light convolutional neural network and bidirectional long short-term memory (LCNN-BLSTM), and experiments were carried out on datasets of the Audio Deep Synthesis Detection Challenge (ADD2022). The experimental results show that both the static and dynamic shimmer features of the voice can provide information complementary to traditional spectrum-based FAD systems. |
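As a rough illustration of what jitter and shimmer measure, the sketch below computes the common "local" variants: the mean absolute difference between consecutive glottal cycles, divided by the mean value, applied to cycle periods (jitter) and cycle peak amplitudes (shimmer). This is an assumption-laden sketch, not the paper's method: the abstract does not define the CS3 feature, the function names here are hypothetical, and the per-cycle periods and amplitudes are assumed to have been extracted beforehand by some pitch-marking step.

```python
from typing import Sequence


def local_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference of consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean value must be non-zero")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / mean_value


def local_jitter(periods: Sequence[float]) -> float:
    """Frequency perturbation: cycle-to-cycle variation of period length (hypothetical helper)."""
    return local_perturbation(periods)


def local_shimmer(amplitudes: Sequence[float]) -> float:
    """Amplitude perturbation: cycle-to-cycle variation of peak amplitude (hypothetical helper)."""
    return local_perturbation(amplitudes)


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    periods = [0.010, 0.011, 0.010, 0.011]      # seconds per cycle
    amplitudes = [1.0, 0.8, 1.0, 0.8]           # peak amplitude per cycle
    print(f"jitter  = {local_jitter(periods):.4f}")
    print(f"shimmer = {local_shimmer(amplitudes):.4f}")
```

A perfectly periodic, constant-amplitude signal gives zero for both measures; larger values indicate more cycle-to-cycle irregularity, which is the kind of cue the abstract associates with distinguishing genuine from fake speech.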
Keyword |
(in English) |
fake audio detection / prosodic feature / amplitude and frequency perturbation / jitter and shimmer |
Reference Info. |
IEICE Tech. Rep., vol. 122, no. 412, EMM2022-88, pp. 110-115, March 2023. |
Paper # |
EMM2022-88 |
Date of Issue |
2023-02-23 (EMM) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
EMM |
Conference Date |
2023-03-02 - 2023-03-03 |
Place (in English) |
Fukue culture hall |
Paper Information |
Registration To |
EMM |
Conference Code |
2023-03-EMM |
Language |
English |
Title (in English) |
Study on Analysis of Amplitude and Frequency Perturbation in the Voice for Fake Audio Detection |
Keyword(1) |
fake audio detection |
Keyword(2) |
prosodic feature |
Keyword(3) |
amplitude and frequency perturbation |
Keyword(4) |
jitter and shimmer |
1st Author's Name |
Kai Li |
1st Author's Affiliation |
Japan Advanced Institute of Science and Technology (JAIST) |
2nd Author's Name |
Yao Wang |
2nd Author's Affiliation |
Japan Advanced Institute of Science and Technology (JAIST) |
3rd Author's Name |
Minh Le Nguyen |
3rd Author's Affiliation |
Japan Advanced Institute of Science and Technology (JAIST) |
4th Author's Name |
Masato Akagi |
4th Author's Affiliation |
Japan Advanced Institute of Science and Technology (JAIST) |
5th Author's Name |
Masashi Unoki |
5th Author's Affiliation |
Japan Advanced Institute of Science and Technology (JAIST) |
Speaker |
Author-5 (Masashi Unoki) |
Date Time |
2023-03-03 09:10:00 |
Presentation Time |
25 minutes |
Volume (vol) |
vol.122 |
Number (no) |
no.412 |
Page |
pp.110-115 |
#Pages |
6 |