Paper Abstract and Keywords |
Presentation |
2024-02-20 10:15
Image Attractiveness Analysis with Explanation using Vision-Language Model
Shun Yoshida, Kaede Shiohara, Toshihiko Yamasaki (UTokyo)
ITS2023-61 IE2023-50 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
There has been research on having machines analyze image attractiveness, and in recent years further progress has been made with the development of multimodal models such as Contrastive Language-Image Pre-training (CLIP). The purpose of this study is to provide a basis for judgment and linguistic explanations for the process of attractiveness analysis. Specifically, we propose a framework that incorporates explainable AI and a multimodal large language model into CLIP, and show that it can output more explanatory results than existing methods. |
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
CLIP / Multimodal Large Language Models / Image Quality Assessment |
Reference Info. |
IEICE Tech. Rep., vol. 123, no. 381, IE2023-50, pp. 82-87, Feb. 2024. |
Paper # |
IE2023-50 |
Date of Issue |
2024-02-12 (ITS, IE) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
ITS IE ITE-MMS ITE-ME ITE-AIT |
Conference Date |
2024-02-19 - 2024-02-20 |
Place (in Japanese) |
(See Japanese page) |
Place (in English) |
Hokkaido Univ. |
Topics (in Japanese) |
(See Japanese page) |
Topics (in English) |
Image Processing, etc. |
Paper Information |
Registration To |
IE |
Conference Code |
2024-02-ITS-IE-MMS-ME-AIT |
Language |
Japanese |
Title (in Japanese) |
(See Japanese page) |
Sub Title (in Japanese) |
(See Japanese page) |
Title (in English) |
Image Attractiveness Analysis with Explanation using Vision-Language Model |
Keyword(1) |
CLIP |
Keyword(2) |
Multimodal Large Language Models |
Keyword(3) |
Image Quality Assessment |
1st Author's Name |
Shun Yoshida |
1st Author's Affiliation |
The University of Tokyo (UTokyo) |
2nd Author's Name |
Kaede Shiohara |
2nd Author's Affiliation |
The University of Tokyo (UTokyo) |
3rd Author's Name |
Toshihiko Yamasaki |
3rd Author's Affiliation |
The University of Tokyo (UTokyo) |
Speaker |
Author-1 |
Date Time |
2024-02-20 10:15:00 |
Presentation Time |
15 minutes |
Registration for |
IE |
Paper # |
ITS2023-61, IE2023-50 |
Volume (vol) |
vol.123 |
Number (no) |
no.380(ITS), no.381(IE) |
Page |
pp.82-87 |
#Pages |
6 |
|