Paper Abstract and Keywords |
Presentation |
2023-05-19 15:25
Prompt Learning for Object Detection with Vision-Language Model Mariko Tomariguchi (OKI) PRMU2023-12 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
Two-stage object detection models classify objects from features cropped from the regions most likely to contain them. In this work, we investigate how the information surrounding an object influences its classification, and we improve the prompt learning method for object detection with vision-language models. Using augmented data, we train the learnable vectors that correspond to CLIP's input prompts, producing prompts with and without surrounding information. We then train the object detection model so that the classification score is computed from the language embeddings obtained by passing the learned prompts through the CLIP language encoder. Our method achieves 20.3% $\mathrm{AP}$ on the LVIS dataset with prompts that include surroundings and 21.6% $\mathrm{AP}$ with prompts that do not. In particular, it achieves 27.9% $\mathrm{AP}_f$ and 29.1% $\mathrm{AP}_f$, respectively, on the frequent classes of LVIS. |
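The scoring step described in the abstract can be illustrated with a minimal sketch. It assumes, as is common for CLIP-style classifiers, that each class score is the cosine similarity between a region feature and that class's language embedding, divided by a temperature and normalized with a softmax. The abstract does not give the exact formula, so the function names, the temperature value, and the toy vectors below are hypothetical; the toy embeddings merely stand in for the outputs of the CLIP language encoder on learned prompts.

```python
import math


def l2_normalize(vector):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def classification_scores(region_feature, text_embeddings, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities.

    region_feature: feature cropped for one region proposal (hypothetical input).
    text_embeddings: one language embedding per class (hypothetical stand-ins
        for CLIP language-encoder outputs).
    temperature: assumed scaling constant, not taken from the paper.
    """
    feature = l2_normalize(region_feature)
    logits = []
    for embedding in text_embeddings:
        unit = l2_normalize(embedding)
        cosine = sum(a * b for a, b in zip(feature, unit))
        logits.append(cosine / temperature)

    # Subtract the maximum logit for numerical stability before exponentiating.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: three classes in a 3-dimensional embedding space.
text_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
region_feature = [0.9, 0.1, 0.0]

scores = classification_scores(region_feature, text_embeddings)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

In this sketch the class embeddings are fixed inputs to the detector's classification step, so swapping prompts learned with surroundings for prompts learned without them only changes `text_embeddings`, not the scoring code.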
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
deep learning / object detection / Mask R-CNN / prompt learning / CLIP |
Reference Info. |
IEICE Tech. Rep., vol. 123, no. 30, PRMU2023-12, pp. 62-67, May 2023. |
Paper # |
PRMU2023-12 |
Date of Issue |
2023-05-11 (PRMU) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
PRMU IPSJ-CVIM |
Conference Date |
2023-05-18 - 2023-05-19 |
Paper Information |
Registration To |
PRMU |
Conference Code |
2023-05-PRMU-CVIM |
Language |
Japanese |
Title (in Japanese) |
(See Japanese page) |
Sub Title (in Japanese) |
(See Japanese page) |
Title (in English) |
Prompt Learning for Object Detection with Vision-Language Model |
Keyword(1) |
deep learning |
Keyword(2) |
object detection |
Keyword(3) |
Mask R-CNN |
Keyword(4) |
prompt learning |
Keyword(5) |
CLIP |
1st Author's Name |
Mariko Tomariguchi |
1st Author's Affiliation |
Oki Electric Industry Co., Ltd. (OKI) |
Speaker |
Author-1 |
Date Time |
2023-05-19 15:25:00 |
Presentation Time |
15 minutes |
Registration for |
PRMU |
Volume (vol) |
vol.123 |
Number (no) |
no.30 |
Page |
pp.62-67 |
#Pages |
6 |