Visuo-Tactile Zero-Shot Object Recognition with Vision-Language Model

S. Ueda, A. Hashimoto, M. Hamaya, K. Tanaka, et al. arXiv preprint arXiv:2409.09276, 2024. arxiv.org
Tactile perception is vital, especially when distinguishing visually similar objects. We propose an approach to incorporate tactile data into a Vision-Language Model (VLM) for visuo-tactile zero-shot object recognition. Our approach leverages the zero-shot capability of VLMs to infer tactile properties from the names of tactilely similar objects. The proposed method translates tactile data into a textual description solely by annotating object names for each tactile sequence during training, making it adaptable to various contexts with low training costs. The proposed method was evaluated on the FoodReplica and Cube datasets, demonstrating its effectiveness in recognizing objects that are difficult to distinguish by vision alone.
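To make the described pipeline concrete, below is a minimal sketch (not the authors' code) of the idea in the abstract: a tactile sequence is matched against the names of tactilely similar training objects, and that name list is rendered as text for a VLM prompt. The tactile prototype embeddings, the nearest-neighbor matcher, and the prompt wording are all illustrative assumptions.

```python
# Toy sketch of visuo-tactile zero-shot recognition via tactile-to-text
# translation. NOT the paper's implementation; all data and names below
# are hypothetical placeholders.
import numpy as np

# Assumption: a learned tactile encoder maps each training sequence,
# annotated only with an object name, to an embedding like these.
TACTILE_PROTOTYPES = {
    "ripe tomato": np.array([0.9, 0.1, 0.2]),
    "rubber ball": np.array([0.8, 0.2, 0.3]),
    "wooden cube": np.array([0.1, 0.9, 0.8]),
}

def tactilely_similar_names(query_embedding: np.ndarray, k: int = 2) -> list[str]:
    """Return names of the k training objects whose tactile embeddings
    are closest to the query (cosine similarity)."""
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(TACTILE_PROTOTYPES.items(),
                    key=lambda kv: cos(query_embedding, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

def build_prompt(candidate_labels: list[str], similar_names: list[str]) -> str:
    """Translate tactile data into a textual description a VLM can use."""
    return ("The touched object feels similar to: "
            + ", ".join(similar_names)
            + ". Given the image and this tactile description, which is it: "
            + ", ".join(candidate_labels) + "?")

if __name__ == "__main__":
    query = np.array([0.85, 0.15, 0.25])  # tactile embedding of an unseen object
    names = tactilely_similar_names(query)
    prompt = build_prompt(["fake tomato", "real tomato"], names)
    print(prompt)  # this text, together with the camera image, would be sent to a VLM
```

Because the only training annotation is the object name attached to each tactile sequence, this kind of pipeline keeps labeling cost low; the VLM's zero-shot knowledge is what turns "feels similar to a ripe tomato" into an inferred tactile property.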