Chapter 20: Graph Neural Networks in Computer Vision
Siliang Tang, Zhejiang University, siliang@zju.edu.cn
Wenqiao Zhang, Zhejiang University, wenqiaozhang@zju.edu.cn
Zongshen Mu, Zhejiang University, zongshen@zju.edu.cn
Kai Shen, Zhejiang University, shenkai@zju.edu.cn
Juncheng Li, Zhejiang University, junchengli@zju.edu.cn
Jiacheng Li, Zhejiang University, lijiacheng@zju.edu.cn
Lingfei Wu, Pinterest, lwu@email.wm.edu
Abstract
Recently, Graph Neural Networks (GNNs) have been incorporated into many Computer Vision (CV) models. They not only improve performance on many CV tasks but also provide a more explainable decomposition of these CV models. This chapter provides a comprehensive overview of how GNNs are applied to various CV tasks, ranging from single-image classification to cross-media understanding. It also discusses this rapidly growing field from a frontier perspective.
Contents
- Introduction
- Representing Visions as Graphs
  - Visual Node Representation
  - Visual Edge Representation
- Case Study 1: Image
  - Object Detection
  - Image Classification
- Case Study 2: Video
  - Video Action Recognition
  - Temporal Action Localization
- Other Related Work: Cross-media
  - Vision Caption
  - Visual Question Answering
  - Cross-Media Retrieval
- Frontiers for GNNs on Computer Vision
  - Advanced GNN Modeling Methods for Computer Vision
  - Broader Area of GNNs on Computer Vision
- Summary
Citation
@incollection{GNNBook-ch20-wu,
author = "Tang, Siliang and Zhang, Wenqiao and Mu, Zongshen and Shen, Kai and Li, Juncheng and Li, Jiacheng and Wu, Lingfei",
editor = "Wu, Lingfei and Cui, Peng and Pei, Jian and Zhao, Liang",
title = "Graph Neural Networks in Computer Vision",
booktitle = "Graph Neural Networks: Foundations, Frontiers, and Applications",
year = "2022",
publisher = "Springer Singapore",
address = "Singapore",
pages = "447--462",
}
S. Tang, W. Zhang, Z. Mu, K. Shen, J. Li, J. Li, and L. Wu, “Graph neural networks in computer vision,” in Graph Neural Networks: Foundations, Frontiers, and Applications, L. Wu, P. Cui, J. Pei, and L. Zhao, Eds. Singapore: Springer Singapore, 2022, pp. 447–462.