TY - GEN
T1 - Unsupervised Multi-view CNN for Salient View Selection of 3D Objects and Scenes
AU - Song, Ran
AU - Zhang, Wei
AU - Zhao, Yitian
AU - Liu, Yonghuai
N1 - Funding Information:
Acknowledgements. We acknowledge the support of the Young Taishan Scholars Program of Shandong Province (tsqn20190929), the Qilu Young Scholars Program of Shandong University (31400082063101), the National Natural Science Foundation of China under Grants 61991411 and U1913204, the National Key Research and Development Plan of China under Grant 2017YFB1300205, and the Shandong Major Scientific and Technological Innovation Project (2018CXGC1503).
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020/11/13
Y1 - 2020/11/13
N2 - We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition named view-object consistency, which states that a 3D object and its projected 2D views always belong to the same object class. To validate its effectiveness, we design a multi-view CNN for the salient view selection of 3D objects, a task that inherently cannot be handled by supervised learning due to the difficulty of data collection. Our unsupervised multi-view CNN branches into two channels which encode the knowledge within each 2D view and within the 3D object respectively, and thus exploits both intra-view and inter-view knowledge of the object. It ends with a new loss layer which formulates the view-object consistency by impelling the two channels to generate consistent classification outcomes. We experimentally demonstrate the superiority of our method over state-of-the-art methods and showcase that it can be used to select salient views of 3D scenes containing multiple objects.
AB - We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition named view-object consistency, which states that a 3D object and its projected 2D views always belong to the same object class. To validate its effectiveness, we design a multi-view CNN for the salient view selection of 3D objects, a task that inherently cannot be handled by supervised learning due to the difficulty of data collection. Our unsupervised multi-view CNN branches into two channels which encode the knowledge within each 2D view and within the 3D object respectively, and thus exploits both intra-view and inter-view knowledge of the object. It ends with a new loss layer which formulates the view-object consistency by impelling the two channels to generate consistent classification outcomes. We experimentally demonstrate the superiority of our method over state-of-the-art methods and showcase that it can be used to select salient views of 3D scenes containing multiple objects.
KW - Multi-view CNN
KW - Unsupervised 3D deep learning
KW - View selection
KW - View-object consistency
UR - http://www.scopus.com/inward/record.url?scp=85097258928&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097258928&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-58529-7_27
DO - 10.1007/978-3-030-58529-7_27
M3 - Conference proceeding (ISBN)
AN - SCOPUS:85097258928
SN - 9783030585280
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 454
EP - 470
BT - Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
A2 - Vedaldi, Andrea
A2 - Bischof, Horst
A2 - Brox, Thomas
A2 - Frahm, Jan-Michael
PB - Springer Science and Business Media Deutschland GmbH
T2 - 16th European Conference on Computer Vision, ECCV 2020
Y2 - 23 August 2020 through 28 August 2020
ER -