Abstract
We present an unsupervised 3D deep learning framework based on a
universally true proposition that we call view-object consistency: a 3D
object and its projected 2D views always belong to the same object class.
To validate its effectiveness, we design a multi-view CNN that instantiates it
for salient view selection of 3D objects, a task that inherently cannot be handled
by supervised learning because sufficient and consistent training data are
difficult to collect. Our unsupervised multi-view CNN branches into two channels,
which encode the knowledge within each 2D view and within the 3D object respectively,
and it exploits both intra-view and inter-view knowledge of the object. It ends
with a new loss layer that formulates view-object consistency by driving
the two channels to produce consistent classification outcomes. We evaluate our
method both qualitatively and quantitatively, demonstrating its superiority over
several state-of-the-art methods. In addition, we show that our method can
be used to select salient views of 3D scenes containing multiple objects, which is
a more challenging and less investigated problem.
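
The abstract does not give the form of the loss layer, so the following is only a minimal sketch of one way a view-object consistency term could be written, assuming the view channel outputs class logits per 2D view and the object channel outputs one vector of class logits for the 3D object. The KL-divergence formulation, the function names `softmax` and `consistency_loss`, and the toy logits are all illustrative assumptions, not the paper's actual layer.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def consistency_loss(view_logits, object_logits, eps=1e-12):
    """Hypothetical consistency term: mean KL(object || view) over all views.

    view_logits:   array of shape (V, C), one row of class logits per 2D view
    object_logits: array of shape (C,), class logits from the 3D-object channel
    Returns the scalar loss and the per-view divergences, shape (V,).
    """
    p_views = softmax(view_logits)      # (V, C) per-view class distributions
    p_obj = softmax(object_logits)      # (C,)  object-level class distribution
    # KL divergence between the object distribution and each view distribution.
    kl = np.sum(p_obj * (np.log(p_obj + eps) - np.log(p_views + eps)), axis=-1)
    return kl.mean(), kl


# Toy data (assumed): 3 views, 3 classes.
view_logits = np.array([[2.0, 0.1, 0.1],
                        [0.3, 0.2, 0.1],
                        [0.1, 1.5, 0.2]])
object_logits = np.array([2.0, 0.1, 0.1])

loss, per_view = consistency_loss(view_logits, object_logits)
print("loss:", loss)
print("per-view divergence:", per_view)
```

In this sketch the loss is zero only when every view predicts the same class distribution as the object channel, which is one way to express the proposition that a 3D object and its views share a class. The per-view divergences are returned only to show how agreement could be inspected for each view; how the paper derives view saliency is not stated in the abstract.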
Original language | English |
---|---|
Pages (from-to) | 454-470 |
Journal | Lecture Notes in Computer Sciences (LNCS) - European Conference on Computer Vision |
Early online date | 13 Nov 2020 |
Publication status | E-pub ahead of print - 13 Nov 2020 |
Event | 16th European Conference on Computer Vision, Glasgow, United Kingdom. Duration: 23 Aug 2020 → 28 Aug 2020. https://eccv2020.eu/ |
Keywords
- Unsupervised 3D Deep Learning
- Multi-View CNN
- View-Object Consistency
- View Selection
Research Groups
- Visual Computing Lab