Abstract
This paper presents a new GAN-based deep learning framework for estimating absolute-scale-aware depth and ego-motion from monocular images in a completely unsupervised manner. The proposed architecture uses two separate generators to learn the distributions of depth and pose data for a given input image sequence. The generated depth and pose data are then evaluated by a patch-based discriminator using the reconstructed image and its corresponding actual image. The patch-based GAN (PatchGAN) is shown to detect high-frequency local structural defects in the reconstructed image, thereby improving the overall accuracy of depth and pose estimation. Unlike conventional GANs, the proposed architecture trains the whole network on a conditioned version of the generator's input and output. The resulting framework is shown to outperform all existing deep networks in this field, beating the current state-of-the-art method by 8.7% in absolute error and 5.2% in RMSE. To the best of our knowledge, this is the first deep-network-based model to estimate depth and pose simultaneously using a conditional patch-based GAN paradigm. The efficacy of the proposed approach is demonstrated through rigorous ablation studies and an exhaustive performance comparison on the popular KITTI outdoor driving dataset.
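To make the "patch-based" idea concrete, the following minimal sketch computes how large an image region each discriminator score covers and how many scores are produced per image side. The layer stack `PATCHGAN_LAYERS` is an assumption: it is the widely used 70x70 PatchGAN layout from image-to-image translation work, not a configuration stated in this abstract, and the function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical layer stack: (kernel_size, stride, padding) per conv layer.
# Assumed for illustration only; the abstract does not give the paper's layers.
PATCHGAN_LAYERS: List[Tuple[int, int, int]] = [
    (4, 2, 1),
    (4, 2, 1),
    (4, 2, 1),
    (4, 1, 1),
    (4, 1, 1),
]


def receptive_field(layers: List[Tuple[int, int, int]]) -> int:
    """Side length (in input pixels) of the patch seen by one output score."""
    rf, jump = 1, 1
    for kernel, stride, _padding in layers:
        rf += (kernel - 1) * jump  # each layer widens the patch by (k-1)*jump
        jump *= stride             # distance between adjacent outputs, in input pixels
    return rf


def score_grid_size(side: int, layers: List[Tuple[int, int, int]]) -> int:
    """Number of patch scores along one image side of length `side`."""
    for kernel, stride, padding in layers:
        side = (side + 2 * padding - kernel) // stride + 1
    return side


if __name__ == "__main__":
    print("patch side:", receptive_field(PATCHGAN_LAYERS))
    print("scores per 256-pixel side:", score_grid_size(256, PATCHGAN_LAYERS))
```

Under these assumed layers, the discriminator judges overlapping local patches rather than the whole image at once, which is consistent with the abstract's claim that it is sensitive to local, high-frequency defects in the reconstructed image.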
Original language | English |
---|---|
Pages | 5677-5684 |
Number of pages | 8 |
Publication status | Published - 16 Aug 2019 |
Event | International Joint Conference on Artificial Intelligence, Macao, China. Duration: 10 Aug 2019 → 16 Aug 2019. https://www.ijcai.org/proceedings/2019/0787.pdf |
Conference
Conference | International Joint Conference on Artificial Intelligence |
---|---|
Abbreviated title | IJCAI |
Country/Territory | China |
City | Macao |
Period | 10/08/19 → 16/08/19 |
Keywords
- Deep learning, Depth Estimation from Images, GANs