CV_Depth-Estimation

Depth Perception

Cues for depth perception: 1. Binocular parallax 2. Prior knowledge 3. Lighting and shadows

Available Datasets:

Loss Functions

SSIM (structural similarity index) — commonly used as part of the photometric loss for self-supervised learning.
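A minimal NumPy sketch of the idea, assuming images scaled to [0, 1]. Real implementations compute SSIM over sliding Gaussian windows; a single global value keeps the sketch short. Self-supervised methods typically mix SSIM with an L1 term (alpha = 0.85 is a common choice):

```python
import numpy as np

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified global SSIM for two images in [0, 1].
    Production code uses local Gaussian windows instead of
    one global statistic per image."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def photometric_loss(x, y, alpha=0.85):
    """SSIM + L1 mix widely used as the appearance term
    in self-supervised depth losses."""
    return alpha * (1 - ssim(x, y)) / 2 + (1 - alpha) * np.abs(x - y).mean()
```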

Network Architectures

Autoencoder Structure with Skip Connections.

Challenges

Application Scenarios

Innovation Points

  • Semi-/Un-/Self-Supervised Learning
  • Attention Mechanisms
  • More Effective Loss Functions
  • More Efficient Networks
  • Reinforcement Learning
  • Knowledge Distillation
  • Structure from Motion (SfM)
  • Learning single-view 3D from registered 2D views
  • Warping-based view synthesis
  • Unsupervised/Self-supervised learning from video


SuperDepth: Self-Supervised, Super-Resolved Monocular Depth Estimation

>[Paper] [Code]
Conference: ICRA
Year: 2019
Institute: TRI
Author: Sudeep Pillai, Rares Ambrus, Adrien Gaidon
#Self-supervised Learning, Monocular, Stereo Imagery

What they did:

  • Proposed a sub-pixel convolutional layer extension for depth super-resolution.
  • Introduced a differentiable flip-augmentation layer.
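The sub-pixel layer builds on the pixel-shuffle (depth-to-space) rearrangement of Shi et al.; a minimal NumPy sketch of just that rearrangement follows (the full layer pairs it with a preceding convolution that produces the extra channels):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Depth-to-space: (C*r^2, H, W) -> (C, H*r, W*r).
    A convolution before this layer produces r^2 channels per
    output channel; shuffling them into the spatial dimensions
    yields a super-resolved feature map."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)       # split the channel dimension
    x = x.transpose(0, 3, 1, 4, 2)     # -> (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)  # interleave into space
```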

Why they did:
To solve: high resolution monocular depth prediction.

Innovation points:

Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos

>[Paper] [Code]
Conference: AAAI
Year: 2019
Institute: Google Brain
Author: Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
#Unsupervised Learning, Monocular Video

What they did:

  • Modeled the 3D motion of individual moving objects in addition to camera ego-motion.
  • Introduced an online refinement method that adapts the model at inference time.

Why they did:
To solve:

Innovation points:

Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints

>[Paper] [Code]
Conference: CVPR
Year: 2018
Institute: Google Brain
Author: Reza Mahjourian, Martin Wicke, Anelia Angelova
#Unsupervised Learning, Monocular Video

What they did:

  • Proposed a novel unsupervised algorithm for depth and ego-motion from monocular video.
  • Took the 3D structure of the scene into account via a 3D loss function that aligns consecutive point clouds.

Why they did:
To solve:

Innovation points:

Single Image Depth Estimation Trained via Depth from Defocus Cues

>[Paper] [Code]
Conference: CVPR
Year: 2019
Institute: Facebook AI
Author: Shir Gur, Lior Wolf
#Unsupervised Learning, Defocus Cues

Terms:

  • Structure from Motion (SfM)
    Estimating three-dimensional structures from two-dimensional image sequences that may be coupled with local motion signals.

  • Point Spread Function (PSF)
    Describes the response of an imaging system to a point source or point object.

What they did:

  • Relied on shape from defocus instead of multiple-view geometry.
  • Proposed a novel Point Spread Function (PSF) layer, combined with the successful ASPP architecture.
  • Used dense connections and self-attention.

$I$ — all-in-focus image; $D_o$ — depth map; $\rho$ — camera-parameter vector (the aperture $A$, the focal length $F$, and the focal depth $D_f$).
There are two networks: $f$ for depth estimation and $g$ for focus rendering; only $f$ is learned.
The learned network $f$ takes $I$ as input and outputs a predicted depth $\bar{D}_o$. Then $I$, $\bar{D}_o$, and $\rho$ are fed to $g$, which outputs an estimated rendered focused image $\bar{J}$.
The fixed network $g$, built around the PSF layer, takes $I$, $D_o$, $\rho$ as input and outputs a rendered focused image $J$.
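The fixed rendering side can be sketched under the standard thin-lens model, which relates each pixel's blur (circle-of-confusion) size to its depth and the camera parameters $\rho$; the PSF layer then blurs each pixel of $I$ by that amount to render $J$. The formula below is the textbook thin-lens relation, used here as an illustrative assumption rather than code from the paper:

```python
import numpy as np

def coc_size(depth, aperture, focal_length, focal_depth):
    """Per-pixel circle-of-confusion size from the thin-lens model,
    given a depth map and camera parameters rho = (A, F, D_f).
    Pixels at the focal depth get zero blur; blur grows as depth
    departs from it. A depth-dependent blur kernel of this size
    would then render the focused image from the all-in-focus one."""
    return (aperture * np.abs(depth - focal_depth) / depth
            * focal_length / (focal_depth - focal_length))
```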

  • Atrous Convolution
  • Atrous Spatial Pyramid Pooling (ASPP)
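A minimal 1-D sketch of atrous (dilated) convolution with 'valid' padding: the kernel taps are spaced `rate` samples apart, enlarging the receptive field without extra parameters. Frameworks implement the 2-D version, and ASPP runs several rates in parallel and fuses the results:

```python
import numpy as np

def atrous_conv1d(x, kernel, rate):
    """1-D atrous convolution, 'valid' padding. With rate=1 this is
    ordinary correlation; larger rates skip samples between taps."""
    k = len(kernel)
    span = (k - 1) * rate + 1            # effective kernel extent
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * rate] for j in range(k))
    return out
```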

Why they did:
To solve:

Innovation points:

Digging Into Self-Supervised Monocular Depth Estimation

>[Paper] [Code]
Conference: ICCV
Year: 2019
Institute:
Author: Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow
#self-supervised Learning, Monocular

What they did:

  • A minimum reprojection loss, designed to robustly handle occlusions.
  • A full-resolution multi-scale sampling method that reduces visual artifacts.
  • An auto-masking loss to ignore training pixels that violate camera motion assumptions.
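The minimum-reprojection idea can be sketched as follows (NumPy, with plain L1 standing in for the paper's SSIM+L1 photometric error): a pixel occluded in one source view is usually visible in another, so taking the per-pixel minimum over source views, rather than the average, keeps occlusions from polluting the loss.

```python
import numpy as np

def min_reprojection_loss(target, reprojected):
    """target: (H, W, 3); reprojected: (V, H, W, 3) stack of source
    frames warped into the target view. Per-pixel L1 error per view,
    minimum over views, mean over pixels."""
    errors = np.abs(reprojected - target[None]).mean(axis=-1)  # (V, H, W)
    return errors.min(axis=0).mean()
```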

Learning Depth from Monocular Videos using Direct Methods

>[Paper] [Code]
Conference: CVPR
Year: 2018
Institute:
Author: Chaoyang Wang, José Miguel Buenaposada, Rui Zhu, Simon Lucey
#unsupervised Learning, monocular

What they did:

  • Explained why scale ambiguity in current monocular methods is problematic.
  • Proposed a simple normalization strategy.
  • Incorporated a Direct Visual Odometry (DVO) pose predictor.
  • Considered the geometric relation between camera pose and depth.

Unsupervised Learning of Depth and Ego-Motion from Video

>[Paper] [Code]
Conference: CVPR
Year: 2017
Institute:
Author: Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe
#unsupervised Learning, monocular

What they did:

  • Used single-view depth and multi-view pose networks, with a loss based on warping nearby views to the target using the computed depth and pose.
  • An explainability mask that discounts pixels violating the model's assumptions.
  • Overcame gradient locality by using a convolutional encoder-decoder architecture with a small bottleneck.
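The masked view-synthesis loss can be sketched as below, assuming an L1 photometric error. The regularizer is the paper's cross-entropy against a constant all-ones label, which stops the trivial solution of predicting a zero mask everywhere; `reg_weight` is an illustrative value, not taken from the paper:

```python
import numpy as np

def masked_photometric_loss(target, warped, mask, reg_weight=0.2):
    """target, warped: (H, W, 3); mask: (H, W) explainability weights
    in (0, 1]. The mask learns to down-weight pixels that violate the
    static-scene assumption (moving objects, occlusions)."""
    photo = (mask * np.abs(warped - target).mean(axis=-1)).mean()
    reg = -np.log(np.clip(mask, 1e-6, 1.0)).mean()  # pushes mask toward 1
    return photo + reg_weight * reg
```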

Unsupervised Monocular Depth Estimation with Left-Right Consistency

>[Paper] [Code]
Conference: CVPR
Year: 2017
Institute:
Author: Clément Godard, Oisin Mac Aodha, Gabriel J. Brostow
#unsupervised Learning, binocular stereo footage

What they did:

  • Introduced a novel depth-estimation training loss featuring an inbuilt left-right consistency check.
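The consistency term can be sketched in 1-D per image row: project the right disparity map into the left view using the left disparities and penalise the difference. This sketch uses nearest-neighbour sampling where the paper uses bilinear, and the sign convention depends on the rectification setup:

```python
import numpy as np

def lr_consistency_loss(disp_left, disp_right):
    """disp_left, disp_right: 1-D disparity rows in pixels.
    For each left pixel x, sample the right disparity at x - d_l(x)
    and penalise disagreement with d_l(x)."""
    w = len(disp_left)
    idx = np.clip(np.round(np.arange(w) - disp_left).astype(int), 0, w - 1)
    return np.abs(disp_left - disp_right[idx]).mean()
```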

Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer

>[Paper] [Code]
Conference: CVPR
Year: 2018
Institute:
Author: Amir Atapour-Abarghouei, Toby P. Breckon
#supervised Learning, monocular, image style transfer, domain adaptation

What they did:

  • Synthetic depth prediction: predict depth from high-quality synthetic depth training data (supervised learning).
  • Domain adaptation via style transfer.

Depth map prediction from a single image using a multi-scale deep network.

>[Paper]
Conference: NIPS
Year: 2014
Institute:
Author: David Eigen, Christian Puhrsch, Rob Fergus
#supervised Learning

What they did:

  • Used a scale-invariant error in addition to more common scale-dependent errors.
  • Two-stage network: one stage first estimates the global structure of the scene, then a second refines it using local information.
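The scale-invariant error can be written directly from the paper's definition: with d the element-wise log difference and lam = 1, the second term cancels any uniform shift in log space, so multiplying the prediction by a global scale leaves the error unchanged (the paper trains with lam = 0.5 as a compromise with absolute-scale error):

```python
import numpy as np

def scale_invariant_error(pred, gt, lam=1.0):
    """Eigen et al.'s scale-invariant log error:
    mean(d^2) - lam * mean(d)^2, where d = log(pred) - log(gt)."""
    d = np.log(pred) - np.log(gt)
    return (d ** 2).mean() - lam * d.mean() ** 2
```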

Why they did:
To solve: the global scale ambiguity inherent in monocular depth estimation.

Innovation points:

  • Collected indoor and outdoor datasets from websites, social media outlets, real estate listings, and shopping sites.

Deeper Depth Prediction with Fully Convolutional Residual Networks

>[Paper] [Code]
Conference: 3DV
Year: 2016
Institute:
Author: Iro Laina, Christian Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab
#supervised Learning

What they did:

  • A fully convolutional architecture encompassing residual learning.
  • Efficient in-network feature-map up-sampling via an up-projection layer.
  • Reverse Huber (berHu) loss.
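The berHu loss is L1 for small residuals and L2-like beyond a threshold c, so large errors get stronger gradients while small ones keep L1's robustness. A NumPy sketch, with c set per batch to 20% of the maximum absolute residual as in the paper:

```python
import numpy as np

def berhu_loss(pred, gt):
    """Reverse Huber (berHu): |r| where |r| <= c,
    (r^2 + c^2) / (2c) otherwise, with c = 0.2 * max|r|.
    The two branches meet continuously at |r| = c."""
    r = np.abs(pred - gt)
    c = 0.2 * r.max()
    if c == 0:                       # perfect prediction: avoid 0/0
        return 0.0
    return np.where(r <= c, r, (r ** 2 + c ** 2) / (2 * c)).mean()
```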