Depth Perception
Cues: 1. binocular disparity 2. prior knowledge 3. light and shadow
Available Datasets:
Loss Functions
SSIM
For self-supervised learning:
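SSIM is the photometric similarity term typically combined with an L1 term in self-supervised reconstruction losses. A minimal single-window sketch (an illustrative simplification; real implementations compute SSIM over local Gaussian windows):

```python
import numpy as np

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Single-window SSIM over whole patches: compares luminance (means),
    # contrast (variances), and structure (covariance) of x and y.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

SSIM of an image with itself is 1; a common photometric loss is `alpha * (1 - ssim) / 2 + (1 - alpha) * |x - y|`.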
Network Architectures
Autoencoder Structure with Skip Connections.
Challenges
Application Scenarios
Innovation Points
- Semi/Un/Self Supervised Learning
- Attention Mechanism
- More Useful Loss Function
- More Efficient Network
- Reinforcement Learning
- Knowledge Distillation
Related Work
- Structure from Motion (SfM)
- Learning single-view 3D from registered 2D views
- Warping-based view synthesis
- Unsupervised/Self-supervised learning from video
SuperDepth: Self-Supervised, Super-Resolved Monocular Depth Estimation
>
[Paper] [[Code]]
Conference: ICRA
Year: 2019
Institute: TRI
Author: Sudeep Pillai, Rares Ambrus, Adrien Gaidon
#
Self-supervised Learning, Monocular, Stereo Imagery
What they did:
- Proposed a sub-pixel convolutional layer extension for depth super-resolution.
- Introduced a differentiable flip-augmentation layer
Why they did:
To solve: high-resolution monocular depth prediction.
Innovation points:
Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
>
[Paper] [Code]
Conference: AAAI
Year: 2019
Institute: Google Brain
Author: Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
#
Unsupervised Learning, Monocular Video
What they did:
- Modeled the 3D motion of individual moving objects in addition to camera ego-motion.
- Introduced an online refinement method that adapts the model on the fly to new environments.
Why they did:
To solve:
Innovation points:
Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
>
[Paper] [Code]
Conference: CVPR
Year: 2018
Institute: Google Brain
Author: Reza Mahjourian, Martin Wicke, Anelia Angelova
#
Unsupervised Learning, Monocular Video
What they did:
- Proposed a novel unsupervised algorithm for depth and ego-motion from monocular video.
- Took the 3D structure of the world into consideration via a 3D loss function.
Why they did:
To solve:
Innovation points:
Single Image Depth Estimation Trained via Depth from Defocus Cues
>
[Paper] [[Code]]
Conference: CVPR
Year: 2019
Institute: Facebook AI
Author: Shir Gur, Lior Wolf
#
Unsupervised Learning, Defocus Cues
Terms:
Structure from Motion (SfM)
Estimating three-dimensional structures from two-dimensional image sequences that may be coupled with local motion signals.
Point Spread Function (PSF)
Describes the response of an imaging system to a point source or point object.
What they did:
- Relied on depth-from-defocus cues instead of multi-view geometry.
- Proposed a novel Point Spread Function (PSF) layer, combined with the successful ASPP architecture.
- Used dense connections and self-attention.
$I$ — all-in-focus image; $D_o$ — depth map; $\rho$ — camera parameter vector (the aperture $A$, the focal length $F$, and the focal depth $D_f$).
There are two networks: $f$ for depth estimation and $g$ for focus rendering; only $f$ is learned.
The learned network $f$ takes $I$ as input and outputs a predicted depth $\bar{D}_o$. Then $I$, $\bar{D}_o$, $\rho$ are fed to $g$, which outputs an estimated rendered focused image $\bar{J}$.
The fixed network $g$ consists of the PSF layer; it takes $I$, $D_o$, $\rho$ as input and outputs a rendered focused image $J$.
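The $f$/$g$ training setup above can be sketched with a toy PSF renderer. The box-blur model and all parameter values here are illustrative assumptions, not the paper's actual PSF layer:

```python
import numpy as np

def psf_render(image, depth, aperture, focal_depth):
    # Toy stand-in for the fixed renderer g: blur each pixel with a box
    # kernel whose radius grows with the circle of confusion
    # |depth - focal_depth| (hypothetical blur model).
    h, w = image.shape
    out = np.zeros_like(image)
    radius = np.clip((aperture * np.abs(depth - focal_depth)).astype(int), 0, 3)
    for y in range(h):
        for x in range(w):
            r = radius[y, x]
            out[y, x] = image[max(0, y - r):y + r + 1,
                              max(0, x - r):x + r + 1].mean()
    return out

rng = np.random.default_rng(0)
I = rng.random((8, 8))                  # all-in-focus image
D_true = rng.random((8, 8))             # depth used to render the target
J = psf_render(I, D_true, 4.0, 0.5)     # target focused image from g
D_pred = rng.random((8, 8))             # depth predicted by the learned f
J_pred = psf_render(I, D_pred, 4.0, 0.5)
loss = np.mean((J_pred - J) ** 2)       # supervises f without depth labels
```

Because $g$ is fixed and differentiable, the rendering loss can be backpropagated through it into $f$.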
- Atrous Convolution
- Atrous Spatial Pyramid Pooling (ASPP)
Why they did:
To solve:
Innovation points:
Digging Into Self-Supervised Monocular Depth Estimation
>
[Paper] [Code]
Conference: ICCV
Year: 2019
Institute:
Author: Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow
#
self-supervised Learning, Monocular
What they did:
- A minimum reprojection loss, designed to robustly handle occlusions.
- A full-resolution multi-scale sampling method that reduces visual artifacts.
- An auto-masking loss to ignore training pixels that violate camera motion assumptions.
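The minimum-reprojection idea above can be sketched in a few lines (a conceptual illustration, not the authors' implementation):

```python
import numpy as np

def min_reprojection_loss(errors):
    # errors: (num_sources, H, W) per-pixel photometric error maps, one
    # per source frame. Taking the per-pixel minimum instead of the
    # average lets an occluded pixel be explained by whichever source
    # view actually sees it, so occlusions stop polluting the loss.
    return np.min(errors, axis=0).mean()
```

With an average, a pixel occluded in one source frame would always contribute a large error; with the minimum it contributes only its best match.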
Learning Depth from Monocular Videos using Direct Methods
>
[Paper] [[Code]]
Conference: CVPR
Year: 2018
Institute:
Author: Chaoyang Wang, José Miguel Buenaposada, Rui Zhu, Simon Lucey
#
unsupervised Learning, monocular
What they did:
- Explain why scale ambiguity in current monocular methods is problematic
- Propose a simple normalization strategy
- Incorporation of a Direct Visual Odometry (DVO) pose predictor
- Considered the geometric relation between camera pose and depth
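The normalization strategy above amounts to removing the arbitrary global scale before the loss is computed; a minimal sketch of that idea:

```python
import numpy as np

def normalize_depth(depth):
    # Dividing the predicted depth map by its mean makes the prediction
    # invariant to a global rescaling, so the photometric loss cannot be
    # trivially reduced by shrinking depth toward zero.
    return depth / depth.mean()
```

Any two predictions differing only by a global scale factor normalize to the same map.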
Unsupervised Learning of Depth and Ego-Motion from Video
>
[Paper] [Code]
Conference: CVPR
Year: 2017
Institute:
Author: Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe
#
unsupervised Learning, monocular
What they did:
- Using single-view depth and multi-view pose networks, with a loss based on warping nearby views to the target using the computed depth and pose.
- An explainability mask that discounts regions violating the model's assumptions
- Overcoming the gradient locality by using a convolutional encoder-decoder architecture with a small bottleneck
Unsupervised Monocular Depth Estimation with Left-Right Consistency
>
[Paper] [Code]
Conference: CVPR
Year: 2017
Institute:
Author: Clément Godard, Oisin Mac Aodha, Gabriel J. Brostow
#
unsupervised Learning, binocular stereo footage
What they did:
- Introduced a novel depth estimation training loss featuring an inbuilt left-right consistency check
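The left-right consistency check can be sketched as a 1-D disparity warp (a simplified nearest-neighbor version; the paper uses differentiable bilinear sampling):

```python
import numpy as np

def lr_consistency_loss(disp_l, disp_r):
    # The left disparity at x should equal the right disparity sampled
    # at the disparity-shifted location x - disp_l(x). Penalizing the
    # difference encourages the two predicted maps to be consistent.
    h, w = disp_l.shape
    xs = np.tile(np.arange(w), (h, 1))
    sample_x = np.clip(np.rint(xs - disp_l).astype(int), 0, w - 1)
    warped = np.take_along_axis(disp_r, sample_x, axis=1)
    return np.abs(disp_l - warped).mean()
```

Consistent left/right disparity maps drive this term to zero.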
Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer
>
[Paper] [[Code]]
Conference: CVPR
Year: 2018
Institute:
Author: Amir Atapour-Abarghouei, Toby P. Breckon
#
supervised Learning, monocular, image style transfer, domain adaptation
What they did:
- Synthetic depth prediction: predict depth from high-quality synthetic training data (supervised learning)
- Domain adaptation via style transfer
Depth map prediction from a single image using a multi-scale deep network.
>
[Paper]
Conference: NIPS
Year: 2014
Institute:
Author: David Eigen, Christian Puhrsch, Rob Fergus
#
supervised Learning
What they did:
- Using a scale-invariant error in addition to more common scale-dependent errors
- Two stacked networks: one that first estimates the global structure of the scene, and a second that refines it using local information
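The scale-invariant error mentioned above has a compact closed form; a sketch following the definition in the paper:

```python
import numpy as np

def scale_invariant_loss(pred, target, lam=0.5):
    # d = log(pred) - log(target). With lam = 1 the loss is fully
    # invariant to a global rescaling of pred, since a constant shift
    # in log space cancels between the two terms.
    d = np.log(pred) - np.log(target)
    return np.mean(d ** 2) - lam * np.mean(d) ** 2
```

The paper uses an intermediate `lam` (0.5) as a compromise between absolute-scale and scale-invariant error.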
Why they did:
To solve: the global scale ambiguity of single-image depth.
Innovation points:
- Collected a dataset (indoor & outdoor) from websites, social media outlets, real estate listings, and shopping sites.
Deeper Depth Prediction with Fully Convolutional Residual Networks
>
[Paper] [Code]
Conference: 3DV
Year: 2016
Institute:
Author: Iro Laina, Christian Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab
#
supervised Learning
What they did:
- A fully convolutional architecture incorporating residual learning.
- Efficient feature-map up-sampling within the network via an up-projection layer.
- The reverse Huber (berHu) loss.
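The reverse Huber (berHu) loss is L1 for small residuals and quadratic for large ones, the opposite of the ordinary Huber loss; a sketch using the threshold rule stated in the paper:

```python
import numpy as np

def berhu_loss(pred, target, c=None):
    # L1 below the threshold c, quadratic above it; the quadratic branch
    # (r^2 + c^2) / (2c) is continuous with the L1 branch at r = c.
    # The paper sets c to 20% of the maximum residual in the batch.
    r = np.abs(pred - target)
    if c is None:
        c = max(0.2 * r.max(), 1e-8)  # epsilon guards the r = 0 case
    return np.where(r <= c, r, (r ** 2 + c ** 2) / (2 * c)).mean()
```

Small residuals get constant L1 gradients (sharper depth boundaries), while large residuals get Huber-style quadratic growth.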