Supervised deep learning for computer vision often requires large amounts of paired images for network training. This talk proposes virtual 3D environments, built with rendering APIs and game engines such as OpenGL, Unity, and Unreal Engine, as an alternative means of gathering paired images. This approach has two major advantages. First, the paired data is guaranteed to be clean: the images can be photorealistic and pixel-perfect because they are rendered using well-established techniques of physically based rendering, so there is often no need to pre-process or clean them. Second, gathering data becomes far more straightforward than collecting it manually from the real world.
The talk will focus on one specific computer vision application, real-world dehazing, and demonstrate how synthetic images from a 3D rendering system can train a dehazing network to achieve excellent results. A corresponding peer-reviewed research article supports the talk. Techniques for making synthetic images compatible with real-world images will also be discussed.
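To illustrate the kind of paired data the talk refers to, the sketch below (an illustrative assumption, not code from the accompanying article) composites synthetic haze over a clear rendered frame using the standard atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)) with t(x) = exp(−β·d(x)), given the depth buffer that a rendering system or game engine readily provides. The resulting (hazy, clear) pairs are the sort of training data a dehazing network consumes; the function name and parameter defaults here are hypothetical.

```python
import numpy as np

def synthesize_haze(clear_rgb: np.ndarray,
                    depth: np.ndarray,
                    beta: float = 1.0,
                    airlight: float = 0.8) -> np.ndarray:
    """Composite synthetic haze over a haze-free rendered image.

    clear_rgb : H x W x 3 float array in [0, 1], the clear render J(x).
    depth     : H x W float array of scene depth d(x) from the engine's depth buffer.
    beta      : scattering coefficient; larger values give denser haze (assumed default).
    airlight  : global atmospheric light A (assumed default).
    """
    # Transmission falls off exponentially with depth (Beer-Lambert law).
    transmission = np.exp(-beta * depth)[..., None]  # H x W x 1

    # Atmospheric scattering model: I(x) = J(x) * t(x) + A * (1 - t(x))
    hazy = clear_rgb * transmission + airlight * (1.0 - transmission)
    return np.clip(hazy, 0.0, 1.0)
```

In practice, one would export the RGB frame and its depth buffer from the engine for each camera pose, apply a function like the one above with varied beta and airlight values, and store each hazy output alongside its clear source image as a training pair.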