Two papers from RLLAB are accepted to ICCV 2019

[2019.07.23]

The following papers have been accepted to the International Conference on Computer Vision (ICCV 2019):

  • Deep Elastic Networks with Model Selection for Multi-Task Learning by Chanho Ahn, Eunwoo Kim, and Songhwai Oh
    • Abstract: In this work, we consider the problem of instance-wise dynamic network model selection for multi-task learning. To this end, we propose an efficient approach to exploit a compact but accurate model in a backbone architecture for each instance of all tasks. The proposed method consists of an estimator and a selector. The estimator is based on a backbone architecture and structured hierarchically. It can produce multiple different network models of different configurations in a hierarchical structure. The selector chooses a model dynamically from a pool of candidate models given an input image. The selector is a small-size network consisting of a few layers, which estimates a probability distribution over the candidate models when an input instance of a task is given. To overcome the difficulty of learning the selector over a large discrete search space, we introduce a sampling-based learning strategy. Both estimator and selector are trained in a single framework in conjunction with the sampling scheme. We demonstrate the proposed approach for several image classification tasks compared to existing approaches performing the model selection or learning multiple tasks. Experimental results show that our approach gives not only outstanding performance compared to other competitors but also the versatility to perform instance-wise model selection for multiple tasks.
  • Unsupervised 3D Reconstruction Networks by Geonho Cha, Minsik Lee, and Songhwai Oh
    • Abstract: In this paper, we propose 3D unsupervised reconstruction networks (3D-URN), a 3D structure reconstruction network which reconstructs the 3D structures of instances in a given object category from their 2D feature points. 3D-URN consists of a 3D shape reconstructor and a rotation estimator, which are trained in a fully-unsupervised manner incorporating the proposed unsupervised loss functions. The role of the 3D shape reconstructor is to reconstruct the 3D shape of an instance from its 2D feature points, and the rotation estimator infers the camera pose. After training, 3D-URN can infer the 3D structure of an unseen instance in the same category, which is not possible in the conventional schemes of non-rigid structure from motion and structure from category. The experimental result shows the state-of-the-art performance, which demonstrates the effectiveness of the proposed method.
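
For readers unfamiliar with the selector idea in the first abstract, the following is a minimal sketch of choosing one candidate model by sampling from a probability distribution. It is an illustration only, not the authors' implementation: the softmax normalization, the function names, and the score values are all assumptions, and in the paper the scores would come from a small selector network given an input image.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary real-valued scores into a probability distribution."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_model(scores, rng):
    """Sample the index of one candidate model according to its probability.

    `scores` is a hypothetical stand-in for the selector output for one
    input instance; it is not taken from the paper.
    """
    probs = softmax(scores)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


# Hypothetical scores for a pool of three candidate models.
scores = [0.2, 1.5, -0.3]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
index, probs = select_model(scores, rng)
print("selected candidate:", index)
print("probabilities:", [round(p, 3) for p in probs])
```

Sampling, rather than always taking the most probable candidate, is one simple way to explore a large discrete set of models during training; the abstract states only that a sampling-based learning strategy is used, so the details here should be read as assumptions.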