
Keywords: Intelligent Transportation Systems, AI-Based Methods, Computer Vision for Transportation

Abstract: Vehicle Re-identification (Re-ID) aims to retrieve all instances of a query vehicle image present in an image pool. However, viewpoint, illumination, and occlusion variations, along with subtle differences between two unique images, pose a significant challenge to achieving an effective system. In this paper, we focus on enhancing the performance of visual-feature-based Re-ID systems by improving feature embedding quality, and propose (1) an attention-guided hierarchical feature extractor (HFE) that leverages the structure of a backbone CNN to extract coarse- and fine-grained features, and (2) training the proposed network within a hard-negative adversarial framework that generates samples exhibiting extreme variations, encouraging the network to extract important distinguishing features across varying scales. To demonstrate the effectiveness of the proposed framework, we use the VERI-Wild, VRIC, and VeRi-776 datasets, which exhibit extreme intra-class and minute inter-class differences, and achieve state-of-the-art (SoTA) performance. Codes related to this paper are publicly available at.

VIC-Net: Voxelization Information Compensation Network for Point Cloud 3D Object Detection

Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Recognition

Abstract: Voxel-based methods have been widely used in point cloud 3D object detection. These methods transform points into voxels and therefore suffer from information loss during point cloud voxelization. To address this problem, we propose a novel one-stage Voxelization Information Compensation Network (VIC-Net) capable of loss-free feature extraction. The framework consists of a point branch for geometric detail extraction and a voxel branch for efficient proposal generation. First, PointNet++ is adopted to efficiently encode geometric structure features from the raw point clouds. Second, based on the encoded point features, two Point2Voxel (P2V) feature fusion modules, Local P2V and Multi-Scale P2V, are proposed to fuse point features with the voxel backbone; they integrate local detail features and multi-scale semantic contexts, respectively, into the sparse voxel backbone. Third, an auxiliary reconstruction loss is applied to the point branch to explicitly guide the point backbone to capture real geometric structures. In addition, we extend VIC-Net to a two-stage approach, VIC-RCNN, which further exploits the fine geometric features to refine object locations. Experiments on the KITTI dataset demonstrate that VIC-Net outperforms other one-stage methods and that our two-stage VIC-RCNN achieves new state-of-the-art performance.

Semantic Reinforced Attention Learning for Visual Place Recognition

Keywords: Deep Learning for Visual Perception, Recognition, Localization

Abstract: Large-scale visual place recognition (VPR) is inherently challenging because not all visual cues in an image are beneficial to the task. To highlight task-relevant visual cues in the feature embedding, existing attention mechanisms are either based on hand-crafted rules or trained in a purely data-driven manner. To bridge these two approaches, we propose a novel Semantic Reinforced Attention Learning Network (SRALNet), in which the inferred attention benefits from both semantic priors and data-driven fine-tuning. (1) To suppress misleading local features, an interpretable local weighting scheme is proposed based on hierarchical feature distribution. (2) By exploiting the interpretability of the local weighting scheme, a semantic-constrained initialization is proposed so that the local attention can be reinforced by semantic priors. Experiments demonstrate that our method outperforms state-of-the-art techniques on city-scale VPR benchmark datasets.

Towards Efficient Multiview Object Detection with Adaptive Action Prediction

Institute for Infocomm Research, Agency for Science, Technology

Keywords: Deep Learning for Visual Perception, Reinforcement Learning, Recognition

Abstract: Active vision is a desirable perceptual feature for robots. Existing approaches usually make strong assumptions about the task and environment, and are thus less robust and efficient. This study proposes an adaptive view planning approach to boost the efficiency and robustness of active object detection.
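The vehicle Re-ID abstract pairs a hierarchical feature extractor with adversarial hard-negative training. As a rough illustration of the hierarchical part only, here is a minimal NumPy sketch that fuses a coarse mid-level feature map and a fine last-level map into one embedding; the function name, shapes, and mean-pooling choice are assumptions for illustration, not the paper's HFE implementation.

```python
import numpy as np

def hierarchical_embedding(coarse_map, fine_map):
    """Fuse coarse and fine CNN feature maps into a single embedding.

    coarse_map: (C1, H1, W1) mid-level backbone features (coarse-grained cues)
    fine_map:   (C2, H2, W2) last-level backbone features (fine-grained cues)
    Returns an L2-normalised vector of length C1 + C2.
    """
    coarse = coarse_map.mean(axis=(1, 2))   # global average pool each map
    fine = fine_map.mean(axis=(1, 2))
    emb = np.concatenate([coarse, fine])    # concatenate scales
    return emb / (np.linalg.norm(emb) + 1e-12)

# toy activations standing in for two backbone stages
emb = hierarchical_embedding(np.random.rand(256, 28, 28),
                             np.random.rand(512, 7, 7))
print(emb.shape)  # (768,)
```

Embeddings like this are typically compared with cosine distance for retrieval, which is why the vector is L2-normalised.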
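The VIC-Net abstract describes Point2Voxel fusion only at a high level. Below is a minimal sketch, under assumed shapes and a simple mean-pooling scatter, of how per-point features might be fused into a voxel backbone's grid; `point2voxel_fuse`, the indexing scheme, and all dimensions are hypothetical stand-ins, not the paper's Local P2V module.

```python
import numpy as np

def point2voxel_fuse(points, point_feats, voxel_feats, voxel_size, grid_shape):
    """Scatter per-point features into a voxel grid and fuse with voxel features.

    points:      (N, 3) xyz coordinates (assumed non-negative here)
    point_feats: (N, Cp) features from a point branch (e.g. PointNet++)
    voxel_feats: (X, Y, Z, Cv) features from the voxel branch
    Returns fused features of shape (X, Y, Z, Cv + Cp).
    """
    X, Y, Z, _ = voxel_feats.shape
    Cp = point_feats.shape[1]
    scattered = np.zeros((X, Y, Z, Cp))
    counts = np.zeros((X, Y, Z, 1))
    idx = np.floor(points / voxel_size).astype(int)   # voxel index per point
    idx = np.clip(idx, 0, np.array(grid_shape) - 1)
    for (i, j, k), f in zip(idx, point_feats):
        scattered[i, j, k] += f                       # accumulate per voxel
        counts[i, j, k] += 1
    scattered /= np.maximum(counts, 1)                # mean-pool within voxel
    return np.concatenate([voxel_feats, scattered], axis=-1)

# toy data: three points, two fall in the same voxel
points = np.array([[0.5, 0.5, 0.5], [0.6, 0.6, 0.6], [1.5, 0.5, 0.5]])
point_feats = np.arange(12, dtype=float).reshape(3, 4)
voxel_feats = np.zeros((2, 2, 2, 8))
fused = point2voxel_fuse(points, point_feats, voxel_feats, 1.0, (2, 2, 2))
print(fused.shape)  # (2, 2, 2, 12)
```

Channel concatenation is the simplest fusion choice; the compensation idea is just that point-branch features re-inject geometry lost when the raw cloud was quantised into voxels.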
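The SRALNet abstract combines a semantic prior with data-driven attention. A minimal sketch of one plausible reading, attention logits initialised from a semantic mask and additively refined by a learned term, follows; the function, the log-prior initialisation, and the toy scores are assumptions, not the paper's formulation.

```python
import numpy as np

def semantic_attention_pool(local_feats, semantic_prior, learned_logits=None):
    """Aggregate local descriptors with attention initialised from a semantic prior.

    local_feats:    (N, C) local descriptors
    semantic_prior: (N,) prior scores, e.g. 1 for stable classes such as
                    'building', near 0 for distractors such as 'sky' or 'car'
    learned_logits: (N,) optional data-driven refinement added to the prior
    """
    logits = np.log(semantic_prior + 1e-6)        # prior as initial logits
    if learned_logits is not None:
        logits = logits + learned_logits          # fine-tune around the prior
    w = np.exp(logits - logits.max())             # stable softmax weights
    w /= w.sum()
    return (w[:, None] * local_feats).sum(axis=0)

# toy case: keep the first descriptor, suppress the second
local_feats = np.array([[1.0, 1.0], [5.0, 5.0]])
prior = np.array([1.0, 0.0])
pooled = semantic_attention_pool(local_feats, prior)
print(pooled)  # ≈ [1., 1.]
```

Starting the logits at the log of the prior means that with no learned refinement the attention reproduces the semantic mask exactly, while training can still move weights away from it where the data disagrees.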
