Improving Translation Invariance in Convolutional Neural Networks with Peripheral Prediction Padding
Zero padding is often used in convolutional neural networks to prevent the feature-map size from decreasing with each layer. However, recent studies have shown that zero padding promotes encoding of absolute positional information, which may adversely affect the performance of some tasks. In this work, a novel padding method called Peripheral Prediction Padding (PP-Pad) is proposed, which enables end-to-end training of padding values suited to each task instead of zero padding. Moreover, novel metrics are presented to quantitatively evaluate the translation invariance of a model. Evaluation with these metrics confirmed that the proposed method achieved higher accuracy and better translation invariance than previous methods in a semantic segmentation task.
Implementation of PP-Pad with convolutional layers (h_p×w_p=2×3)
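The idea of predicting padding values from nearby interior pixels can be sketched as follows. This is a simplified, hedged illustration, not the paper's implementation: the learned convolutional predictor is stood in for by a single weight vector `weights` applied to each peripheral h_p×w_p patch (here 2×3, matching the caption above), and the function name `pp_pad` is hypothetical.

```python
import numpy as np

def pp_pad(feature, weights, hp=2, wp=3):
    """Pad a 2-D feature map by one pixel, predicting each padding value
    from an adjacent hp x wp peripheral patch (simplified PP-Pad sketch).
    `weights` is a learned (hp*wp,) vector standing in for the small
    convolutional predictor that would be trained end-to-end."""
    h, w = feature.shape
    out = np.zeros((h + 2, w + 2), dtype=feature.dtype)
    out[1:-1, 1:-1] = feature
    # Predict top/bottom padding rows from patches at the top/bottom edges.
    for j in range(w):
        j0 = min(max(j - wp // 2, 0), w - wp)      # clamp patch inside the map
        top_patch = feature[:hp, j0:j0 + wp]
        bot_patch = feature[-hp:, j0:j0 + wp]
        out[0, j + 1] = float(top_patch.ravel() @ weights)
        out[-1, j + 1] = float(bot_patch.ravel() @ weights)
    # Predict left/right padding columns analogously (patches transposed
    # so the predictor always sees an hp x wp window; corners stay zero).
    for i in range(h):
        i0 = min(max(i - wp // 2, 0), h - wp)
        left_patch = feature[i0:i0 + wp, :hp].T
        right_patch = feature[i0:i0 + wp, -hp:].T
        out[i + 1, 0] = float(left_patch.ravel() @ weights)
        out[i + 1, -1] = float(right_patch.ravel() @ weights)
    return out
```

With averaging weights, each padding value becomes the mean of its peripheral patch; in the actual method the predictor's parameters would be learned jointly with the rest of the network.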
(1) Kensuke Mukai and Takao Yamanaka, "Improving Translation Invariance in Convolutional Neural Networks with Peripheral Prediction Padding," International Conference on Image Processing (ICIP), 2023, Kuala Lumpur, Malaysia. arXiv
Improving NeRF with Height Data for Utilization of GIS Data
Neural Radiance Fields (NeRF) has been applied to various tasks related to representations of 3D scenes. Most studies based on NeRF have focused on small objects, while a few have tried to reconstruct large-scale scenes, although these methods tend to require a large computational cost. To apply NeRF to large-scale scenes, this paper proposes a NeRF-based method that effectively uses height data obtainable from GIS (Geographic Information System). For this purpose, the scene space was divided into multiple objects and a background using the height data, so that they could be represented with separate neural networks. In addition, an adaptive sampling method using the height data is also proposed. As a result, the accuracy of image rendering was improved with faster training.
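One way height data can guide ray sampling is sketched below. This is a hedged illustration of the general idea (concentrating samples near the terrain surface), not the paper's algorithm: the names `height_guided_samples` and `height_fn`, and the `margin` parameter, are all assumptions introduced here.

```python
import numpy as np

def height_guided_samples(origin, direction, height_fn, n=32, t_max=100.0, margin=2.0):
    """Adaptive ray-sampling sketch guided by GIS height data.
    height_fn(x, y) gives the terrain height (hypothetical interface).
    Coarse uniform probes find where the ray first dips below the
    terrain; fine samples are then concentrated around that point."""
    ts = np.linspace(0.0, t_max, n)
    pts = origin + ts[:, None] * direction            # coarse points along the ray
    ground = np.array([height_fn(p[0], p[1]) for p in pts])
    below = pts[:, 2] < ground                        # probes under the terrain
    if not below.any():
        return ts                                     # ray misses the terrain: keep uniform
    t_hit = ts[np.argmax(below)]                      # first surface crossing
    lo, hi = max(t_hit - margin, 0.0), min(t_hit + margin, t_max)
    return np.linspace(lo, hi, n)                     # dense samples near the surface
```

Placing most samples in a narrow band around the terrain surface avoids wasting network evaluations on empty sky or solid ground, which is consistent with the reported speed-up in training.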
(4) Atsuya Nakata and Takao Yamanaka, "Omni-Directional Image Generation using MLPMixer," 25th Symposium on Image Recognition and Understanding (MIRU2022), Himeji, July 26-28, 2022.
(5) Atsuya Nakata, Ryuto Miyazaki, and Takao Yamanaka, "Increasing Diversity of Omni-Directional Images Generated from Single Image using cGAN based on MLPMixer," Asian Conference on Pattern Recognition (ACPR), 2023, Kitakyushu, Japan.
(1) T. Oyama and T. Yamanaka, Influence of Image Classification Accuracy on Saliency Map Estimation, CAAI Transactions on Intelligence Technology, vol. 3, issue 3, 2018, pp. 140-152. arXiv | Models of DenseSal and DPNSal in MIT Saliency Benchmark
(2) T. Oyama and T. Yamanaka, Fully Convolutional DenseNet for Saliency-Map Prediction, ACPR2017. (Best Student Paper Award)
Saliency-map Estimation for Omni-Directional Images