Detailed conference program

July 22 (Saturday)

Registration (16:00-17:00)

Welcome Reception (17:00-)

July 23 (Sunday)

Registration (9:00-17:45)

Opening (9:30-10:00)

IAPR Invited Talk 1 (10:00-11:00)

Chair: Kyoko Sudo (Toho University)

  1. Prof. Dima Damen (University of Bristol)
    Opportunities in Egocentric Video Understanding

Oral 1-1: Image Analysis and Perception (11:15-12:15)

Chair: Shin'ichi Satoh (National Institute of Informatics)

  1. O1-1-1 Ching-Ching Yang (National Cheng Kung University); Wei-Ta Chu (National Cheng Kung University)*; Shiv Ram Dubey (Indian Institute of Information Technology Allahabad)
    Weakly-Supervised Deep Image Hashing based on Cross-Modal Transformer
  2. O1-1-2 Jihyun Lee (KAIST); Hangil Park (KAIST)*; Yongmin Seo (Hanyang University); Taewon Min (KAIST); Joodong Yun (Samsung Display); Jaewon Kim (Samsung Display); Tae-Kyun (T-K) Kim (KAIST/Imperial College London)
    Contrastive Knowledge Distillation for Anomaly Detection in Multi-Illumination/Focus Display Images
  3. O1-1-3 Daiki Mushiake (TTI-J); Kentaro Otomo (TTI-J); Chihiro Nakatani (TTI-J)*; Norimichi Ukita (TTI-J)
    Shape Preservation in Image Style Transfer for Gaze Estimation

Poster 1 (13:30-15:30)

  1. P1-01 Hirotaka Hachiya (Wakayama University)*; Yuto Yoshimura (Wakayama University)
    Combining Static Specular Flow and Highlight with Deep Features for Specular Surface Detection
  2. P1-02 Paola Barra (Parthenope University of Naples)*; Alessia Auriemma Citarella (University of Salerno); Giosué Orefice (Università di Napoli Parthenope); Modesto Fernando Castrillón Santana (Universidad de Las Palmas de Gran Canaria); Angelo Ciaramella (University of Naples Parthenope)
    LOTS: Litter On The Sand for Litter Segmentation
  3. P1-03 Yeongnam Chae (Rakuten Institute of Technology)*; Poulami Raha (Rakuten Institute of Technology); Mijung Kim (Rakuten Institute of Technology); Bjorn Stenger (Rakuten Institute of Technology)
    Age Prediction From Face Images Via Contrastive Learning
  4. P1-04 Masato Tada (Yamaguchi University)*; Xian-Hua Han (Yamaguchi University)
    Bottleneck Transformer Model with Channel Self-attention for Skin Lesion Classification
  5. P1-05 Jaesung Yang (Hitachi, Ltd.)*; Daisuke Hagihara (Hitachi, Ltd.); Kiyoto Ito (Hitachi, Ltd.); Nobuhiro Chihara (Hitachi, Ltd.)
    Safe Height Estimation of Deformable Objects for Picking Robots by Detecting Multiple Potential Contact Points
  6. P1-06 Fuzhen Cai (Southeast University); Siyu Xia (Southeast University)*
    Mixed Distillation for Unsupervised Anomaly Detection
  7. P1-07 Koji Takeda (Tokyo Metropolitan Industrial Technology Research Institute)*; Kanji Tanaka (University of Fukui); Yoshimasa Nakamura (Tokyo Metropolitan Industrial Technology Research Institute)
    Lifelong Change Detection: Continuous Domain Adaptation for Small Object Change Detection in Everyday Robot Navigation
  8. P1-08 Martin Knoche (Technical University of Munich)*; Gerhard Rigoll (Technical University of Munich)
    Tackling Face Verification Edge Cases: In-Depth Analysis and Human-Machine Fusion Approach
  9. P1-09 Sho Harada (Yamaguchi University)*; Xian-Hua Han (Yamaguchi University)
    A Hybrid Wheat Head Detection Model with Incorporated CNN and Transformer
  10. P1-10 Vijay John (RIKEN)*; Yasutomo Kawanishi (RIKEN)
    Combining Knowledge Distillation and Transfer Learning for Sensor Fusion in Visible and Thermal Camera-based Person Classification
  11. P1-11 Chih-Yi Chiu (National Chiayi University)*; Yu-Hsien Chen (National Chiayi University)
    MFFPN: An Anchor-free Method for Patent Drawing Object Detection
  12. P1-12 Hiroaki Masuzawa (Toyohashi University of Technology); Chuo Nakano (Toyohashi University of Technology); Jun Miura (Toyohashi University of Technology)*
    CG-based Dataset Generation and Adversarial Image Conversion for Deep Cucumber Recognition
  13. P1-14 Banri Kojima (Nagoya University); Takahiro Komamizu (Nagoya University)*; Yasutomo Kawanishi (RIKEN); Keisuke Doman (Chukyo University); Ichiro Ide (Nagoya University)
    Image Impression Estimation by Clustering People with Similar Tastes
  14. P1-15 Kuber Reddy Gorantla (Siemens Technology)*; Aditi Roy (Siemens Corporation)
    Generalizable Solar Irradiation Prediction using Large Transformer Models with Sky Imagery
  15. P1-16 Yucheng Zhang (Panasonic Automotive Systems)*; Masaki Fukuda (Panasonic); Yasunori Ishii (Panasonic); Kyoko Oshima (Panasonic); Takayoshi Yamashita (Chubu University)
    PALF: Pre-Annotation and Camera-LiDAR Late Fusion for the Easy Annotation of Point Clouds
  16. P1-17 Aditya Abhiram Vadduri (Manipal Institute of Technology); Anagh Benjwal (Manipal Institute of Technology)*; Prajwal Uday (Manipal Institute of Technology); Abhishek V Pai (Manipal Institute of Technology)
    Safe Landing Zone Detection for UAVs using Image Segmentation and Super Resolution
  17. P1-18 Assil Jaby (Bahcesehir University); Md Baharul Islam (Bahcesehir University)*; Md Atiqur Rahman Ahad (University of East London)
    ASD-EVNet: An Ensemble Vision Network based on Facial Expression for Autism Spectrum Disorder Recognition
  18. P1-20 Chu-Chi Chiu (National Tsing Hua University); Hsuan-Kung Yang (National Tsing Hua University); Hao-Wei Chen (National Tsing Hua University); Yu-Wen Chen (National Tsing Hua University); Chun-Yi Lee (National Tsing Hua University)*
    ViTVO: Vision Transformer Based Visual Odometry with Attention Supervision
  19. P1-21 David Freire Obregon (Universidad de Las Palmas de Gran Canaria)*; Javier Lorenzo-Navarro (Universidad de Las Palmas de Gran Canaria); Oliverio J. Santana (University of Las Palmas de Gran Canaria); Daniel Hernandez-Sosa (University of Las Palmas de Gran Canaria); Modesto Castrillón-Santana (Universidad de Las Palmas de Gran Canaria)
    An X3D Neural Network Analysis for Runner's Performance Assessment in a Wild Sporting Environment
  20. P1-22 Yuan Li (Waseda University)*; Tingting Hu (Panasonic); Ryuji Fuchikami (Panasonic); Takeshi Ikenaga (Waseda University)
    Grid Sample Based Temporal Iteration and Compactness-coefficient Distance for High Frame and Ultra-low Delay SLIC Segmentation System
  21. P1-23 Carlos Victorino Padeiro (Nagoya University)*; Takahiro Komamizu (Nagoya University); Ichiro Ide (Nagoya University)
    Towards Achieving Lightweight Deep Neural Network for Precision Agriculture with Maize Disease Detection
  22. P1-24 Tomohiro Fujita (RIKEN)*; Yasutomo Kawanishi (RIKEN)
    Human Pose Prediction by Progressive Generation in Multi-scale Frequency Domain
  23. P1-25 G Swetha (Indian Institute of Technology Hyderabad)*; Rajeshreddy Datla (Advanced Data Processing Research Institute (ADRIN)); Chalavadi Vishnu (Indian Institute of Technology Hyderabad); Krishna Mohan C. (Indian Institute of Technology Hyderabad)
    MS-VACSNet: A Network for Multi-scale Volcanic Ash Cloud Segmentation in Remote Sensing Images

Oral 1-2: Neural Architectures and Continual Learning (15:30-16:30)

Chair: Yoichi Sato (The University of Tokyo)

  1. O1-2-1 Yutaka Yoshihama (Panasonic Automotive Systems Co., Ltd.)*; Kenichi Yadani (Panasonic Automotive Systems Co., Ltd.); Shota Isobe (Panasonic Automotive Systems Co., Ltd.)
    Hardware-Aware Zero-shot Neural Architecture Search
  2. O1-2-2 Sheng-Kai Huang (National Chung Hsing University); Chun-Rong Huang (National Cheng Kung University)*
    Transformer with Task Selection for Continual Learning
  3. O1-2-3 Chihiro Nakatani (TTI-J)*; Hiroaki Kawashima (University of Hyogo); Norimichi Ukita (TTI-J)
    Joint Learning with Group Relation and Individual Action

Oral 1-3: Computer Vision Techniques and Applications (16:45-17:45)

Chair: Chun-Rong Huang (National Cheng Kung University)

  1. O1-3-1 Haorong Jiang (Waseda University)*; Fengshan Zhao (Waseda University); Junda Liao (Nanjing University; Waseda University); Qin Liu (Nanjing University); Takeshi Ikenaga (Waseda University)
    Multi-prior based Multi-scale Condition Network for Single-image HDR Reconstruction
  2. O1-3-2 Yudai Hirose (Kagoshima University)*; Satoshi Ono (Kagoshima University)
    Black-box Adversarial Attack against Visual Interpreters for Deep Neural Networks
  3. O1-3-3 Yu-Hui Huang (Yuan Ze University)*; Marc Proesmans (KU Leuven); Luc Van Gool (KU Leuven)
    Padding Investigations for CNNs in Scene Parsing Tasks

July 24 (Monday)

Registration (9:15-18:15)

IAPR Invited Talk 2 (9:30-10:30)

Chair: Wei-Ta Chu (National Cheng Kung University)

  1. Prof. CC Jay Kuo (University of Southern California)
    On the 2nd AI Wave: Toward Interpretable, Reliable, and Sustainable AI

Technical Event (10:45-11:30)

Chair: Norimichi Ukita (Toyota Technological Institute)

  1. TE-1 Yuki Kondo (Toyota Motor Corporation)*; Norimichi Ukita (TTI-J); Takayuki Yamaguchi (Iwate Agricultural Research Center); Hao-Yu Hou (National Tsing Hua University); Mu-Yi Shen (National Tsing Hua University); Chia-Chi Hsu (National Tsing Hua University); En-Ming Huang (National Tsing Hua University); Yu-Chen Huang (National Tsing Hua University); Yu-Cheng Xia (National Tsing Hua University); Chien-Yao Wang (Institute of Information Science, Academia Sinica); Chun-Yi Lee (National Tsing Hua University); Da Huo (Nagoya University); Marc A. Kastner (Kyoto University); Tingwei Liu (Nagoya University); Yasutomo Kawanishi (RIKEN); Takatsugu Hirayama (University of Human Environments); Takahiro Komamizu (Nagoya University); Ichiro Ide (Nagoya University); Yosuke Shinya (Independent researcher); Xinyao Liu (Xi'an Jiaotong University); Guang Liang (Xi'an Jiaotong University); Syusuke Yasui (Space Shift Inc.)
    MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results
  2. TE-2 Hao-Yu Hou (National Tsing Hua University)*; Mu-Yi Shen (National Tsing Hua University); Chia-Chi Hsu (National Tsing Hua University); En-Ming Huang (National Tsing Hua University); Yu-Chen Huang (National Tsing Hua University); Yu-Cheng Xia (National Tsing Hua University); Chien-Yao Wang (Academia Sinica); Chun-Yi Lee (National Tsing Hua University)
    Ensemble Fusion for Small Object Detection
  3. TE-3 Da Huo (Nagoya University); Marc A. Kastner (Kyoto University); Tingwei Liu (Nagoya University)*; Yasutomo Kawanishi (RIKEN); Takatsugu Hirayama (University of Human Environments); Takahiro Komamizu (Nagoya University); Ichiro Ide (Nagoya University)
    Small Object Detection for Birds with Swin Transformer
  4. TE-4 Yosuke Shinya (Independent researcher)*
    BandRe: Rethinking Band-Pass Filters for Scale-wise Object Detection Evaluation
  5. TE-5 Kosuke Shigematsu (National Institute of Technology, Oita College)*
    Enhanced YOLOv7 for Small Bird Detection: Increasing Resolution, Test-time Augmentation and Weighted Boxes Fusion
  6. TE-6 Q&A
  7. TE-7 Organizers and Winners
    Award Ceremony for the Challenge

Oral 2-1: Human-Object Interaction Detection and Pose Estimation (11:45-12:45)

Chair: Yeongnam Chae (Rakuten Institute of Technology)

  1. O2-1-1 Junwen Chen (The University of Electro-Communications)*; Keiji Yanai (The University of Electro-Communications)
    QAHOI: Query-based Anchors for Human-Object Interaction Detection
  2. O2-1-2 Zhe Xu (Waseda University)*; Yuan Li (Waseda University); Yuhong Li (Waseda University); Songlin Du (Southeast University); Takeshi Ikenaga (Waseda University)
    Hierarchical Spatio-temporal Neural Network with Displacement Based Refinement for Monocular Head Pose Prediction
  3. O2-1-3 Théo PetitJean (University of Bourgogne Franche-Comte)*; Zongwei Wu (University of Bourgogne Franche-Comte); Olivier Laligant (University of Burgundy); Cedric Demonceaux (University of Bourgogne Franche-Comte)
    QaQ: Robust 6D Pose Estimation via Quality-Assessed RGB-D Fusion

Poster 2 (14:00-16:00)

  1. P2-01 James-Andrew Sarmiento (University of the Philippines Diliman)*; Liushifeng Chen (Adravision); Prospero Naval Jr. (University of the Philippines Diliman)
    Multi-class Semantic Segmentation of Tooth Pathologies and Anatomical Structures on Bitewing and Periapical Radiographs
  2. P2-02 Shekhor Chanda (University of Manitoba)*; Yang Wang (Concordia University)
    Dynamic Transfer for Domain Adaptation in Crowd Counting
  3. P2-03 Shuki Shimizu (Nagoya Institute of Technology); Toru Tamaki (Nagoya Institute of Technology)*
    Joint Learning of Images and Videos with a Single Vision Transformer
  4. P2-04 Zhihan Zhuang (Waseda University)*; Yuan Li (Waseda University); Songlin Du (Southeast University); Takeshi Ikenaga (Waseda University)
    Intra-frame Skeleton Constraints Modeling and Grouping Strategy Based Multi-scale Graph Convolution Network for 3D Human Motion Prediction
  5. P2-05 Luis Acevedo-Bringas (Instituto Politecnico Nacional); Gibran Benitez-Garcia (The University of Electro-Communications)*; Jesus Olivares-Mercado (Instituto Politecnico Nacional); Hiroki Takahashi (The University of Electro-Communications)
    YOLOv5 with Mixed Backbone for Efficient Spatio-temporal Hand Gesture Localization and Recognition
  6. P2-06 Yasuto Nagase (NEC)*; Yasunori Babazaki (NEC); Katsuhiko Takahashi (NEC)
    Multi-plane Projection for Extending Perspective Image Object Detection Models to 360° Images
  7. P2-07 Agustin Castillo-Munguia (Instituto Politecnico Nacional); Gibran Benitez-Garcia (The University of Electro-Communications)*; Jesus Olivares-Mercado (Instituto Politecnico Nacional); Hiroki Takahashi (The University of Electro-Communications)
    Diabetic Retinopathy Grading based on a Sparse Network Fusion of Heterogeneous ConvNeXt Models with Category Attention
  8. P2-08 Shimpei Kobayashi (Ritsumeikan University)*; Akiyoshi Hizukuri (Ritsumeikan University); Ryohei Nakayama (Ritsumeikan University)
    Video Anomaly Detection Using Encoder-Decoder Networks with Video Vision Transformer and Channel Attention Blocks
  9. P2-09 Katarina Tolja (University of Zagreb)*; Zoran Kalafatic (University of Zagreb); Marko Subasic (University of Zagreb); Sven Loncaric (University of Zagreb)
    Enhancing Retail Product Recognition: Fine-grained Bottle Size Classification
  10. P2-10 Takuhiro Okada (University of Tsukuba)*; Yuantian Huang (University of Tsukuba); Guoqing Hao (University of Tsukuba); Satoshi Iizuka (University of Tsukuba); Kazuhiro Fukui (University of Tsukuba)
    Low-level Feature Aggregation Networks for Disease Severity Estimation of Coffee Leaves
  11. P2-11 Taiki Arakane (Kyushu Institute of Technology); Chihiro Kai (Kyushu Institute of Technology); Takeshi Saitoh (Kyushu Institute of Technology)*
    Can You Read Lips with a Masked Face?
  12. P2-12 Mariona Carós (Universitat de Barcelona)*; Ariadna Just (Institut Cartogràfic i Geològic de Catalunya); Santi Seguí (Universitat de Barcelona); Jordi Vitria (Universitat de Barcelona)
    Self-supervised Pre-training Boosts Semantic Scene Segmentation on LiDAR Data
  13. P2-13 Ahmed A Sabir (Universitat Politècnica de Catalunya)*
    Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned
  14. P2-14 Hugo Bulzomi (University of Côte d'Azur)*; Amélie Gruel (I3S, CNRS); Jean Martinet (I3S, University of Côte d'Azur, CNRS); Takeshi Fujita (AISIN); Yuta Nakano (IMRA Europe); Rémy Bendahan (IMRA Europe)
    Object Detection for Embedded Systems Using Tiny Spiking Neural Networks: Filtering Noise Through Visual Attention
  15. P2-15 Aleixo Cambeiro Barreiro (Fraunhofer HHI)*; Mariusz Trzeciakiewicz (Fraunhofer HHI); Anna Hilsmann (Fraunhofer HHI); Peter Eisert (HU Berlin)
    Automatic Reconstruction of Semantic 3D Models from 2D Floor Plans
  16. P2-16 Srijan Das (University of North Carolina at Charlotte)*; Michael S Ryoo (Stony Brook University/Google)
    Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning
  17. P2-17 Juki Tanimoto (Chukyo University)*; Haruya Kyutoku (Aichi University of Technology); Keisuke Doman (Chukyo University); Yoshito Mekada (Chukyo University)
    Domain Adaptation from Visible-light to FIR with Reliable Pseudo Labels
  18. P2-18 Hansen Hendra (The University of Tokyo); Yubin Liu (The University of Tokyo)*; Ryoichi Ishikawa (The University of Tokyo); Takeshi Oishi (The University of Tokyo); Yoshihiro Sato (Kyoto University of Advanced Science)
    Quadruped Robot Platform for Selective Pesticide Spraying
  19. P2-19 Hiroto Harada (Kyushu University)*; Michihiro Mikamo (Hiroshima City University); Ryo Furukawa (Kindai University); Ryusuke Sagawa (AIST); Hiroshi Kawasaki (Kyushu University)
    Generalization of Pixel-wise Phase Estimation by CNN and Improvement of Phase-unwrapping by MRF Optimization for One-shot 3D Scan
  20. P2-20 Takuya Nakabayashi (Keio University)*; Hideo Saito (Keio University)
    Unsupervised Fall Detection on Edge Devices
  21. P2-21 Yutaro Hiraoka (The Japan Research Institute, Limited)*; Kazuhiro Fukui (University of Tsukuba)
    Deep Randomized Time Warping for Action Recognition
  22. P2-22 Djafer Yahia Messaoud Benchadi (University of Tsukuba)*; Bojan Batalo (University of Tsukuba); Kazuhiro Fukui (University of Tsukuba)
    Malware Detection using Kernel Constrained Subspace Method
  23. P2-23 Yusuf Hüseyin Şahin (Istanbul Technical University)*; Elvin Abdinli (Istanbul Technical University); Mustafa Arda Aydın (Istanbul Technical University); Gozde Unal (Istanbul Technical University)
    TinyPedSeg: A Tiny Pedestrian Segmentation Benchmark for Top-down Drone Images
  24. P2-24 Hao-Yu Hou (National Tsing Hua University)*; Mu-Yi Shen (National Tsing Hua University); Chia-Chi Hsu (National Tsing Hua University); En-Ming Huang (National Tsing Hua University); Yu-Chen Huang (National Tsing Hua University); Yu-Cheng Xia (National Tsing Hua University); Chien-Yao Wang (Academia Sinica); Chun-Yi Lee (National Tsing Hua University)
    Ensemble Fusion for Small Object Detection
  25. P2-25 Rui Ishiyama (NEC)*; Per Frøiland (Retrams AS); Stein-Asle Øvrebotn (Retrams AS)
    Automated Identification of Surgical Instruments without Tagging: Implementation in Real Hospital Work Environment

Oral 2-2: Applications of Computer Vision (16:00-17:00)

Chair: Chun-Yi Lee (National Tsing Hua University)

  1. O2-2-1 Marija Ivanovska (University of Ljubljana)*; Janez Perš (University of Ljubljana); Vitomir Struc (University of Ljubljana)
    TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models
  2. O2-2-2 Jui-Teng Ho (National Taiwan University of Science and Technology)*; Gee-Sern Hsu (National Taiwan University of Science and Technology); Svetlana Yanushkevich (University of Calgary); Marina Gavrilova (University of Calgary)
    Outline Generation Transformer for Bilingual Scene Text Recognition
  3. O2-2-3 Kazuya Odagiri (Hirosaki University)*; Kazunori Onoguchi (Hirosaki University)
    Monocular Blind Spot Estimation with Occupancy Grid Mapping

Oral 2-3: Medical Imaging and Video Analysis (17:15-18:15)

Chair: Chia-Wen Lin (National Tsing Hua University)

  1. O2-3-1 Takumi Morita (Yamaguchi University)*; Xian-Hua Han (Yamaguchi University)
    Investigating Self-supervised Learning for Skin Lesion Classification
  2. O2-3-2 Hiromu Taketsugu (TTI-J)*; Norimichi Ukita (TTI-J)
    Uncertainty Criteria in Active Transfer Learning for Efficient Video-Specific Human Pose Estimation
  3. O2-3-3 Pere Gilabert (Universitat de Barcelona)*; Carolina Malagelada (University Hospital Vall d’Hebrón); Hagen Wenzek (CorporateHealth International); Jordi Vitria (Universitat de Barcelona); Santi Seguí (Universitat de Barcelona)
    Leveraging Embedding Information to Create Video Capsule Endoscopy Datasets

Banquet (19:00-22:00)

July 25 (Tuesday)

Registration (9:15-16:20)

IAPR Invited Talk 3 (9:30-10:30)

Chair: Ichiro Ide (Nagoya University)

  1. Prof. Kensaku Mori (Nagoya University)
    How AI Transforms the Medical Field: A Focus on Medical Imaging

Oral 3-1: Visual Interpretation and Recognition (10:45-11:45)

Chair: Marc A. Kastner (Kyoto University)

  1. O3-1-1 Niklas Penzel (Friedrich-Schiller-Universität Jena)*; Joachim Denzler (Friedrich-Schiller-Universität Jena)
    Interpreting Art by Leveraging Pre-trained Models
  2. O3-1-2 Felix Richards (Swansea University)*; Adeline Paiement (Université de Toulon); Xianghua Xie (Swansea University); Elisabeth Sola (Université de Strasbourg); Pierre-Alain Duc (Observatoire de Strasbourg)
    Panoptic Segmentation of Galactic Structures in LSB Images
  3. O3-1-3 Hyeon Joon Lee (Waseda University)*; Edgar Simo-Serra (Waseda University)
    Using Unconditional Diffusion Models in Level Generation for Super Mario Bros.

Award Ceremony (11:45-12:00)

MVA2023 Tutorials

Tutorial 1 (13:10-14:40)

Chair: Chun-Yi Lee (National Tsing Hua University)

  1. Dr. Michael S. Ryoo (Stony Brook University, Google DeepMind)
    Visual Representations in Robot Learning

Tutorial 2 (14:50-16:20)

Chair: Ryo Yonetani (CyberAgent)

  1. Dr. Shunsuke Saito (Reality Labs Research)
    Neural Fields in Visual Computing: Foundations and Applications

MIRU2023 Tutorials

Room: ACT CITY Hamamatsu Main Hall

Participants registered for MVA 2023 can also join the MIRU 2023 tutorial (Japanese) at no additional cost.

Closing (16:20-16:30)