IAPR Invited Talks

July 25th, 09:30–10:30

Self-supervision for Learning from the Bottom Up

IAPR Invited Speaker

Prof. Alexei (Alyosha) Efros
UC Berkeley

Biography

Alexei (Alyosha) Efros is a professor of computer science at UC Berkeley and a member of the BAIR lab. Prior to that, he spent nine years on the faculty of Carnegie Mellon University, and has also been affiliated with École Normale Supérieure/INRIA and the University of Oxford. His research is in the area of computer vision and computer graphics, especially at the intersection of the two. He is particularly interested in using data-driven techniques to tackle problems where large quantities of unlabeled visual data are readily available. Efros received his PhD in 2003 from UC Berkeley. He is a recipient of the Sloan Fellowship (2008), Guggenheim Fellowship (2008), Okawa Grant (2008), SIGGRAPH Significant New Researcher Award (2010), three PAMI-TC Helmholtz Test-of-Time Prizes (1999, 2003, 2005), the ACM Prize in Computing (2016), and the Diane McEntyre Award for Excellence in Teaching Computer Science (2019). He likes Paris and gelato.

Abstract

Why do self-supervised learning? A common answer is: "because data labeling is expensive." In this talk, I will argue that there are other, perhaps more fundamental reasons for working on self-supervision. First, it should allow us to get away from the tyranny of top-down semantic categorization and force meaningful associations to emerge naturally from the raw sensor data in a bottom-up fashion. Second, it should allow us to ditch fixed datasets and enable continuous, online learning, which is a much more natural setting for real-world agents. Third, and most intriguingly, there is hope that it might be possible to force a self-supervised task curriculum to emerge from first principles, even in the absence of a pre-defined downstream task or goal, similar to evolution. In this talk, I will touch upon these themes to argue that, far from running its course, research in self-supervised learning is only just beginning.

July 26th, 09:00–10:00

How Technology Can Help the Visually Impaired Navigate the World

IAPR Invited Speaker

Dr. Chieko Asakawa
IBM

Biography

Chieko Asakawa is an IBM Fellow working in the area of accessibility. Her contributions to the field began with braille digitization and moved on to Web accessibility, including the world’s first practical voice browser. Today, Chieko is focusing on real-world accessibility, using the power of AI to help the visually impaired understand their surroundings and navigate the world. She has been serving as an IBM Distinguished Service Professor at Carnegie Mellon University since 2014. She will concurrently serve as Chief Executive Director of the Japanese National Museum of Emerging Science and Innovation (Miraikan) from April 2021. In 2013, the government of Japan awarded Chieko the Medal of Honor with Purple Ribbon for her outstanding contributions to accessibility research. She also received the American Foundation for the Blind 2020 Helen Keller Achievement Award. She was elected as a foreign member of the US National Academy of Engineering in 2017 and was inducted into the National Inventors Hall of Fame (NIHF) in 2019.

Abstract

Technology has greatly improved the lives of people with visual impairments. Voice-based web access has offered a completely new type of information resource to the visually impaired. Sensors and smartphones now offer new ways to navigate the real world without vision. Computer vision-based applications are helping them recognize objects, pedestrians, scenes, and so on.

Chieko is an accessibility researcher and is blind herself. In this talk, she will present how technology has improved the quality of her life, with various examples from her daily life. She will also describe her latest work on NavCog, a smartphone-based navigation system, and AI Suitcase, a navigational robot that helps the blind navigate the world. Her talk will cover new needs that became apparent during the pandemic and possible ways to address them. Finally, she will discuss how we can accelerate the implementation of new technologies in our society.

July 27th, 10:15–11:15

Learning to Enhance Images

IAPR Invited Speaker

Prof. Ming-Hsuan Yang
University of California at Merced

Biography

Ming-Hsuan Yang is a professor in Electrical Engineering and Computer Science at the University of California, Merced and a research scientist at Google. He received his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign in 2000. He served as a program co-chair for the IEEE International Conference on Computer Vision (ICCV) in 2019 and the Asian Conference on Computer Vision (ACCV) in 2014, and as a general co-chair for the Asian Conference on Computer Vision in 2016. He served as an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) from 2007 to 2011, and is an associate editor of the International Journal of Computer Vision (IJCV), Computer Vision and Image Understanding (CVIU), Image and Vision Computing (IVC), and the Journal of Artificial Intelligence Research (JAIR). He has received numerous paper awards from CVPR, ACCV, and UIST. Yang received the Google Faculty Award in 2009, the Distinguished Early Career Research Award from the UC Merced Senate in 2011, the CAREER Award from the National Science Foundation in 2012, and the Distinguished Research Award from the UC Merced Senate in 2015. He is a Fellow of IEEE.

Abstract

In this talk, I will first review our work on image deblurring, dehazing, super-resolution, and related topics. The underlying theme is to learn from image data for these tasks, building on both classic approaches and learning methods. I will then present our recent work on frame interpolation, reflectance removal, optical flow, and image synthesis based on deep learning models. If time allows, I will also discuss our work on other vision tasks.