"Searching for a thing" Arnold W.M. Smeulders & Ran Tao (University of Amsterdam)

Abstract: For humans, one picture usually suffices to identify an object of search. "I am looking for this little girl, have you seen her?" or "Do you have another one like this?" are two ways to specify a target, even to someone who has never seen the object of search before. Searching from one example in digital multimedia retrieval is a hard problem. From the one example, one needs to derive an accurate estimate of all accidental variations in the target picture as well as the structural variation the target instance might have in all other potential pictures.

We will formulate the problem and argue that similarity focus in feature space is at the core of instance search algorithms, especially when searching for geometrically rigid layouts: logos, buildings and scenes, which are effectively flat and one-sided. To generalize to rigid 3D objects, the variability in viewpoint is so large that one needs to learn about other viewpoints in general before one can start searching from one example, especially when searching for known object types. To generalize further, arbitrarily shaped and non-rigid objects are the first hurdle yet to be cleared, followed by searching for unknown objects. Current research is on learning more about object variations in general. Searching for things from one example is not a solved problem yet, but there is a reward waiting, in the sense that one-example search is useful in other branches of retrieval.

Bio: Arnold W.M. Smeulders is in charge of COMMIT/, a nation-wide, very large public-private research program distributed over the Netherlands on large-scale data, content, sensing and interaction. He is also professor at the University of Amsterdam (UvA) for research in the theory and practice of computer vision. The group's search engines have achieved a top-three performance in all 14 years of the international TRECVID competition for image categorisation. He was recipient of a Fulbright fellowship at Yale University, and visiting professor in Hong Kong, Tsukuba, Modena, Cagliari and Florida. He was co-founder of Euvision Technologies BV, a spin-off company of the UvA. He is currently director of the Qualcomm - UvA and the Bosch - UvA labs. He is associate editor of the IJCV. He is a fellow of the International Association for Pattern Recognition and an elected member of the Academia Europaea (AE).

Bio: Ran Tao received the M.Sc. degree in computer science (2011) from Leiden University, The Netherlands and the Ph.D. degree in computer science (2017) from the University of Amsterdam, The Netherlands. He is currently a postdoctoral researcher at the QUVA Lab, the joint research lab of Qualcomm and the University of Amsterdam on deep learning and computer vision. His research interests include computer vision and machine learning, with a focus on instance search, object tracking and deep learning.

"Making cultural visits with a smart mate" Alberto del Bimbo (MICC University of Florence)

Abstract: Computer vision technology can help bridge the experiential gap between the cultural and emotional experience of visitors in museums or cultural heritage sites. While multimedia technology and applications, coupled with enabling sensors and mobile devices, appear capable of bringing rich information directly to the individual user and of supporting networking between people, computer vision makes it possible to interpret the interests of the visitor and effortlessly trigger the delivery of the appropriate information at the right time. We present a prototype smart audio guide based on computer vision that adapts to the actions and interests of individual visitors, perceives the context and operates appropriately according to it. The system can run in real time on a mobile device and uses Convolutional Neural Network (CNN) technology to perform object classification and localization. The system has been deployed on an NVIDIA Shield Tablet K1 and tested in real-world environments during a museum visit and an urban walk.

Bio: Prof. Del Bimbo is Full Professor at the Department of Information Engineering of the University of Firenze, leading a research team that investigates cutting-edge solutions in the fields of computer vision, multimedia content analysis, indexing and retrieval, and advanced interactivity. He is the author of over 350 scientific publications and has been the coordinator of many research and industrial projects. He was the Program Chair of the Int’l Conference on Pattern Recognition ICPR 2016 and ICPR 2012, and of ACM Multimedia 2008, and the General Chair of the European Conference on Computer Vision ECCV 2012, the ACM Int’l Conference on Multimedia Retrieval ICMR 2011, ACM Multimedia 2010, and IEEE ICMCS 1999 (the Int’l Conference on Multimedia Computing & Systems). He is the Editor in Chief of ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) and Associate Editor of the journals Multimedia Tools and Applications and Pattern Analysis and Applications. He was Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence and IEEE Transactions on Multimedia, and also served as Guest Editor of many Special Issues in highly ranked journals. Prof. Del Bimbo is an IAPR Fellow and the recipient of the 2016 ACM SIGMM Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications.

Association for Computing Machinery

University Politehnica of Bucharest

University of Trento