Conference program
(preliminary; small changes may still occur)

June 6
Tutorial & Workshops
(UPB's Library/Conference Center)
June 7
Main Conference
(AN010 “Radu Voinea”)
June 8
Main Conference
(AN010 “Radu Voinea”)
June 9
Industry Day
(AN010 “Radu Voinea”)
08:00-08:40
Registration Registration Registration Registration
08:40-08:50
Welcome Day opening Day opening
08:50-09:40

Tutorial on
"Semantic Indexing"
by Georges Quénot
Chair: George Awad

Workshop on Multimedia Forensics and Security

Keynote
"Searching for a thing"
by Arnold W.M. Smeulders
and Ran Tao
Chair: Cees Snoek

Keynote
"Making cultural visits with a smart mate"
by Alberto del Bimbo
Chair: Nicu Sebe

Industry Keynotes
Xiaozheng Huang, Tencent
Matei Stroilă, HERE
Chair: TBD

09:40-10:00

Oral Session 1
Vision and Language
Chair: Horia Cucu

Oral Session 5
Best Papers Candidates
Chair: TBD

10:00-10:30

Coffee break (at the venue)

10:30-10:45

Tutorial on
"Zero-Example Video Search"
by Chong-Wah Ngo
Chair: George Awad

Workshop on Multimedia Forensics and Security

10:45-11:00

Spotlight presentations

11:00-11:30

Coffee break (at the venue)
in parallel posters of
Oral Session 1, 2, and 3

Coffee break (at the venue)
in parallel posters of
Oral Session 4 and 5

Coffee break (at the venue)

11:30-12:15

Tutorial on
"Ad Hoc Video Search"
by Georges Quénot
Chair: George Awad

Special Oral Session
Beyond Semantics: Multimodal Understanding of Subjective Properties
Chair: Miriam Redi

Special Oral Session
Identifying and Linking Interesting Content in Large Audiovisual Repositories
Chairs: Maria Eskevich,
Roeland Ordelman

Panel session
TBD
Moderator: Martha Larson

12:15-12:30

Oral Session
Brave New Ideas
Chair: TBD

12:30-12:50

Lunch (at the venue)

Closing remarks and presentation of ICMR 2018

12:50-13:00

Free evening in
Bucharest

13:00-14:00

Lunch (at the venue)
in parallel posters of
Oral Session 1, 2, and 3

Lunch (at the venue)
in parallel posters of
Oral Session 4 and 5

14:00-15:00

Tutorial on
"Multimedia Event Detection"
by Cees Snoek
Chair: George Awad

Workshop on Wearable MultiMedia

Oral Session
Open Software
Chair: Mathias Lux

Oral Session 4
Cross-media Retrieval
Chair: TBD

15:00-15:20

Tutorial on
"Instance Search"
by Shin'ichi Satoh,
Duy-Dinh Le,
Vinh-Tiep Nguyen
Chair: George Awad

Spotlight presentations

15:20-15:40

Oral Session 2
Multimedia Indexing
Chair: TBD

Coffee break (at the venue)

15:40-16:00

Posters
Chair: TBD
and
Demos
Chair: TBD

Oral Session
Doctoral Symposium
Chair: TBD

16:00-16:20

Coffee break (at the venue)

16:20-16:30

Spotlight presentations

16:30-16:40

Tutorial on
"Video to Text"
by Cees Snoek
Chair: George Awad

Workshop on Wearable MultiMedia

16:40-17:00

Coffee break (at the venue)
in parallel posters of
Oral Session 1, 2, and 3

17:00-17:20

Oral Session 3
Multimedia Applications
Chair: TBD

17:20-18:00
18:00-18:05

Welcome Reception

Gala Dinner

18:05-18:20

Spotlight presentations

18:20-

ORAL SESSION 1: Vision and Language

Session Chair: Horia Cucu
Wednesday, June 7, AN010 “Radu Voinea”

9:40-10:45 Oral presentations
Shizhe Chen, Jia Chen and Qin Jin,
Generating Video Descriptions with Topic Guidance
Christian Henning and Ralph Ewerth,
Estimating the Information Gap between Textual and Visual Representations
Kan Chen, Rama Kovvuri, Jiyang Gao and Ram Nevatia,
MSRC: Multimodal Spatial Regression with Semantic Context for Phrase Grounding
Junwei Liang, Lu Jiang, Deyu Meng and Alexander Hauptmann,
Leveraging Multi-modal Prior Knowledge for Large-scale Concept Learning in Noisy Web Data

10:45-11:00 Spotlight presentations
Xing Xu, Fumin Shen, Yang Yang, Jie Shao and Zi Huang,
Transductive Visual-Semantic Embedding for Zero-shot Learning
Ricardo Carrapiço, Isabel Guimarães, Margarida Grilo, Sofia Cavaco and Joao Magalhaes,
3D Facial Video Retrieval and Management for Decision Support in Speech and Language Therapy
Thomas Mensink, Thomas Jongstra, Pascal Mettes and Cees Snoek,
Music-Guided Video Summarization: Linking Audio Visual Content using Quadratic Assignments

SPECIAL ORAL SESSION: Beyond Semantics: Multimodal Understanding of Subjective Properties

Session Chair: Miriam Redi
Wednesday, June 7, AN010 “Radu Voinea”

11:30-12:15 Oral presentations
Darshan Santani, Salvador Ruiz-Correa and Daniel Gatica-Perez,
Insiders and Outsiders: Comparing Urban Impressions between Population Groups
Claudio Baecchi, Tiberio Uricchio, Marco Bertini and Alberto Del Bimbo,
Deep Sentiment Features of Context and Faces for Affective Video Analysis
Jiarui Gao, Yanwei Fu, Yu-Gang Jiang and Xiangyang Xue,
Frame-Transformer Emotion Classification Network

ORAL SESSION: Brave New Ideas

Session Chair: TBD
Wednesday, June 7, AN010 “Radu Voinea”

12:15-13:00 Oral presentations
Jaeyoung Choi, Martha Larson, Xinchao Li, Kevin Li, Gerald Friedland and Alan Hanjalic,
The Geo-Privacy Bonus of Popular Photo Enhancements
Eduardo Nigri and Ognjen Arandjelovic,
Salient Information Retrieval from Big Astronomy Data
Nitish Nag, Vaibhav Pandey and Ramesh Jain,
Health Multimedia: Lifestyle Recommendations Based on Diverse Multimedia Observations

ORAL SESSION: Open Software

Session Chair: Mathias Lux
Wednesday, June 7, AN010 “Radu Voinea”

14:00-15:20 Oral presentations
Lucas Valem and Daniel C. G. Pedronette,
An Unsupervised Distance Learning Framework for Multimedia Retrieval
Konstantin Pogorelov, Michael Riegler, Pål Halvorsen and Carsten Griwodz,
ClusterTag: Interactive Visualization, Clustering and Tagging Tool for Big Image Collections
Chris A. Mattmann and Madhav Sharan,
Scalable Hadoop-Based Pooled Time Series of Big Video Data from the Deep Web
Federico Bartoli, Giuseppe Lisanti, Lorenzo Seidenari and Alberto Del Bimbo,
PACE: Prediction-based Annotation for Crowded Environments

ORAL SESSION 2: Multimedia Indexing

Session Chair: TBD
Wednesday, June 7, AN010 “Radu Voinea”

15:20-16:20 Oral presentations
Rao Muhammad Anwer, Fahad Shahbaz Khan, Joost Van De Weijer and Jorma Laaksonen,
TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition
Jie Lin, Olivier Morere, Vijay Chandrasekhar, Antoine Veillard, Ling-Yu Duan and Hanlin Goh,
DeepHash for Image Instance Retrieval: Getting Regularization, Depth and Fine-Tuning Right
André Mourão and Joao Magalhaes,
Balanced Search Space Partitioning for Distributed Redundant Media Indexing
Dayan Wu, Zheng Lin, Mingzhen Ye, Bo Li and Weiping Wang,
Deep Supervised Hashing for Multi-Label and Large-Scale Image Retrieval

16:20-16:40 Spotlight presentations
Fabien André, Anne-Marie Kermarrec and Nicolas Le Scouarnec,
Accelerated Nearest Neighbor Search with Quick ADC
Christian Eggert, Dan Zecha, Stephan Brehm and Rainer Lienhart,
Improving Small Object Proposals for Company Logo Detection
Rui Yang, Yuliang Shi and Xin-Shun Xu,
Discrete Multi-view Hashing for Effective Image Retrieval
Omar Seddati, Stéphane Dupont and Saïd Mahmoudi,
Quadruplet Networks for Sketch-Based Image Retrieval

ORAL SESSION 3: Multimedia Applications

Session Chair: TBD
Wednesday, June 7, AN010 “Radu Voinea”

17:00-18:05 Oral presentations
Karthik Yadati, Martha Larson, Cynthia Liem and Alan Hanjalic,
On Identifying Music for Daily Activities
Martin Pichl, Eva Zangerle and Günther Specht,
Improving Context-Aware Music Recommender Systems: Beyond the Pre-filtering Approach
Michal Koperski, Slawomir Bak and Peter Carr,
Groups Re-identification with Temporal Context
Ionuț Cosmin Duță, Bogdan Ionescu, Kiyoharu Aizawa and Nicu Sebe,
Simple, Efficient and Effective Encodings of Local Deep Features for Video Action Recognition

18:05-18:20 Spotlight presentations
Olga Slizovskaia, Emilia Gomez and Gloria Haro,
Musical Instrument Recognition in User-Generated Videos using a Multimodal Convolutional Neural Network Architecture
Gijs Overgoor, Masoud Mazloom, Robert Rietveld, Marcel Worring and Willemijn van Dolen,
A Spatio-Temporal Category Representation for Brand Popularity Prediction
Miriam Redi, Frank Liu and Neil O'Hare,
Bridging the Aesthetic Gap: The Wild Beauty of Web Imagery

ORAL SESSION 5: Best Papers Candidates

Session Chair: TBD
Thursday, June 8, AN010 “Radu Voinea”

9:40-11:00 Oral presentations
Mohammad Soleymani, Michael Riegler and Pål Halvorsen,
Multimodal Analysis of Multimedia Search Intent - Intent Recognition in Multimedia Search from User Behavior and Multimedia Content
Olivier Morere, Jie Lin, Antoine Veillard, Vijay Chandrasekhar, Ling-Yu Duan and Tomaso Poggio,
Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval
Yusuke Uchida, Yuki Nagai, Shigeyuki Sakazawa and Shin'ichi Satoh,
Embedding Watermarks into Deep Neural Networks
Christina Boididou, Symeon Papadopoulos, Lazaros Apostolidis and Yiannis Kompatsiaris,
Learning to Detect Misleading Content on Twitter

SPECIAL ORAL SESSION: Identifying and Linking Interesting Content in Large Audiovisual Repositories

Session Chairs: Maria Eskevich and Roeland Ordelman
Thursday, June 8, AN010 “Radu Voinea”

11:30-13:00 Oral presentations
Xuanchong Li and Alexander Hauptmann,
Understanding Videos Through Natural Language: Video Hyperlinking with Language-aided Multimodal Retrieval
Zhi-Qi Cheng, Hao Zhang, Xiao Wu and Chong-Wah Ngo,
On the Selection of Anchors and Targets for Video Hyperlinking
Petra Galuščáková, Michal Batko, Jan Čech, Jiří Matas, David Novák and Pavel Pecina,
Visual Descriptors in Methods for Video Hyperlinking
Remi Bois, Guillaume Gravier, Eric Jamet, Emmanuel Morin, Maxime Robert and Pascale Sébillot,
Linking Multimedia Content for Efficient News Browsing
Yang Liu, Zhonglei Gu, Yiu-Ming Cheung and Kien A. Hua,
What Makes A Video Intriguing? - Media Interestingness Analysis via Multi-view Manifold Learning
Keith Curtis, Gareth Jones and Nick Campbell,
Utilising High-Level Features in Summarisation of Academic Presentations

ORAL SESSION 4: Cross-media Retrieval

Session Chair: TBD
Thursday, June 8, AN010 “Radu Voinea”

14:00-15:00 Oral presentations
Aliaksandr Siarohin, Gloria Zen, Cveta Majtanovic, Xavier Alameda-Pineda, Elisa Ricci and Nicu Sebe,
How to Make an Image More Memorable? A Deep Style Transfer Approach
Fabian Junkert, Markus Eberts, Adrian Ulges and Ulrich Schwanecke,
Cross-modal Image-Graphics Retrieval by Neural Transfer Learning
Samet Hicsonmez, Nermin Samet, Fadime Sener and Pinar Duygulu,
DRAW: Deep Networks for Recognizing Styles of Artists Who Illustrate Children’s Books
Ines Chami, Youssef Tamaazousti and Hervé Le Borgne,
AMECON: Abstract Meta-Concept Features for Text-Illustration

15:00-15:20 Spotlight presentations
Elaheh Momeni, Reza Rawassizadeh and Eytan Adar,
Leveraging Semantic Facets for Adaptive Ranking of Social Comments
Zhanxiong Wang, Keke He, Yanwei Fu, Yugang Jiang, Rui Feng and Xiangyang Xue,
Multi-Task Deep Neural Network for Joint Face Recognition and Facial Attribute Prediction
Kuikui Wang, Lu Yang, Gongping Yang, Xin Luo, Kun Su and Yilong Yin,
Finger Vein Image Retrieval via Coding Scale-varied Superpixel Feature
Haoyue Shi, Jia Chen and Alexander Hauptmann,
Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video

POSTERS

Session Chair: TBD
Thursday, June 8, Hallway

15:40-18:00
Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Teddy Furon and Ondrej Chum,
Panorama to Panorama Matching for Location Recognition
Damianos Galanopoulos, Foteini Markatopoulou, Vasileios Mezaris and Ioannis Patras,
Concept Language Models and Event-based Concept Number Selection for Zero-example Event Detection
Shilun Lin, Pengfei Xiong and Hailong Liu,
Tiny Transform Net for Mobile Image Stylization
Foteini Markatopoulou, Damianos Galanopoulos, Vasileios Mezaris and Ioannis Patras,
Query and Keyframe Representations for Ad-hoc Video Search
Wei-Ta Chu and Wei-Wei Li,
Manga FaceNet: Face Detection in Manga based on Deep Neural Network
Vedran Vukotic, Christian Raymond and Guillaume Gravier,
Generative Adversarial Networks for Multimodal Representation Learning in Video Hyperlinking
Giuseppe Amato, Fabio Carrara, Fabrizio Falchi and Claudio Gennaro,
Efficient Indexing of Regional Maximum Activations of Convolutions using Full-Text Search Engines
Junkang Zhang, Siyu Xia, Ming Shao and Yun Fu,
Family Photo Recognition via Multiple Instance Learning
Shan Sun, Feng Wang, Qi Liang and Liang He,
TaiChi: A Fine-Grained Action Detection Dataset
Keiji Yanai and Ryosuke Tanno,
Conditional Fast Style Transfer Network
Anuvabh Dutt, Denis Pellerin and Georges Quénot,
Improving Image Classification using Coarse and Fine Labels
Mridula Verma and K. K. Shukla,
Fast Multi-Modal Unified Sparse Representation Learning
Wei-Ta Chu and Samuel Situmeang,
Badminton Video Analysis based on Spatiotemporal and Stroke Features
Xing Xu, Fumin Shen, Yang Yang, Jie Shao and Zi Huang,
Transductive Visual-Semantic Embedding for Zero-shot Learning
Joao Magalhaes and Sofia Cavaco,
3D Facial Video Retrieval and Management for Decision Support in Speech and Language Therapy
Thomas Mensink, Thomas Jongstra, Pascal Mettes and Cees Snoek,
Music-Guided Video Summarization: Linking Audio Visual Content using Quadratic Assignments
Fabien André, Anne-Marie Kermarrec and Nicolas Le Scouarnec,
Accelerated Nearest Neighbor Search with Quick ADC
Christian Eggert, Dan Zecha, Stephan Brehm and Rainer Lienhart,
Improving Small Object Proposals for Company Logo Detection
Rui Yang, Yuliang Shi and Xin-Shun Xu,
Discrete Multi-view Hashing for Effective Image Retrieval
Omar Seddati, Stéphane Dupont and Saïd Mahmoudi,
Quadruplet Networks for Sketch-Based Image Retrieval
Olga Slizovskaia, Emilia Gomez and Gloria Haro,
Musical Instrument Recognition in User-Generated Videos using a Multimodal Convolutional Neural Network Architecture
Gijs Overgoor, Masoud Mazloom, Robert Rietveld, Marcel Worring and Willemijn van Dolen,
A Spatio-Temporal Category Representation for Brand Popularity Prediction
Miriam Redi, Frank Liu and Neil O'Hare,
Bridging the Aesthetic Gap: The Wild Beauty of Web Imagery
Elaheh Momeni, Reza Rawassizadeh and Eytan Adar,
Leveraging Semantic Facets for Adaptive Ranking of Social Comments
Zhanxiong Wang, Keke He, Yanwei Fu, Yugang Jiang, Rui Feng and Xiangyang Xue,
Multi-Task Deep Neural Network for Joint Face Recognition and Facial Attribute Prediction
Kuikui Wang, Lu Yang, Gongping Yang, Xin Luo, Kun Su and Yilong Yin,
Finger Vein Image Retrieval via Coding Scale-varied Superpixel Feature
Haoyue Shi, Jia Chen and Alexander Hauptmann,
Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video

DEMONSTRATIONS

Session Chair: TBD
Thursday, June 8, Hallway

15:40-18:00
Andrea Ceroni, Vassilios Solachidis, Claudia Niederée, Olga Papadopoulou and Vasileios Mezaris,
Expo: An Expectation-oriented System for Selecting Important Photos from Personal Collections
Luca Rossetto, Ivan Giangreco, Claudiu Tănase and Heiko Schuldt,
Multimodal Video Retrieval with the 2017 IMOTION System
Kashif Ahmad, Michael Riegler, Ans Riaz, Nicola Conci, Duc-Tien Dang-Nguyen and Pål Halvorsen,
The JORD System - Linking Sky and Social Multimedia Data to Natural and Technological Disasters
Mathias Lux, Michael Riegler, Pål Halvorsen and Glenn Mac Stravic,
LireSolr - A Visual Information Retrieval Server
Chrysa Collyda, Evlampios Apostolidis, Alexandros Pournaras, Foteini Markatopoulou, Vasileios Mezaris and Ioannis Patras,
VideoAnalysis4ALL: An on-line Tool for the Automatic Fragmentation and Concept-based Annotation, and the Interactive Exploration of Videos
Kai Uwe Barthel, Nico Hezel and Klaus Jung,
Visually Browsing Millions of Images using Image Graphs

ORAL SESSION: Doctoral Symposium

Session Chair: TBD
Thursday, June 8, AN010 “Radu Voinea”

15:40-17:00 Oral presentations
Shengsheng Qian, Tianzhu Zhang and Changsheng Xu,
A Generic Framework for Social Event Analysis
Stefan Petscharnig,
Semi-Automatic Retrieval of Relevant Segments from Laparoscopic Surgery Videos
Yash Garg and Silvestro Roberto Poccia,
On the Effectiveness of Distance Measures for Similarity Search in Multi-Variate Sensory Data
Karim Aderghal, Jenny Benois-Pineau and Karim Afdel,
Classification of sMRI for Alzheimer’s disease Diagnosis with CNN: Single Siamese Networks with 2D+epsilon approach and fusion on ADNI

INDUSTRY KEYNOTES

Session Chair: TBD
Friday, June 9, AN010 “Radu Voinea”

8:50-11:00
Xiaozheng Huang, Tencent
With 5G Approaching, How will Audio/Video Technology that Serves 800 Million QQ Users Bring Forth New Ideas
Matei Stroilă, HERE
Information Retrieval from Multi-Sensor Data for Enriching Location Services at HERE Technologies

PANEL SESSION

Moderator: Martha Larson
Friday, June 9, AN010 “Radu Voinea”

11:30-12:30

Association for Computing Machinery

University Politehnica of Bucharest

University of Trento