Listen, Look, and Find the One

Person search with one portrait, which aims to find a target person in arbitrary scenes given only a single portrait image, is an essential yet largely unexplored problem in the multimedia field. Existin...
