The 6th AI City Challenge

The 6th edition of the AI City Challenge focuses on problems in two domains where there is tremendous untapped potential at the intersection of computer vision and artificial intelligence: Intelligent Traffic Systems (ITS) and brick-and-mortar retail businesses. The four challenge tracks of the 2022 AI City Challenge received participation requests from 254 teams across 27 countries. Track 1 addressed city-scale multi-target multi-camera (MTMC) vehicle tracking. Track 2 addressed natural-language-based vehicle track retrieval. Track 3 was a brand-new track for naturalistic driving analysis, where the data were captured by several cameras mounted inside the vehicle focusing on driver safety, and the task was to classify driver actions. Track 4 was another new track, aiming to achieve automated retail store checkout using only a single-view camera. We released two leaderboards for submissions based on different methods: a public leaderboard for the contest, where the use of external data is not allowed, and a general leaderboard for all submitted results. The top-performing teams established strong baselines and even outperformed the state of the art in the proposed challenge tracks.
