CVPR 2018 Workshop On Visual Understanding of Humans in Crowd Scene

Introduction

Developing solutions to comprehensive human visual understanding in in-the-wild scenarios, regarded as one of the most fundamental problems in computer vision, could have a crucial impact in many industrial application domains, such as autonomous driving, virtual reality, video surveillance, human-computer interaction and human behavior analysis. For example, human parsing and pose estimation are often regarded as the very first step for higher-level activity/event recognition and detection. Nonetheless, a large gap seems to exist between what is needed by real-life applications and what is achievable with modern computer vision techniques. The goal of this workshop is to allow researchers from the fields of human visual understanding and other disciplines to present their progress, communicate and co-develop novel ideas that could shape the future of this area, and further advance the performance and applicability of the resulting systems in real-world conditions.

To stimulate progress on this research topic and attract more talent to it, we will also provide a first standard human parsing and pose benchmark on a new large-scale Look Into Person (LIP) dataset. This dataset is both larger and more challenging than similar previous ones: it contains 50,000 images with elaborate pixel-wise annotations covering 19 comprehensive semantic human part labels, and 2D human poses with 16 dense key points. The images, collected from real-world scenarios, contain humans appearing with challenging poses and views, heavy occlusions, various appearances and low resolutions. Details on the annotated classes and examples of our annotations are available at http://sysu-hcp.net/lip. The challenge is held in conjunction with CVPR 2018, Salt Lake City. Challenge participants with the most successful and innovative entries will be invited to present at this workshop.

Regarding the viability of this workshop, the topic is attractive and active. Many active researchers are likely to attend (a conservative estimate, based on the past publication record on related topics, is 100 attendees). The workshop is related to, yet still clearly different from, past workshops as explained below. In addition, we have received confirmation from many renowned professors and researchers in this area, who are either glad to give a keynote speech (as listed in the program) or have kindly offered help. We believe this workshop will be very successful and will significantly benefit the progress of this research area.

Topics of interest

Submissions are expected to deal with human-centric visual perception and processing tasks, which include but are not limited to:

Multi-person parsing and pose estimation
2D/3D human pose estimation from single RGB/depth images or videos
Pedestrian detection in in-the-wild scenarios
Human action recognition and trajectory recognition/prediction
Human re-identification in crowd videos and cross-view cameras
3D human body shape estimation and simulation
Human clothing and attribute recognition
Person re-identification, face recognition/verification in surveillance videos
Novel datasets for performance evaluation and/or empirical analyses of existing methods
Advanced applications of human understanding, including autonomous cars, event recognition and prediction, robotic manipulation, indoor navigation, image/video retrieval and virtual reality.

Tentative Schedule

Time	Schedule
Location:	Room 250 D-E
08:30-08:40	Opening remarks and welcome
08:40-09:00	The Look Into Person (LIP) challenge introduction and results
09:00-09:15	Oral talk 1: Second place in the single-person pose estimation challenge (Track 3), Speaker: Zhenqi Xu (ByteDance AI Lab)
09:15-10:00	Invited talk 1: Xian-Sheng Hua, Distinguished Engineer/VP, Alibaba Group
10:00-10:30	Poster session and coffee break (Hall A; Halls 1-4)
10:30-11:15	Invited talk 2: Visual Commonsense Reasoning, Speaker: Yixin Zhu, Postdoctoral Scholar, VCLA lab at UCLA
11:15-11:30	Oral talk 2: Winner of the single-person (Track 3) & multi-person (Track 4) pose estimation challenge, Speaker: Wu Liu (JD AI Research)
11:30-11:45	Oral talk 3: Winner of the single-person (Track 1) & multi-person (Track 2 & Track 5) human parsing challenge, Speaker: Yunchao Wei (University of Illinois Urbana-Champaign)
11:45-14:00	Lunch (Hall A; Halls 1-4)
14:00-14:30	Invited talk 3: Jimei Yang, Adobe Research
14:30-14:45	Oral talk 4: Second place in the multi-human pose estimation challenge (Track 4), Speakers: Sheng Jin (Tsinghua University), Wentao Liu (Tsinghua University)
14:45-15:15	Poster session and coffee break (Hall A; Halls 1-4)
15:15-15:45	Invited talk 4: Jia Deng, University of Michigan
15:45-16:15	Awards & Future Plans

Submission

Paper Submission
Important Dates
Paper Submission Due Date (mm/dd/yyyy): 04/15/2018 23:59 UTC/GMT+0
Notification of Acceptance/Rejection (mm/dd/yyyy): 04/18/2018
Camera-Ready Due Date (mm/dd/yyyy): 04/20/2018
Conference Date (mm/dd/yyyy): 06/18/2018
Format Requirements
Format: Papers are limited to 8 pages, including figures and tables, in the CVPR style. Additional pages containing only cited references are allowed.
Example submission paper with detailed instructions: http://cvpr2018.thecvf.com/files/egpaper_for_review.pdf
LaTeX/Word Templates (tar): http://cvpr2018.thecvf.com/files/cvpr2018AuthorKit.tgz
LaTeX/Word Templates (zip): http://cvpr2018.thecvf.com/files/cvpr2018AuthorKit.zip
A complete paper should be submitted using the above templates, which are blind-submission review-formatted templates. The length should match that intended for final publication. Papers with more than 8 pages (excluding references) will be rejected without review.
Submission Details
Paper Submission Site: https://cmt3.research.microsoft.com/VUHCS2018
Conference City: Salt Lake City, Utah
Conference Country: United States of America

Challenge Submission
Important Dates
Challenge Submission Due Date (mm/dd/yyyy): 06/10/2018 23:59 UTC/GMT+0

Main Organizers

Xiaodan Liang
xiaodan1@cs.cmu.edu

Jian Zhao
zhaojian90@u.nus.edu

Liang Lin
linliang@ieee.org

Jiashi Feng
elefjia@nus.edu.sg

Eric Xing
epxing@cs.cmu.edu

Ke Gong
gongk3@mail2.sysu.edu.cn

Jianshu Li
jianshu@u.nus.edu

Yicheng Li
liych28@mail2.sysu.edu.cn

Contact

Xiaodan Liang xiaodan1@cs.cmu.edu

Jian Zhao zhaojian90@u.nus.edu

Liang Lin linliang@ieee.org

Jiashi Feng elefjia@nus.edu.sg

Eric Xing epxing@cs.cmu.edu

Ke Gong gongk3@mail2.sysu.edu.cn

Jianshu Li jianshu@u.nus.edu

Yicheng Li liych28@mail2.sysu.edu.cn