CVPR 2019 Workshop On Augmented Human: Human-centric Understanding and 2D/3D Synthesis, and the third Look Into Person (LIP) Challenge

Introduction

One of the ultimate goals of computer vision is to augment human capabilities in a variety of application fields. Developing solutions for comprehensive human-centric visual applications in the wild, regarded as one of the most fundamental problems in computer vision, could have a crucial impact on many industrial application domains, such as virtual reality, human-computer interaction, human motion analysis, and advanced robotic perception. Human-centric understanding, including human parsing/detection, pose estimation, and relationship detection, is often regarded as the very first step toward higher-level activity/event recognition and detection. Nonetheless, a large gap seems to exist between what real-life applications need and what modern computer vision techniques can achieve. Going a step further, advances in virtual reality and 3D graphics analysis are urgently needed for advanced human-centric analysis. For example, 2D/3D virtual try-on simulation systems that seamlessly fit various clothes onto a 3D human body shape have attracted considerable commercial interest. Human motion synthesis and prediction can bridge the virtual and real worlds, for example by simulating virtual characters that mimic human behaviors, or by giving robots more intelligent interactions with humans through causal inference about human activities. The goal of this workshop is to allow researchers from the fields of human-centric understanding and 2D/3D synthesis to present their progress, communicate, and co-develop novel ideas that could shape the future of this area and further advance the performance and applicability of the resulting systems in real-world conditions.

We will also organize the third large-scale Look Into Person (LIP) challenge, which includes five competition tasks: single-person human parsing, single-person pose estimation, multi-person human parsing, multi-person video parsing, the multi-person pose estimation benchmark, and the clothes virtual try-on benchmark. This third LIP challenge mainly extends the previous LIP challenges at CVPR 2017 and CVPR 2018 by additionally covering a video human parsing challenge and the 2D/3D clothes virtual try-on benchmark. For single-person human parsing and pose estimation, we will provide 50,000 images with detailed pixel-wise annotations covering 19 semantic human part labels, and 2D human poses with 16 dense key points. For the multi-person human parsing task, we will provide another 50,000 images of crowded scenes with 19 semantic human part labels. For video-based human parsing, 3,000 video shots of 1-2 minutes each will be densely annotated with 19 semantic human part labels. For multi-person pose estimation, the dataset contains 25,828 images (3 persons per image on average) with 2D human poses with 16 dense key points (each key point has a flag indicating whether it is visible (0), occluded (1), or out of the image (2)), plus head and instance bounding boxes. Our new image-based clothes try-on benchmark targets fitting new in-shop clothes onto a person image and generating a try-on video that shows the clothes on the person from different viewpoints. The benchmark will contain around 25,000 pairs of front-view person pictures and top clothing images for training and 3,000 clothes-person pairs for testing. The quality of image-based virtual try-on will be quantified via a human subjective perceptual study. The quality of video-based virtual try-on will be evaluated via Amazon Mechanical Turk (AMT) human evaluation.
The images, collected from real-world scenarios, contain humans appearing with challenging poses and views, heavy occlusions, various appearances, and low resolution. Details on the annotated classes and examples of our annotations are available at https://vuhcs.github.io/ . This challenge will be released before January 2019 to enable participants to evaluate their techniques. The challenge is held in conjunction with CVPR 2019, Long Beach, CA. Challenge participants with the most successful and innovative entries will be invited to present at this workshop.
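
As an illustration of the key point annotation described above (16 key points per person, each with a visibility flag of 0, 1, or 2), here is a minimal Python sketch. The flat `[x, y, flag, ...]` layout and the names `Visibility`, `KeyPoint`, `parse_person`, and `in_image_key_points` are assumptions made for this example only; they are not the official annotation format or toolkit of the challenge.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Visibility(IntEnum):
    # Flag values as stated in the challenge description.
    VISIBLE = 0
    OCCLUDED = 1
    OUT_OF_IMAGE = 2


@dataclass
class KeyPoint:
    x: float
    y: float
    flag: Visibility


NUM_KEY_POINTS = 16  # per person, as stated in the challenge description


def parse_person(values: List[float]) -> List[KeyPoint]:
    """Parse one person's pose from a flat [x, y, flag, ...] list.

    The flat layout is a hypothetical choice for this sketch.
    """
    if len(values) != 3 * NUM_KEY_POINTS:
        raise ValueError(
            f"expected {3 * NUM_KEY_POINTS} values, got {len(values)}"
        )
    return [
        KeyPoint(values[i], values[i + 1], Visibility(int(values[i + 2])))
        for i in range(0, len(values), 3)
    ]


def in_image_key_points(pose: List[KeyPoint]) -> List[KeyPoint]:
    """Keep key points that lie inside the image (visible or occluded)."""
    return [kp for kp in pose if kp.flag != Visibility.OUT_OF_IMAGE]
```

Using an enumeration for the flag keeps the three states explicit, so downstream code can filter or weight key points by state without relying on bare integers.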

Regarding the viability of this workshop, its topic is attractive and active. Many active researchers are likely to attend (a conservative estimate, based on the past publication record on related topics, is 100 attendees). The workshop is related to, yet clearly different from, past workshops. In addition, we have received confirmation from many renowned professors and researchers in this area, who have either agreed to give a keynote speech (as listed in the program) or kindly offered help. We believe this workshop will be very successful and will significantly benefit the progress of this research area.

Accepted Papers

Multi-scale Aggregation R-CNN for 2D Multi-person Pose Estimation. Gyeongsik Moon (Seoul National University)*; Ju Yong Chang (Kwangwoon University); Kyoung Mu Lee (Seoul National University)
Skepxels: Spatio-temporal Image Representation of Human Skeleton Joints for Action Recognition. Jian Liu (The University of Western Australia)*; Naveed Akhtar (The University of Western Australia); Ajmal Mian (The University of Western Australia)
Exploiting Offset-guided Network for Pose Estimation and Tracking. Rui Zhang (Beijing University of Posts and Telecommunications); Zheng Zhu (Institute of Automation, Chinese Academy of Sciences)*; Peng Li (Horizon Robotics); Rui Wu (Horizon Robotics); Chaoxu Guo (Institute of Automation, Chinese Academy of Sciences); Guan Huang (Horizon Robotics); Hailun Xia (Beijing University of Posts and Telecommunications)
On the Robustness of Human Pose Estimation. Naman Jain (Indian Institute of Technology Bombay)*; Sahil H Shah (Indian Institute of Technology Bombay); Abhishek Sharma (Gobasco AI Labs); Arjun Jain (Indian Institute of Technology Bombay)
Infant Contact-less Non-Nutritive Sucking Pattern Quantification via Facial Gesture Analysis. Xiaofei Huang (Northeastern University); Alaina Martens (Northeastern University); Emily Zimmerman (Northeastern University); Sarah Ostadabbas (Northeastern University)*
Unpaired Pose Guided Human Image Generation. Xu Chen (ETH Zürich)*; Jie Song (ETH Zürich); Otmar Hilliges (ETH Zürich)
What Elements are Essential to Recognize Human Actions? Yachun Li (Zhejiang University)*; Yong Liu (Zhejiang University); Chi Zhang (Megvii Inc.)
Patch-based 3D Human Pose Refinement. Qingfu Wan (Fudan University)*; Weichao Qiu (Johns Hopkins University); Alan Yuille (Johns Hopkins University)
Towards Real-time Sign Language Interpreting Robot: Evaluation of Non-manual Components on Recognition Accuracy. Arman Sabyrov (Nazarbayev University); Medet Mukushev (Nazarbayev University); Alfarabi Imashev (Nazarbayev University); Kenessary Koishybay (Nazarbayev University); Anara Sandygulova (Nazarbayev University)*; Vadim Kimmelman (University of Bergen)

Challenge Winners

Track 1: Single-Person Human Parsing Challenge Winners:
1st:
Peike Li (1, 2), Yunqiu Xu (1), Yi Yang (1, 2)
(1) Baidu Research; (2) CAI, University of Technology Sydney

2nd:
Dongdong Yu (1, *), Kai Su (1, 2, *), Jian Wang (1), Kaihui Zhou (1), Xin Geng (2), Changhu Wang (1)
(1) ByteDance AI Lab; (2) Southeast University

3rd:
Zhijie Zhang (1), Wenguan Wang (2), Jianbing Shen (2), Siyuan Qi (3), Yanwei Pang (1), Ling Shao (2)
(1) Tianjin University; (2) Inception Institute of Artificial Intelligence; (3) UCLA

Track 2: Single-Person Human Pose Estimation Challenge Winners:
1st:
Kai Su (1, 2, *), Dongdong Yu (1, *), Xin Geng (2), Changhu Wang (1)
(1) ByteDance AI Lab; (2) Southeast University

1st:
Bin Xiao, Yifan Lu, Tang Tang, Hao Zhu, Linfu Wen
ByteDance AI Lab

3rd:
Juan Manuel Pérez Rúa, Kaiyang Zhou, Adrian Bulat, Xiatian Zhu, Tao Xiang, Maja Pantic
Samsung AI Center - Cambridge, UK

3rd:
Hong Hu, Feng Zhang, Hanbin Dai, Huan Luo, LiangBo Zhou, Mao Ye
University of Electronic Science and Technology of China (UESTC)

Track 3: Multi-Person Human Parsing Challenge Winners:
1st:
Yunqiu Xu (1), Peike Li (1, 2), Yi Yang (1, 2)
(1) Baidu Research; (2) CAI, University of Technology Sydney

2nd:
Meng Zhang (1), Xinchen Liu (2), Wu Liu (2), Anfu Zhou (1), Huadong Ma (1), Tao Mei (2)
(1) Beijing University of Posts and Telecommunications; (2) AI Research of JD.com

3rd:
Bingke Zhu (1, 2), Xiaomei Zhan (1, 2), Yingying Chen (1, 2), Ming Tang (1, 2), Hui Li (3), Ting Zhang (3), Zhaoliang Zhang (3), Wenjie Tang (3), Jinqiao Wang (1, 2)
(1) National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; (2) University of Chinese Academy of Sciences; (3) R&D Center, China National Electronics Import & Export Corporation

Track 4: Video Multi-Person Human Parsing Challenge Winners:
1st:
Peike Li (1, 2), Yunqiu Xu (1), Yi Yang (1, 2)
(1) Baidu Research; (2) CAI, University of Technology Sydney

2nd:
Jianhua Sun (1, *), Dian Shao (2, *), Hao-Shu Fang (1), Cewu Lu (1)
(1) Shanghai Jiao Tong University; (2) The Chinese University of Hong Kong

Track 5: Image-based Multi-pose Virtual Try-on Challenge Winners:
1st:
Rokkyu Lee, Hyugjae Lee, Minseok Kang, Gunhan Park
NHN

2nd:
Yu Sun, Wu Liu, Qian Bao, Yuhao Cheng, Tao Mei
JD AI Human

Tentative Schedule

Time	Schedule
Location: 102A	Date: Sunday, 16 Jun 2019, from 08:30 to 17:15
08:30-08:40	Opening remarks and welcome
08:40-09:00	The Look Into Person (LIP) challenge introduction and results
09:00-09:45	Oral talk 1: Winner of single-person / multi-person / video human parsing challenge
09:45-10:00	Oral talk 2: Winner of pose estimation challenge
10:00-10:30	Poster session and coffee break
10:30-11:00	Invited talk 1: Shiry Ginosar, PhD, UC Berkeley
11:00-11:30	Invited talk 2: Michael Black, Professor, Max Planck Institute
11:30-12:00	Oral talk 3: Winner of pose estimation and 2nd place of single person parsing
12:00-13:30	Lunch
13:30-14:00	Invited talk 3: Alex Schwing, Assistant Professor, UIUC
14:00-14:30	Invited talk 4: Jianchao Yang, Director, ByteDance AI Lab
14:30-14:45	Oral talk 4: Winner of image-based multi-pose virtual try-on challenge
14:45-16:15	Poster session and coffee break
16:15-16:45	Invited talk 5: Katerina Fragkiadaki, Assistant Professor, CMU
16:45-17:15	Awards & Future Plans

Call for Papers

Paper Submission
Important Dates
Paper Submission Due Date: May 1, 2019 [11:59 p.m. PST]
Notification of Acceptance/Rejection: May 7, 2019 [11:59 p.m. PST]
Camera-Ready Due Date: May 14, 2019 [11:59 p.m. PST]
Format Requirements
Format: Papers must follow the CVPR style. Papers of at most 4 pages, including references, do not count as a dual submission. Reviewed workshop papers longer than 4 pages, including figures and tables, do count as a publication.
Example submission paper with detailed instructions: http://cvpr2019.thecvf.com/files/egpaper_for_review.pdf
LaTeX/Word Templates (tar): http://cvpr2019.thecvf.com/files/cvpr2019AuthorKit.tgz
LaTeX/Word Templates (zip): http://cvpr2019.thecvf.com/files/cvpr2019AuthorKit.zip
A complete paper should be submitted using the above templates, which are blind-submission review-formatted templates. The length should match that intended for final publication.
Submission Details
Paper Submission Site: https://cmt3.research.microsoft.com/VUHCS2019
Conference City: Long Beach, CA
Conference Country: United States of America

Topics of interest

Submissions are expected to deal with human-centric visual perception and processing tasks, which include but are not limited to:

Main Organizers

Xiaodan Liang
xdliang328@gmail.com

Haoye Dong
donghy7@mail2.sysu.edu.cn

Yunchao Wei
wychao1987@gmail.com

Xiaohui Shen
shenxiaohui@bytedance.com

Jiashi Feng
elefjia@nus.edu.sg

Song-Chun Zhu
sczhu@stat.ucla.edu

Challenge Submission
Important Dates
Challenge Submission Due Date: June 2, 2019, 11:59 PM GMT (extended from May 15, 2019, 11:59 PM GMT)

Contact

Xiaodan Liang xdliang328@gmail.com

Haoye Dong donghy7@mail2.sysu.edu.cn

Yunchao Wei wychao1987@gmail.com

Xiaohui Shen shenxiaohui@bytedance.com

Jiashi Feng elefjia@nus.edu.sg

Song-Chun Zhu sczhu@stat.ucla.edu