NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
Journal article, 2020

Research on depth-based human activity analysis has achieved outstanding performance and demonstrated the effectiveness of 3D representations for action recognition. However, existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including a lack of large-scale training samples, realistic numbers of distinct class categories, diversity in camera views, varied environmental conditions, and variety in human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, collected from 106 distinct subjects and containing more than 114 thousand video samples and 8 million frames. The dataset covers 120 different action classes, including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset and show the advantage of applying deep learning methods to 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset and propose a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework for this task, which yields promising results for recognition of novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.

Skeleton

Lighting

Semantics

3D action recognition

video analysis

Deep learning

large-scale benchmark

Benchmark testing

Activity understanding

Cameras

Three-dimensional displays

RGB+D vision

Authors

Jun Liu

Nanyang Technological University

Amir Shahroudy

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering, Digital Image Systems and Image Analysis

Mauricio Perez

Nanyang Technological University

Gang Wang

Alibaba Group Holding Limited

Ling-Yu Duan

Peng Cheng Laboratory

Beijing University of Technology

Alex C. Kot

Nanyang Technological University

IEEE Transactions on Pattern Analysis and Machine Intelligence

0162-8828 (ISSN)

Vol. 42, No. 10, pp. 2684-2701

Subject Categories

Psychology (excluding Applied Psychology)

Occupational Therapy

Computer Vision and Robotics (Autonomous Systems)

DOI

10.1109/TPAMI.2019.2916873

PubMed

31095476

More information

Last updated

2020-10-08