Human fall detection using segment-level CNN features and sparse dictionary learning
Paper in proceedings, 2017
This paper addresses issues in human fall detection from videos. Unlike using handcrafted features in the conventional machine learning, we extract features from Convolutional Neural Networks (CNNs) for human fall detection. Similar to many existing work using two stream inputs, we use a spatial CNN stream with raw image difference and a temporal CNN stream with optical flow as the inputs of CNN. Different from conventional two stream action recognition work, we exploit sparse representation with residual-based pooling on the CNN extracted features, for obtaining more discriminative feature codes. For characterizing the sequential information in video activity, we use the code vector from long-range dynamic feature representation by concatenating codes in segment-levels as the input to a SVM classifier. Experiments have been conducted on two public video databases for fall detection. Comparisons with six existing methods show the effectiveness of the proposed method.
sparse dictionary learning
human fall detection
automatic feature learning