Action Recognition

Project Goal

Action recognition is an active research topic in computer vision and pattern recognition because of its wide range of potential applications, such as intelligent video surveillance, perceptual interfaces, and content-based video retrieval. A key issue in action recognition is extracting useful action information from raw video data. Since manifold learning methods have been used successfully in many image applications, this project follows the same direction and develops a manifold learning algorithm that uses temporal information for action recognition.

Manifold Learning for Action Recognition

The block diagram of the action recognition framework is shown in Figure 1. It consists of the following steps.

  1. After preprocessing the videos, a sequence of images containing the regions of interest is obtained.
  2. Each action unit is then represented by a set of feature vectors.
  3. Each feature representation is mapped to the embedded manifold by the proposed spatio-temporal manifold learning method.
  4. A set-based classifier is applied to the action sequence to obtain the action label.
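
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method from the publications below: a plain PCA projection stands in for the proposed spatio-temporal manifold learning, a per-frame nearest-neighbour majority vote stands in for the set-based classifier, and the feature vectors are synthetic. The function names (fit_linear_embedding, embed, classify_set) are hypothetical.

```python
from collections import Counter

import numpy as np


def fit_linear_embedding(features, dim):
    """Stand-in for step 3: learn a linear embedding with PCA (an assumption)."""
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:dim].T


def embed(features, mean, projection):
    """Map feature vectors into the learned low-dimensional space."""
    return (features - mean) @ projection


def classify_set(test_emb, train_emb, train_labels):
    """Stand-in for step 4: majority vote over per-frame nearest neighbours."""
    # Squared Euclidean distance from every test frame to every training frame.
    diff = test_emb[:, None, :] - train_emb[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    votes = train_labels[dist2.argmin(axis=1)]
    return Counter(votes.tolist()).most_common(1)[0][0]


# Steps 1-2 (assumed already done): each frame is a 16-D feature vector.
# The two synthetic action classes are well separated on purpose.
rng = np.random.default_rng(0)
n_frames, n_features = 20, 16
centers = {0: np.zeros(n_features), 1: np.full(n_features, 5.0)}
train_features = np.vstack(
    [centers[c] + rng.normal(size=(n_frames, n_features)) for c in (0, 1)]
)
train_labels = np.repeat([0, 1], n_frames)

# Step 3: learn the embedding on the training frames and project them.
mean, projection = fit_linear_embedding(train_features, dim=2)
train_emb = embed(train_features, mean, projection)

# Step 4: classify an unseen sequence drawn from class 1 as a whole set.
test_sequence = centers[1] + rng.normal(size=(n_frames, n_features))
predicted = classify_set(embed(test_sequence, mean, projection), train_emb, train_labels)
print("predicted action label:", predicted)
```

Voting over the whole sequence, rather than labelling each frame separately, is what makes the classifier set-based in this sketch: a few frames that fall near the wrong class do not change the final label.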

Figure 1: Manifold learning based action recognition framework


Figure 2: 2-D visualization of the training and testing data on Cambridge-Gesture database


Figure 3: Example neighbors detected by global constraint of temporal labels


Related Publications

  1. Andy J Ma, Pong C Yuen, Wilman W W Zou, and Jian-Huang Lai, “Supervised Spatio-Temporal Neighborhood Topology Learning for Action Recognition,” IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 23, no. 8, pp. 1447-1460, 2013. [PDF] [Project] [Code] [Data]
  2. Andy J Ma, Pong C Yuen, Wilman W W Zou, and Jian-Huang Lai, “Supervised Neighborhood Topology Learning for Human Action Recognition,” Workshops of IEEE International Conference on Computer Vision (ICCV Workshops), 2009. [PDF] [PPT]