Projects

Super Normal Vector for Activity Recognition

X. Yang and Y. Tian. Super Normal Vector for Activity Recognition Using Depth Sequences. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. [PDF]

SNV is an open-source MATLAB/C++ implementation of the super normal vector representation for human activity recognition from depth sequences. (Please cite our paper if you use the code.)
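Surface normals estimated densely from each depth frame are the low-level ingredient that SNV builds on. As a rough illustration only (this is not the released MATLAB/C++ code; the function name and focal-length defaults are assumptions), the normals can be estimated from depth gradients as follows:

    import numpy as np

    def depth_normals(depth, fx=525.0, fy=525.0):
        """Estimate per-pixel surface normals from a depth map (meters).

        depth : (H, W) array of depth values; fx, fy are illustrative
        Kinect-style focal lengths.
        """
        # Depth gradients approximate the local surface slope.
        dz_dv, dz_du = np.gradient(depth)          # rows (v), cols (u)
        # Convert image-plane gradients to metric slopes.
        dz_dx = dz_du * fx / np.maximum(depth, 1e-6)
        dz_dy = dz_dv * fy / np.maximum(depth, 1e-6)
        # The normal of the surface z = f(x, y) is (-dz/dx, -dz/dy, 1), normalized.
        n = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
        n /= np.linalg.norm(n, axis=2, keepdims=True)
        return n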


Hand Gesture Recognition

We propose a novel 3D descriptor for hand gesture recognition from depth maps captured by Kinect-style cameras: the Histogram of 3D Facets (H3DF). Unlike previous methods, which largely reuse 2D image descriptors designed for RGB or grayscale channels, our descriptor starts from the defining property of a 3D object: its surface.

We first introduce the notion of a 3D facet as a collection of discrete 3D cloud points. A normal-based coding process then encodes each facet into a compact form, and a concentric spatial pooling step suppresses subject-specific variance while preserving the variance between different gestures.
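As a simplified sketch of these two steps (not the actual H3DF coding: normals are reduced to azimuth angles only, and the bin counts and ring radii are illustrative assumptions):

    import numpy as np

    def h3df_like_descriptor(points, normals, n_rings=3, n_bins=8):
        """Toy histogram-of-facets descriptor.

        points  : (N, 3) 3D cloud points of the hand region
        normals : (N, 3) unit normals (one per facet/point)
        """
        center = points.mean(axis=0)
        # Concentric pooling: assign each point to a ring by radial distance.
        radius = np.linalg.norm(points - center, axis=1)
        ring_edges = np.linspace(0, radius.max() + 1e-6, n_rings + 1)
        ring_idx = np.clip(np.digitize(radius, ring_edges) - 1, 0, n_rings - 1)
        # Normal-based coding: quantize the azimuth of each facet normal.
        azimuth = np.arctan2(normals[:, 1], normals[:, 0])      # [-pi, pi]
        bin_idx = np.clip(((azimuth + np.pi) / (2 * np.pi) * n_bins).astype(int),
                          0, n_bins - 1)
        # One orientation histogram per ring, concatenated and L2-normalized.
        desc = np.zeros((n_rings, n_bins))
        np.add.at(desc, (ring_idx, bin_idx), 1.0)
        desc = desc.ravel()
        return desc / (np.linalg.norm(desc) + 1e-12)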


Human Action Recognition Using EigenJoints

We propose EigenJoints, a compact skeleton-based representation for action recognition from depth sequences. Each frame is described by differences of 3D joint positions that capture static posture, motion between consecutive frames, and offset with respect to the initial frame; principal component analysis reduces these joint differences to the EigenJoints descriptor, and a Naive-Bayes-Nearest-Neighbor classifier performs recognition.
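As a rough illustration of the per-frame joint-difference features (not the published implementation; the pairing scheme and normalization are simplified assumptions):

    import numpy as np

    def eigenjoints_like_feature(joints_t, joints_prev, joints_init):
        """Toy joint-difference feature for one frame.

        Each argument is a (J, 3) array of 3D skeleton joint positions:
        current frame, previous frame, and first frame of the sequence.
        """
        J = joints_t.shape[0]
        iu = np.triu_indices(J, k=1)
        # Static posture: pairwise differences within the current frame.
        fcc = (joints_t[:, None, :] - joints_t[None, :, :])[iu].ravel()
        # Motion: differences between current and previous frame joints.
        fcp = (joints_t[:, None, :] - joints_prev[None, :, :]).ravel()
        # Offset: differences between current and initial frame joints.
        fci = (joints_t[:, None, :] - joints_init[None, :, :]).ravel()
        return np.concatenate([fcc, fcp, fci])

    # Stacking these per-frame features over a sequence and applying PCA
    # (e.g., sklearn.decomposition.PCA) yields the compact EigenJoints.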


Context based indoor object detection

Robust and efficient indoor object detection can help people with severe vision impairment independently access unfamiliar indoor environments. This project explores new methods of indoor object detection that incorporate context information such as signage (both text and iconic) and other visual cues.
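As a minimal sketch of the idea of incorporating context (the function, cue names, and weighting are illustrative assumptions, not the project's actual model):

    def rescore_with_context(detector_score, context_scores, weight=0.3):
        """Combine an appearance-based detection score with context cues.

        detector_score : confidence in [0, 1] from an object detector
        context_scores : dict of cue name -> confidence in [0, 1], e.g.
                         {"restroom_sign": 0.8, "door_frame": 0.6}
        weight         : how strongly context shifts the final score
        """
        if not context_scores:
            return detector_score
        context = sum(context_scores.values()) / len(context_scores)
        # Convex combination keeps the result in [0, 1].
        return (1.0 - weight) * detector_score + weight * context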


Camera-based Text Recognition from Complex Backgrounds

The goal of the project is to develop new computer vision algorithms for camera-based text recognition from complex backgrounds and non-flat surfaces in combination with commercial off-the-shelf (COTS) optical character recognition (OCR) and Text-to-Speech (TTS) software.
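As a minimal sketch of chaining OCR and TTS engines (pytesseract and pyttsx3 are stand-ins for the COTS software; localize_text_regions is a hypothetical placeholder for the project's research component, the text localization from complex backgrounds):

    from PIL import Image
    import pytesseract   # wrapper around the Tesseract OCR engine
    import pyttsx3       # offline text-to-speech engine

    def localize_text_regions(image):
        """Placeholder: here the whole frame is treated as one text region."""
        return [image]

    def read_image_aloud(path):
        image = Image.open(path)
        engine = pyttsx3.init()
        for region in localize_text_regions(image):
            text = pytesseract.image_to_string(region).strip()
            if text:
                engine.say(text)
        engine.runAndWait()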


Surveillance Event Detection

A general event detection system is proposed and evaluated on the Surveillance Event Detection (SED) task of the TRECVID 2012 campaign. A sliding temporal window is employed as the detection unit in our system. We also investigate the spatial priors of various events by estimating spatial distributions of actions under different camera views.
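As a minimal sketch of using a sliding temporal window as the detection unit (the window length, stride, scoring function, and threshold are illustrative assumptions):

    def sliding_window_detections(frame_scores, window=60, stride=15, thresh=0.5):
        """Scan per-frame event scores with a fixed-length temporal window.

        frame_scores : sequence of per-frame scores for one event class.
        Returns (start_frame, end_frame, score) for windows whose mean
        score exceeds the threshold.
        """
        detections = []
        for start in range(0, max(1, len(frame_scores) - window + 1), stride):
            chunk = frame_scores[start:start + window]
            score = sum(chunk) / len(chunk)
            if score > thresh:
                detections.append((start, start + window, score))
        return detections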


Scene Understanding

Scene understanding and meaningful object detection and tracking play an important role in many applications. Unlike most existing scene understanding methods, which only attempt to answer the question of “what” is in the image, we attempt to answer both “what” and “where” in images and videos by combining object class recognition (what) and object detection (where).


Automatic Affect Detection, Segmentation, and Recognition by Fusion of Facial Features and Body Gestures

Research on expression analysis in naturalistic environments will have significant impact across a range of theoretical and applied topics. Real-life expression analysis must handle head motion (both in-plane and out-of-plane), occlusion, lighting changes, low-intensity expressions, low-resolution input images, and the absence of a neutral face for comparison. Expression analysis should also combine multiple modalities, such as facial expressions, body gestures, and voice.
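As a minimal sketch of decision-level fusion across modalities (the modality names, score format, and uniform weighting are illustrative assumptions; the project may instead fuse at the feature level):

    def fuse_modalities(scores, weights=None):
        """Weighted late fusion of per-modality affect scores.

        scores  : dict of modality -> dict of emotion -> confidence, e.g.
                  {"face": {"joy": 0.7, ...}, "body": {...}, "voice": {...}}
        weights : optional dict of modality -> weight (defaults to uniform)
        """
        modalities = list(scores)
        weights = weights or {m: 1.0 / len(modalities) for m in modalities}
        fused = {}
        for m in modalities:
            for emotion, conf in scores[m].items():
                fused[emotion] = fused.get(emotion, 0.0) + weights[m] * conf
        # The emotion with the highest fused score wins.
        return max(fused, key=fused.get), fused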


Video Surveillance and Abnormal Event Detection

The goal of this project is to advance “intelligent video surveillance” from a narrow security focus to a comprehensive intelligence focus through the creation and use of data mining algorithms. Research efforts will focus on three challenges in video surveillance: 1) composite event detection; 2) automatic association mining and pattern discovery; and 3) privacy protection.


Human Action Recognition

Action recognition against cluttered and moving backgrounds is a challenging problem. One main difficulty lies in the fact that the motion field in an action region is contaminated by background motion. This project focuses on developing new methods to detect and recognize human actions in real-world environments.
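As an illustration of the difficulty described above (this is a common background-compensation heuristic, not the project's method, and it assumes the background dominates the frame), global motion can be estimated as the median dense flow and subtracted:

    import cv2
    import numpy as np

    def foreground_flow(prev_gray, curr_gray):
        """Dense optical flow with an estimate of background motion removed."""
        # Farneback dense flow; the numeric arguments are pyramid scale,
        # levels, window size, iterations, poly_n, poly_sigma, and flags.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # If the background occupies most of the image, the median flow
        # approximates camera/background motion; the residual is action motion.
        background = np.median(flow.reshape(-1, 2), axis=0)
        return flow - background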


Recognizing and Matching Clothes with Complex Colors and Patterns

Recognizing and matching clothes with complex colors and patterns is a challenging task for blind people. The project focuses on developing effective and efficient algorithms and a prototype system to assist them, using a camera connected to a computer to perform pattern and color matching and recognition. The algorithms are robust to variations in illumination, clothing rotation and wrinkling, large intra-class variations of clothing patterns, and complex colors. Audio feedback of the recognition and matching results for both colors and patterns is provided to users.
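As a minimal sketch of comparing two clothing images by color and pattern (an HSV hue histogram plus a rotation-invariant gradient-orientation signature; the bin counts and weighting are illustrative assumptions and much simpler than the project's algorithms):

    import numpy as np
    import cv2

    def clothing_signature(bgr, color_bins=16, orient_bins=36):
        """Color + pattern signature for a clothing image (BGR, uint8)."""
        hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
        color_hist = cv2.calcHist([hsv], [0], None, [color_bins], [0, 180]).ravel()
        color_hist /= color_hist.sum() + 1e-12
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
        angle = np.arctan2(gy, gx)                     # [-pi, pi]
        mag = np.hypot(gx, gy)
        orient_hist, _ = np.histogram(angle, bins=orient_bins,
                                      range=(-np.pi, np.pi), weights=mag)
        # Rotating the garment circularly shifts this histogram, so the FFT
        # magnitude of the histogram is approximately rotation invariant.
        pattern = np.abs(np.fft.rfft(orient_hist / (orient_hist.sum() + 1e-12)))
        return color_hist, pattern

    def match_score(sig_a, sig_b, w_color=0.5):
        """Similarity in [0, 1] combining color and pattern terms."""
        color_sim = 1.0 - 0.5 * np.abs(sig_a[0] - sig_b[0]).sum()
        pattern_sim = 1.0 - np.abs(sig_a[1] - sig_b[1]).sum() / (
            np.abs(sig_a[1]).sum() + np.abs(sig_b[1]).sum() + 1e-12)
        return w_color * color_sim + (1 - w_color) * pattern_sim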