{"id":14,"date":"2016-11-11T00:37:37","date_gmt":"2016-11-11T00:37:37","guid":{"rendered":"http:\/\/134.74.17.190\/wordpress\/?page_id=14"},"modified":"2016-11-11T00:38:25","modified_gmt":"2016-11-11T00:38:25","slug":"projects","status":"publish","type":"page","link":"https:\/\/media-lab.ccny.cuny.edu\/?page_id=14","title":{"rendered":"Projects"},"content":{"rendered":"<div style=\"text-align: justify;\">\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><a href=\"http:\/\/yangxd.org\/publications\/\"><strong><span style=\"font-size: 14pt;\"><strong>Super Normal Vector<\/strong><\/span><\/strong><\/a><\/span><\/p>\n<\/div>\n<div style=\"text-align: justify;\"><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/div>\n<div><span style=\"font-family: 'book antiqua', palatino, serif;\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-548 aligncenter\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/wp-content\/uploads\/2014\/05\/xdyang-cvpr2014.png\" alt=\"xdyang-cvpr2014\" width=\"568\" height=\"433\" \/><\/span><\/div>\n<div style=\"text-align: justify;\"><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/div>\n<div style=\"text-align: justify;\">\n<p style=\"color: #333333;\"><span style=\"font-family: 'book antiqua', palatino, serif;\"><span style=\"font-size: 12pt;\">X. Yang and Y. Tian. <em>Super Normal Vector for Activity Recognition Using Depth Sequences<\/em>. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. 
[<a style=\"color: #4183c4;\" href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/~media-server\/Publications\/CVPR2014.pdf\">PDF<\/a>] <\/span><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\"><span style=\"text-decoration: underline;\"><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/data\/#Code\">SNV<\/a><\/span> is an open-source MATLAB\/C++ implementation of the super normal vector (SNV) for human activity recognition using depth sequences. <span style=\"color: #0000ff;\"><strong> <span style=\"color: #800080;\"><em><span style=\"font-size: 10pt;\">(Please cite our paper if you use the code)<\/span><\/em><\/span><\/strong><\/span><\/span><\/p>\n<\/div>\n<div style=\"text-align: justify;\"><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/div>\n<div style=\"text-align: justify;\"><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/div>\n<div style=\"text-align: justify;\">\n<hr \/>\n<\/div>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><span style=\"font-size: 14pt;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/~zcy\/projects_page\/FG.htm\">Hand Gesture Recognition <\/a><\/strong><\/span><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/~zcy\/projects_page\/FG\/sample.jpg\" alt=\"\" width=\"804\" height=\"249\" \/><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">We propose a novel 3D descriptor for hand gesture recognition from depth maps captured by Kinect-style cameras: the Histogram of 3D Facets (H3DF). 
Unlike previous methods, which largely reused existing 2D image descriptors designed for RGB or grayscale images, our descriptor starts from the essential feature of a 3D object &#8212; its surface.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">We first introduce the notion of a 3D facet as a collection of discrete 3D cloud points. A normal-based coding process then encodes each facet into a compact form. Finally, concentric spatial pooling is applied to suppress subject-specific variance while preserving the variance between different gestures.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-size: 14pt; font-family: 'book antiqua', palatino, serif;\"><a href=\"http:\/\/yangxd.org\/projects\/rgbd\/\"><strong>Human Action Recognition Using Eigen Joints<\/strong><\/a><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/~yang\/xyang\/projects_page\/3D_Action_Recognition\/eigjoints.jpg\" alt=\"\" width=\"793\" height=\"435\" \/><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">We propose an EigenJoints feature for human action recognition from sequences of 3D skeleton joints. Differences of joint positions are computed to capture posture, motion, and offset information, and Principal Component Analysis (PCA) is applied to obtain a compact representation. A Naive-Bayes-Nearest-Neighbor classifier is then employed to recognize actions.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><span style=\"font-size: 14pt;\"><a href=\"http:\/\/yangxd.org\/projects\/assistive\/\">Context based indoor object detection<\/a><\/span><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/doorDec\/door_results.jpg\" alt=\"\" width=\"737\" height=\"391\" \/><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\"> Robust and efficient indoor object detection can help people with severe vision impairment independently access unfamiliar indoor environments. This project explores new methods of indoor object detection that incorporate context information such as signage (both text and iconic) and other visual cues.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><span style=\"font-size: 14pt;\"><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_textRec.jsp\">Camera-based Text Recognition from Complex Backgrounds<\/a> <\/span><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><img decoding=\"async\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/pro_text.jpg\" alt=\"\" \/><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">The goal of the project is to develop new computer vision algorithms for camera-based text recognition from complex backgrounds and non-flat surfaces in combination 
with commercial off-the-shelf (COTS) optical character recognition (OCR) and Text-to-Speech (TTS) software.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><span style=\"font-size: 14pt;\"><strong><a href=\"http:\/\/yangxd.org\/projects\/surveillance\/\">Surveillance Event Detection<\/a><\/strong><\/span> <\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/~yang\/xyang\/projects_page\/Surveillance_Event_Detection\/Surveillance_Event_Detection.htm\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/~yang\/xyang\/projects_page\/Surveillance_Event_Detection\/Cover.jpg\" alt=\"\" width=\"801\" height=\"356\" \/><br \/>\n<\/a><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">A general event detection system is proposed and evaluated on the Surveillance Event Detection (SED) task of the TRECVID 2012 campaign. A sliding temporal window is employed as the detection unit in our system. 
We also investigate the spatial priors of various events by estimating the spatial distributions of actions under different camera views.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-size: 14pt; font-family: 'book antiqua', palatino, serif;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_sence.jsp\">Scene Understanding<\/a><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_sence.jsp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/sence.png\" alt=\"\" width=\"792\" height=\"264\" \/><\/a><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">Scene understanding and meaningful object detection and tracking play an important role in many applications. Unlike most existing scene understanding methods, which only attempt to answer the question of &#8220;what&#8221; is in the image, we will attempt to answer both &#8220;what&#8221; and &#8220;where&#8221; in images\/videos by combining object class recognition (what) and object detection (where).<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><span style=\"font-size: 14pt;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_facial.jsp\">Automatic Affect Detection, Segmentation, and Recognition by Fusion of Facial Features and Body Gestures<\/a><\/strong><\/span><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 14pt;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" 
src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/facial.png\" alt=\"\" width=\"753\" height=\"251\" \/><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">Research on expression analysis in naturalistic environments will have significant impact across a range of theoretical and applied topics. Real-life expression analysis must handle head motion (both in-plane and out-of-plane), occlusion, lighting change, low-intensity expressions, low-resolution input images, and the absence of a neutral face for comparison. Expression analysis should also combine multiple modalities, such as facial expressions and body gestures, and face and voice.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><span style=\"font-size: 14pt;\"> <strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_serveillance.jsp\">Video Surveillance and Abnormal Event Detection<\/a><\/strong><\/span><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/surveillance.png\" alt=\"\" width=\"717\" height=\"239\" \/><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\"> The goal of this project is to advance &#8220;intelligent video surveillance&#8221; from a narrow security focus to a comprehensive intelligence focus through the creation and use of data mining algorithms. 
Research efforts will focus on three challenges in video surveillance: 1) composite event detection; 2) automatic association mining and pattern discovery; and 3) privacy protection.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-size: 14pt; font-family: 'book antiqua', palatino, serif;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_action.jsp\">Human Action Recognition<\/a><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><strong><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/action.png\" alt=\"\" width=\"750\" height=\"250\" \/><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\">Action recognition with cluttered and moving backgrounds is a challenging problem. One main difficulty lies in the fact that the motion field in an action region is contaminated by background motions. 
This project focuses on developing new methods to detect and recognize human actions in real-world environments.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"><span style=\"font-size: 14pt;\"><strong><a href=\"http:\/\/media-lab.engr.ccny.cuny.edu\/project_clothes.jsp\">Recognizing and Matching Clothes with Complex Colors and Patterns<\/a><\/strong><\/span><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 14pt;\"><strong><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"http:\/\/media-lab.engr.ccny.cuny.edu\/images\/pro_img\/clothes.png\" alt=\"\" width=\"666\" height=\"222\" \/><\/strong><\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif; font-size: 12pt;\"> Recognizing and matching clothes with complex colors and patterns is a challenging task for blind people. The project focuses on developing effective and efficient algorithms and a prototype system that assists them using a camera connected to a computer to perform pattern and color matching and recognition. The algorithms are robust to variations in illumination, clothing rotation and wrinkling, large intra-class variations of clothing patterns, and complex colors. Audio feedback of recognition and matching results for both colors and patterns is provided to users.<\/span><\/p>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: 'book antiqua', palatino, serif;\"> <\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Super Normal Vector X. Yang and Y. Tian. Super Normal Vector for Activity Recognition Using Depth Sequences. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. 
[PDF] SNV is an open source MATLAB\/C++ implementation of the super normal vector (SNV) for human activity recognition using depth sequences. (Please cite our paper if you use [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"_links":{"self":[{"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=\/wp\/v2\/pages\/14"}],"collection":[{"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14"}],"version-history":[{"count":1,"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=\/wp\/v2\/pages\/14\/revisions"}],"predecessor-version":[{"id":15,"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=\/wp\/v2\/pages\/14\/revisions\/15"}],"wp:attachment":[{"href":"https:\/\/media-lab.ccny.cuny.edu\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}