Learning and Matching of Dynamic & Robust Visual Tracking Recognition

Ramesh Mande, A Sudhir Babu

Abstract


We learn explicit representations of the dynamic shape manifolds of moving humans for the task of action recognition. We exploit locality preserving projections (LPP) for dimensionality reduction, which yields a low-dimensional embedding of human movements. Given a sequence of moving silhouettes associated with an action video, we use LPP to project the silhouettes into a low-dimensional space that characterizes the spatiotemporal properties of the action while preserving much of its geometric structure. Action classification is then performed in a nearest neighbor framework. To evaluate the proposed method, extensive experiments have been carried out on a recent dataset comprising ten actions performed by nine different subjects. The experimental results show that the proposed method not only recognizes human actions effectively, but also tolerates a range of challenging conditions, e.g., partial occlusion; low-quality video; changes in viewpoint, scale, and clothing; within-class variations caused by subjects with different physical builds; and differing styles of motion.
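The pipeline summarized in the abstract can be illustrated with a minimal sketch, assuming NumPy and silhouettes that have already been extracted and flattened into row vectors. The function names (`lpp_fit`, `sequence_distance`, `classify`) and the parameters `n_neighbors`, `t`, and `reg` are hypothetical and chosen for illustration only. The abstract states only that classification uses a nearest neighbor framework, so the symmetric mean nearest-frame distance below is an assumed stand-in, not the authors' matching measure.

```python
import numpy as np


def lpp_fit(X, n_components=2, n_neighbors=5, t=1.0, reg=1e-6):
    """Fit a linear LPP projection (illustrative sketch).

    X is (n_frames, n_features): one flattened silhouette per row.
    Returns P of shape (n_features, n_components); embed with X @ P.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between frames.
    sq = np.sum(X * X, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # k-nearest-neighbour graph with heat-kernel weights, symmetrised.
    masked = d2.copy()
    np.fill_diagonal(masked, np.inf)  # a frame is not its own neighbour
    nbrs = np.argsort(masked, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    # Generalised eigenproblem (X^T L X) a = lambda (X^T D X) a,
    # solved through a Cholesky factor of the regularised right-hand matrix.
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(X.shape[1])
    C = np.linalg.cholesky(B)
    Ci = np.linalg.inv(C)
    M = Ci @ A @ Ci.T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)  # eigenvalues ascending
    # The eigenvectors with the smallest eigenvalues give the projection.
    return Ci.T @ vecs[:, :n_components]


def sequence_distance(Ya, Yb):
    """Symmetric mean nearest-frame distance between embedded sequences.

    Assumed matching measure; the source does not specify one here.
    """
    d = np.linalg.norm(Ya[:, None, :] - Yb[None, :, :], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())


def classify(query, gallery, labels, P):
    """Return the label of the gallery sequence nearest to the query."""
    yq = query @ P
    dists = [sequence_distance(yq, g @ P) for g in gallery]
    return labels[int(np.argmin(dists))]
```

In use, `lpp_fit` would be called once on the stacked training frames, and `classify` would then compare each test sequence against the labeled training sequences in the embedded space. The regularization term `reg` is added only to keep the right-hand matrix positive definite when there are more features than frames.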

Keywords


Action Recognition, Dimensionality Reduction, Human Motion Analysis, Locality Preserving Projections (LPP).

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.