A Framework for Human Action Detection via Extraction of Multimodal Features
Lili Nurliyana Abdullah
Pages: 73-79 | Revised: 05-05-2009 | Published: 18-05-2009
Published in International Journal of Image Processing (IJIP)
MORE INFORMATION
KEYWORDS
audiovisual, human action detection, multimodal, hidden Markov model
ABSTRACT
This work discusses the application of an artificial intelligence technique, data extraction, together with a process-based ontology in constructing experimental qualitative models for video retrieval and detection. We present a framework architecture that uses multimodal features as the knowledge representation scheme to model the behaviors of a number of human actions in video scenes. The main focus of this paper is the design of the two main components (the model classifier and the inference engine) of a tool abbreviated VASD (Video Action Scene Detector) for retrieving and detecting human actions from video scenes. The discussion starts by presenting the workflow of the retrieval and detection process and the logic of automated model classifier construction. We then demonstrate how the constructed classifiers can be used with multimodal features to detect human actions. Finally, the manifestation of behavioral explanations is discussed. The simulator is implemented in multiple languages: MATLAB and C++ form the back end, supplying data and theories, while Java handles the front-end GUI and action-pattern updating. To evaluate the usefulness of the proposed framework, several experiments were conducted; results were obtained using visual features only (77.89% precision; 72.10% recall), audio features only (62.52% precision; 48.93% recall), and combined audiovisual features (90.35% precision; 90.65% recall).
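The abstract gives only the outline of the approach (multimodal audiovisual features classified with hidden Markov models); it does not specify the feature sets, fusion strategy, or implementation. As a rough illustration only, the sketch below shows one common way such a classifier could be arranged: one Gaussian HMM per action class trained on concatenated audio and visual feature vectors, with the highest-scoring model giving the predicted action. The early-fusion step, the hmmlearn dependency, and all function and parameter names are assumptions for illustration, not details taken from the paper.

# Minimal sketch (assumed, not the paper's implementation): one Gaussian HMM
# per action class, trained on concatenated audio + visual frame features.
import numpy as np
from hmmlearn import hmm  # assumed third-party dependency

def fuse(audio_feats, visual_feats):
    """Concatenate frame-level audio and visual features (early fusion)."""
    return np.hstack([audio_feats, visual_feats])

def train_action_models(training_data, n_states=4):
    """training_data: {action_name: list of (audio_seq, visual_seq) pairs}."""
    models = {}
    for action, sequences in training_data.items():
        fused = [fuse(a, v) for a, v in sequences]
        X = np.vstack(fused)                       # all frames stacked
        lengths = [seq.shape[0] for seq in fused]  # per-sequence frame counts
        m = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[action] = m
    return models

def classify(models, audio_feats, visual_feats):
    """Return the action whose HMM assigns the highest log-likelihood."""
    X = fuse(audio_feats, visual_feats)
    return max(models, key=lambda action: models[action].score(X))

With this kind of setup, the visual-only and audio-only baselines reported in the abstract would correspond to training the same per-class models on a single modality instead of the fused feature vectors.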
Dr. Lili Nurliyana Abdullah
- Malaysia
liyana@fsktm.upm.edu.my