A Hybrid Deep Learning Architecture Using 3D CNNs and GRUs for Human Action Recognition

M. Savadi Hosseini; F. Ghaderi

Published in: International Journal of Engineering (IJE), Vol. 33, Issue 5

Publication year: 1399 (Solar Hijri calendar)

Type: Journal paper

Language: English

Length: 7 pages (PDF)

This paper is classified under the following subject areas:

Artificial Intelligence > Deep Learning

Link to this Paper:

https://civilica.com/doc/1021750

Document National Code:

JR_IJE-33-5_029

Index date: 14 June 2020

Abstract

Video content varies along both the temporal and spatial dimensions, and recognizing human actions requires accounting for changes in both. To this end, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their combinations have been used to model video dynamics. However, a hybrid architecture usually results in a more complex model and hence a greater number of parameters to optimize. In this study, we propose a stack of gated recurrent unit (GRU) layers on top of a two-stream inflated convolutional neural network. The first stream processes the raw frames of the video and the second stream processes its optical flow. We first segment the video frames so that the video content can be tracked in more detail, and use 3D CNNs to extract spatio-temporal features from the segments, called local features. We then feed the sequence of local features to the GRU network, and use a weighted averaging operator to aggregate the outputs of the two processing streams, called global features. Evaluations on the HMDB51 and UCF101 datasets confirm acceptable results. The proposed method achieved a 1.6% improvement in classification accuracy on the challenging HMDB51 dataset compared to the best previously reported results.
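The data flow described in the abstract (segment the frames, extract a local feature per segment, run the feature sequence through a GRU, then fuse the two streams by weighted averaging) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: `clip_feature` is a hypothetical placeholder for the inflated 3D CNN, frames are reduced to single numbers, one GRU layer without biases stands in for the stack of GRU layers, a softmax over the final hidden state stands in for the classifier, and the clip length, random weights, and fusion weight `w_rgb` are all assumed values.

```python
import math
import random

def segment(frames, clip_len):
    """Split frames into consecutive, non-overlapping clips.
    Assumption: trailing frames that do not fill a clip are dropped."""
    n_clips = len(frames) // clip_len
    return [frames[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]

def clip_feature(clip):
    # Hypothetical placeholder for the 3D CNN "local feature":
    # here just the mean and the range of the clip values.
    return [sum(clip) / len(clip), max(clip) - min(clip)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def init_gru(in_dim, hid_dim, rng):
    """Random (untrained) weights for the update (z), reset (r), candidate (n) paths."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrn":
        p["W" + gate] = mat(hid_dim, in_dim)
        p["U" + gate] = mat(hid_dim, hid_dim)
    return p

def gru_step(h, x, p):
    """One GRU update (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], rh))]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def stream_scores(frames, clip_len, p, n_classes):
    """Local features per clip -> GRU over the sequence -> class scores."""
    h = [0.0] * n_classes  # simplification: hidden size equals number of classes
    for clip in segment(frames, clip_len):
        h = gru_step(h, clip_feature(clip), p)
    return softmax(h)

def fuse(rgb_scores, flow_scores, w_rgb=0.5):
    """Weighted average of the two streams' scores."""
    return [w_rgb * a + (1 - w_rgb) * b for a, b in zip(rgb_scores, flow_scores)]

rng = random.Random(0)
n_classes = 3
rgb_frames = [rng.random() for _ in range(16)]   # toy stand-in for raw frames
flow_frames = [rng.random() for _ in range(16)]  # toy stand-in for optical flow
rgb_p = init_gru(2, n_classes, rng)
flow_p = init_gru(2, n_classes, rng)
scores = fuse(stream_scores(rgb_frames, 4, rgb_p, n_classes),
              stream_scores(flow_frames, 4, flow_p, n_classes), w_rgb=0.6)
predicted = max(range(n_classes), key=scores.__getitem__)
```

Because each stream ends in a softmax and the fusion weights sum to one, the fused `scores` remain a probability distribution over the classes; `predicted` is the index of the highest fused score.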

Keywords:

Inflated convolutional neural networks, gated recurrent unit, action recognition, two-stream architecture

Authors

M. Savadi Hosseini

Human-Computer Interaction Lab, Department of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran

F. Ghaderi

Human-Computer Interaction Lab, Department of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran