Learning Document Image Features With SqueezeNet Convolutional Neural Network

M. Hassanpour; H. Malek

Published in: International Journal of Engineering (IJE), Vol. 33, Issue 7

Publication year: 1399 (Solar Hijri)

Document type: Journal article

Language: English

Length: 7 pages (PDF)

This paper is classified under the following subject areas:

Artificial Intelligence > Neural Networks

Permanent link to this paper:

https://civilica.com/doc/1040940

National scientific document ID:

JR_IJE-33-7_005

Indexing date: 4 Shahrivar 1399

Abstract:

Classifying document images into their respective classes is an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered the current state-of-the-art model for this task. However, these classifiers have two major drawbacks: the large amount of computational power they demand for training, and their very large number of weights. Previous successful attempts at learning document image features have been based on training very large CNNs. SqueezeNet is a CNN architecture that achieves accuracies comparable to other state-of-the-art CNNs while containing up to 50 times fewer weights, but it has not previously been evaluated on document image classification tasks. In this research we take a novel approach to learning document image features by training a very small CNN, SqueezeNet. We show that an ImageNet-pretrained SqueezeNet achieves an accuracy of approximately 75 percent over 10 classes on the Tobacco-3482 dataset, which is comparable to other state-of-the-art CNNs. We then visualize saliency maps of the gradient of our trained SqueezeNet's output with respect to its input, which show that the network is able to learn meaningful features that are useful for document classification. Previous works in this field have placed no emphasis on visualizing the learned document features. The importance of features such as the existence of handwritten text, document titles, text alignment and tabular structures in the extracted saliency maps shows that the network does not overfit to redundant representations of the rather small Tobacco-3482 dataset, which contains only 3482 document images over 10 classes.
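The saliency maps mentioned in the abstract are built from the gradient of a class score with respect to each input pixel. The sketch below illustrates that idea only; it is not the authors' implementation. The toy scorer `class_score`, the six-pixel "image", and the weights are hypothetical stand-ins for a trained SqueezeNet and a document scan, and the gradient is estimated by central finite differences rather than by backpropagation, which is what a real pipeline would presumably use.

```python
import math


def class_score(pixels, weights, bias):
    """Hypothetical stand-in for one class output of a trained network:
    tanh of a weighted sum of the input pixels."""
    z = sum(w * p for w, p in zip(weights, pixels)) + bias
    return math.tanh(z)


def saliency_map(score_fn, pixels, eps=1e-5):
    """Return |d score / d pixel_i| for every pixel.

    The gradient is approximated with central finite differences
    (an assumption made to keep the sketch dependency-free).
    """
    saliency = []
    for i in range(len(pixels)):
        up = list(pixels)
        up[i] += eps
        down = list(pixels)
        down[i] -= eps
        grad_i = (score_fn(up) - score_fn(down)) / (2 * eps)
        saliency.append(abs(grad_i))
    return saliency


# Hypothetical 2x3 "image", flattened row by row (values are made up).
pixels = [0.0, 0.5, 1.0, 0.2, 0.0, 0.8]
# Made-up weights: only pixels 1, 2 and 4 influence the score.
weights = [0.0, 2.0, -1.0, 0.0, 0.5, 0.0]
bias = 0.1

sal = saliency_map(lambda p: class_score(p, weights, bias), pixels)
most_salient = max(range(len(sal)), key=sal.__getitem__)
```

Pixels whose perturbation changes the score most receive the largest saliency; in this toy setup that is the pixel at index 1, while pixels with zero weight receive zero saliency. On real document images the same quantity, computed per pixel, would highlight regions such as titles or tables if the network relies on them.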

Keywords:

SqueezeNet, convolutional neural network, document image classification

Authors

M. Hassanpour

Department of Computer Science Engineering, Shahid Beheshti University, Tehran, Iran

H. Malek

Department of Computer Science Engineering, Shahid Beheshti University, Tehran, Iran
