Acoustic Scene Classification with Modulation Spectrogram Features and a Convolutional Recurrent Network

Publication year: 1400 (Solar Hijri calendar)
Document type: Conference paper
Language: English

Length: 7 pages (PDF)


National scientific document ID: ISAV11_037

Indexing date: 20 Bahman 1400

Abstract:

A major objective of artificial intelligence systems is to make machines aware of their environment. Acoustic scene classification (ASC) aims to identify the acoustic scene in which a sound was recorded. In this paper, we propose a novel feature extraction approach based on modulation spectrogram features instead of the commonly used Mel spectrogram, since the modulation spectrogram provides more discriminative features for classification. We split each recording into several temporal segments and compute the modulation spectrogram of each segment individually. The resulting feature tensors form the input to a Convolutional Long Short-Term Memory (Conv-LSTM) model for classification. The convolutional layers extract the spectral structure of the audio signal, while the LSTM captures the temporal information useful for classification. The proposed model outperforms state-of-the-art methods in prediction accuracy on the evaluation data of the DCASE 2017 dataset for ASC.
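
The segment-wise feature extraction described in the abstract can be sketched roughly as follows. The abstract does not state the exact definition or parameters the authors use, so this sketch assumes one common definition of the modulation spectrogram: a second Fourier transform taken along the time axis of each acoustic-frequency band of a magnitude STFT. The function names, `frame_len`, `hop`, `n_segments`, the Hann window, and the per-band mean removal are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def stft_magnitude(x, frame_len=1024, hop=512):
    """Magnitude STFT with a Hann window; returns (n_freq_bins, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def modulation_spectrogram(x, frame_len=1024, hop=512):
    """Assumed definition: FFT along time of each band's magnitude envelope.

    Returns an array of shape (acoustic frequency, modulation frequency).
    """
    mag = stft_magnitude(x, frame_len, hop)
    # Assumption: subtract each band's mean so the zero-modulation (DC) bin
    # does not dominate the result.
    mag = mag - mag.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(mag, axis=1))


def segment_features(x, n_segments=10, frame_len=1024, hop=512):
    """Split a recording into equal segments and stack their modulation
    spectrograms into a (segment, acoustic freq, modulation freq) tensor."""
    seg_len = len(x) // n_segments
    segments = [x[k * seg_len : (k + 1) * seg_len] for k in range(n_segments)]
    return np.stack(
        [modulation_spectrogram(s, frame_len, hop) for s in segments]
    )


# Usage on a synthetic 10 s signal at 16 kHz: a 1 kHz tone whose amplitude
# is modulated at 4 Hz (a stand-in for a real recording).
sr = 16000
t = np.arange(10 * sr) / sr
x = (1 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
features = segment_features(x, n_segments=10)
print(features.shape)  # (10, 513, 16)
```

Under these assumptions, each recording becomes a sequence of two-dimensional maps, one per segment. A sequence of this form is the kind of input a Conv-LSTM consumes, with convolutions operating within each map and the recurrence running across segments.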

Keywords:

Acoustic scene classification, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Conv-LSTM, modulation spectrogram

Authors

Sayeh Mirzaei

School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran

Saeedeh Davoudi

School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran