ENVIRONMENTAL SOUND RECOGNITION USING SPECTROGRAM IMAGE FEATURES

Amogh Hiremath

Abstract: Most prior research on audio recognition has focused on speech and music. Only in recent years has Environmental Sound Recognition gained importance, with dozens of emerging works on the topic. For audio classification, many previous efforts use acoustic features such as Mel-frequency Cepstral Coefficients (MFCCs), Zero Crossing Rate (ZCR), Root Mean Square Energy (RMSE), spectral centroid, spectral bandwidth and other frequency domain features derived from the spectrogram of the audio. In this paper, we take a slightly different approach to feature extraction: we summarize short audio clips of about five seconds by segmenting out the most prominent part of the audio signal. We then compute the spectrogram image of the segmented audio and divide it into sub-bands along the frequency axis. For each sub-band, we extract first order statistics and Gray Level Co-occurrence Matrix (GLCM) features. In the classification stage, we combine two Support Vector Machine (SVM) classifiers. The first classifier uses the first order statistics and GLCM features. The second classifier uses acoustic features such as MFCCs, ZCR, RMSE, spectral centroid, spectral bandwidth and other frequency domain features derived from the spectrogram of the audio. The two classifiers are combined to obtain the final result. We evaluate our approach on two publicly available datasets, ESC-10 and Freiburg-106, using five-fold cross validation for ESC-10 and ten-fold cross validation for Freiburg-106. Experiments show that the proposed approach outperforms the baselines and provides results similar to the state of the art.

Keywords: Environmental Sound Classification, First Order Statistics, GLCM, Spectrogram, SVM

DOI: https://doi.org/10.15623/ijret.2017.0610015
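
The sub-band feature extraction described in the abstract can be illustrated with a minimal Python sketch using NumPy. It is not the author's implementation. The function names, the STFT settings (n_fft, hop), the number of sub-bands, the number of gray levels, the particular first order statistics (mean, variance, skewness, kurtosis) and the particular GLCM descriptors (contrast, energy, homogeneity) are all assumptions chosen for illustration. The segmentation of the most prominent part of the clip and the SVM stage are left out.

```python
import numpy as np

def log_spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)

def quantize(image, levels=8):
    """Map a real-valued image to integer gray levels 0..levels-1."""
    lo, hi = image.min(), image.max()
    scaled = (image - lo) / (hi - lo + 1e-12)
    return np.minimum((scaled * levels).astype(int), levels - 1)

def glcm(gray, levels=8):
    """Normalized co-occurrence matrix for horizontally adjacent pixels."""
    counts = np.zeros((levels, levels))
    # Count each (left pixel, right pixel) gray-level pair.
    np.add.at(counts, (gray[:, :-1].ravel(), gray[:, 1:].ravel()), 1)
    return counts / counts.sum()

def band_features(band, levels=8):
    """First order statistics plus three GLCM descriptors for one sub-band."""
    mean, std = band.mean(), band.std()
    z = (band - mean) / (std + 1e-12)
    first_order = [mean, band.var(), (z ** 3).mean(), (z ** 4).mean()]
    p = glcm(quantize(band, levels), levels)
    i, j = np.indices(p.shape)
    contrast = (p * (i - j) ** 2).sum()
    energy = (p ** 2).sum()
    homogeneity = (p / (1.0 + np.abs(i - j))).sum()
    return np.array(first_order + [contrast, energy, homogeneity])

def spectrogram_image_features(signal, n_bands=4, levels=8):
    """Concatenate per-band features; bands are split along the frequency axis."""
    spec = log_spectrogram(signal)
    bands = np.array_split(spec, n_bands, axis=0)
    return np.concatenate([band_features(b, levels) for b in bands])

# Example on synthetic noise standing in for a segmented audio clip.
rng = np.random.default_rng(0)
clip = rng.standard_normal(8000)
features = spectrogram_image_features(clip)
print(features.shape)
```

With these assumed settings each clip yields seven values per sub-band. A vector of this kind could then be passed to an SVM classifier, for example scikit-learn's SVC, alongside a second classifier trained on the acoustic features.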