Research on Music Genre Classification Based on Acoustic and Musical Features
Published: 2018-04-21 07:28
Topic: music genre classification + acoustic features; Source: master's thesis, Jiangnan University, 2014
【Abstract】: Automatic music genre classification is the process of using signal processing and pattern recognition methods to let a computer automatically identify the genre of digital music samples. Automatic music retrieval and genre classification have become research hotspots in recent years. Music samples are long and highly variable, consisting of multi-channel mixtures of different voices and instruments, so music genre classification is a difficult pattern recognition problem with substantial research and application value. Based on acoustic features and musical features, this thesis studies genre classification features and their extraction methods. The specific work is as follows:

1. The extraction of the musical beat is studied, and a genre classification method combining beat-based semantic features with MFCC acoustic features is proposed. Beat strength, tempo, and duration reflect important semantic characteristics of different genres, and beats lie mostly in the low-frequency band produced by percussion instruments, so a 6-level wavelet decomposition is applied to the music signal to extract low-frequency beat features. For genres whose beat features differ little, MFCC acoustic features, which describe the frequency-domain energy envelope, are combined with the beat features; based on an analysis of genre mechanisms, an 8th-order MFCC replaces the commonly used 12th-order MFCC. Experiments on 8 music genres show that combining the semantic and acoustic features achieves an overall classification accuracy of 68.4%, while the increased feature dimension has little effect on classification time.

2. A genre classification method based on modulation-spectrum features from separated spectrograms is studied. Analysis of the time-frequency characteristics of the percussive components that form the rhythm and the harmonic components that form the melody shows that features extracted directly from the music signal are affected by the interaction of these two components. Exploiting the different regularities of rhythm and harmony in the time-frequency plane, the spectrogram of the music signal is filtered to separate the percussive and harmonic components; wavelet modulation is then applied to each spectrogram separately, yielding modulation-spectrum features that capture the regularities of rhythm and melody. These are long-term, mid-level features expressing genre characteristics. Simulation results show that the separated percussive and harmonic spectrograms represent rhythm and melody more clearly. Extracting percussive and harmonic modulation-spectrum features for 8 genres, reducing dimensionality with LDA, and classifying with an SVM yields a classification accuracy of 73.5%.

3. A genre classification method based on multi-scale Gabor image texture features is studied. Most genre classification systems rely on acoustic features, but extracting acoustic features from multi-channel mixed signals degrades their discriminative power because the musical elements interfere with one another. At the same time, musical semantic elements appear as clear visual texture in the time-frequency spectrogram: the density and orientation of the texture indirectly reflect rhythm, melody, and other genre traits. From an image-processing perspective, multi-scale, multi-orientation 2D Gabor texture features are therefore extracted from the spectrogram image to capture time-frequency characteristics of the music signal from different viewpoints. Experiments on 8 genres show that the multi-scale Gabor texture features perform comparably to acoustic features, with an overall classification accuracy of 73.1% and a maximum of 83.3%.
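The low-frequency beat extraction in point 1 can be illustrated with a minimal sketch. The thesis applies a 6-level wavelet decomposition; the code below substitutes a plain Haar wavelet implemented in NumPy (an assumption, since the wavelet basis is not named here) and estimates the beat period from the autocorrelation of the resulting low-frequency envelope:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar wavelet transform: approximation + detail."""
    x = x[: len(x) // 2 * 2]                 # force even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)     # low-pass (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)     # high-pass (detail)
    return a, d

def low_freq_envelope(signal, levels=6):
    """Keep only the level-`levels` approximation, i.e. the low-frequency
    band where percussive beat energy concentrates."""
    a = np.asarray(signal, dtype=float)
    for _ in range(levels):
        a, _ = haar_dwt(a)
    return np.abs(a)                          # low-frequency amplitude envelope

def beat_period(envelope, min_lag=1):
    """Estimate the dominant beat period (in envelope samples) from the
    autocorrelation of the low-frequency envelope."""
    e = envelope - envelope.mean()
    ac = np.correlate(e, e, mode="full")[len(e) - 1:]
    return min_lag + int(np.argmax(ac[min_lag:]))

# Toy example: an impulse train ("beats") with a known period.
sr = 8000
t = np.arange(sr * 4)
period = 3200                                 # one beat every 0.4 s
sig = ((t % period) < 50).astype(float)
env = low_freq_envelope(sig, levels=6)
# After 6 halvings the period in envelope samples is period / 2**6 = 50.
print(beat_period(env))                       # -> 50
```

Each decomposition level halves the bandwidth, so 6 levels isolate roughly the bottom 1/64 of the spectrum, where percussion energy dominates.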
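The spectrogram filtering step of point 2 is not specified in detail here. One common realization of the same idea is median-filter based harmonic/percussive separation: harmonic partials form horizontal ridges in the magnitude spectrogram while percussive onsets form vertical ones, so axis-wise median filters and soft masks can pull them apart. A sketch of that technique (not necessarily the thesis's exact filter):

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import stft

def separate_percussive_harmonic(x, fs, kernel=17):
    """Split a magnitude spectrogram into harmonic and percussive parts:
    harmonics are smooth along time (horizontal ridges), percussion along
    frequency (vertical ridges), so axis-wise median filters enhance each."""
    f, t, Z = stft(x, fs=fs, nperseg=512)
    S = np.abs(Z)
    H = median_filter(S, size=(1, kernel))    # smooth across time -> harmonic
    P = median_filter(S, size=(kernel, 1))    # smooth across freq -> percussive
    eps = 1e-10
    mask_h = H / (H + P + eps)                # soft mask splits the energy
    return S * mask_h, S * (1 - mask_h)       # harmonic, percussive spectrograms

# Toy signal: a steady sine (harmonic) plus periodic clicks (percussive).
fs = 8000
t = np.arange(fs * 2) / fs
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
clicks = np.zeros_like(t)
clicks[::fs // 4] = 1.0                       # 4 clicks per second
harm, perc = separate_percussive_harmonic(tone + clicks, fs)
```

On this toy input, the 440 Hz tone energy ends up almost entirely in the harmonic spectrogram and the broadband click energy in the percussive one, which is the precondition for computing the two modulation spectra separately.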
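The LDA dimensionality reduction plus SVM back end of point 2 can be sketched with scikit-learn. The feature matrix below is synthetic stand-in data (the real input would be the extracted modulation-spectrum features); note that for 8 classes LDA can project to at most 7 dimensions:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for modulation-spectrum features: 8 genres,
# 60-dim features, well-separated class means (assumed, for illustration).
rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 8, 40, 60
means = rng.normal(scale=3.0, size=(n_classes, dim))
X = np.vstack([means[c] + rng.normal(size=(n_per_class, dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)

# LDA projects to at most n_classes - 1 = 7 discriminant directions,
# then an RBF-kernel SVM classifies in the reduced space.
clf = make_pipeline(LinearDiscriminantAnalysis(n_components=7),
                    SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)
print(f"accuracy: {clf.score(X_te, y_te):.3f}")
```

The pipeline order matters: fitting LDA inside the pipeline keeps the projection learned on the training split only, avoiding information leakage into the test accuracy.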
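The multi-scale Gabor texture features of point 3 can be sketched directly: build real Gabor kernels over a grid of spatial frequencies and orientations, filter the spectrogram image, and keep simple response statistics. The filter size and parameter grid below are illustrative assumptions, not the thesis's settings:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta, sigma=3.0, size=15):
    """Real 2-D Gabor kernel: a plane wave of spatial frequency `freq`
    at orientation `theta`, windowed by an isotropic Gaussian."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return env * np.cos(2 * np.pi * freq * xr)

def gabor_texture_features(img, freqs=(0.1, 0.2, 0.4),
                           thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Mean |response| and response std of each filter: one multi-scale,
    multi-orientation texture descriptor per spectrogram image."""
    feats = []
    for f in freqs:
        for th in thetas:
            r = fftconvolve(img, gabor_kernel(f, th), mode="same")
            feats += [np.abs(r).mean(), r.std()]
    return np.array(feats)

# Toy "spectrogram": horizontal stripes mimic steady harmonic partials.
img = np.sin(2 * np.pi * 0.2 * np.arange(64))[:, None] * np.ones((64, 64))
feats = gabor_texture_features(img)
print(feats.shape)    # (24,) = 3 scales x 4 orientations x 2 statistics
```

The orientation-matched filter (frequency 0.2, theta = pi/2, i.e. varying across rows) responds far more strongly to the horizontal stripes than the orthogonal one, which is exactly the texture selectivity the classifier exploits.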
【Degree-granting institution】: Jiangnan University
【Degree level】: Master's
【Year conferred】: 2014
【CLC classification】: J61; TP391.41
Article ID: 1781472
Link: https://www.wllwen.com/wenyilunwen/qiyueyz/1781472.html