imprimer la page

PARK Tae Hong / 박태홍*

Towards automatic musical instrument timbre recognition

Edition: Ph.D. Dissertation, Princeton University, 244p.

Date: 2004

Region: N/A


Type of media: Grey

Language: English

Editor: M.B.


This dissertation is comprised of two parts--focus on issues concerning research and development of an artificial system for automatic musical instrument timbre recognition and musical compositions. The technical part of the essay includes a detailed record of developed and implemented algorithms for feature extraction and pattern recognition. A review of existing literature introducing historical aspects surrounding timbre research, problems associated with a number of timbre definitions, and highlights of selected research activities that have had significant impact in this field are also included. The developed timbre recognition system follows a bottom-up, data-driven model that includes a pre-processing module, feature extraction module, and a RBF/EBF (Radial/Elliptical Basis Function) neural network-based pattern recognition module. 829 monophonic samples from 12 instruments have been chosen from the Peter Siedlaczek library (Best Service) and other samples from the Internet and personal collections. Significant emphasis has been put on feature extraction development and testing to achieve robust and consistent feature vectors that are eventually passed to the neural network module. In order to avoid a garbage-in-garbage-out (GIGO) trap and improve generality, extra care was taken in designing and testing the developed algorithms using various dynamics, different playing techniques, and a variety of pitches for each instrument with inclusion of attack and steady-state portions of a signal. Most of the research and development was conducted in Matlab. The compositional part of the essay includes brief introductions to " A d'Ess Are ," "Aboji ," "48 13 N, 16 20 O ," and "pH-SQ ." A general outline pertaining to the ideas and concepts behind the architectural designs of the pieces including formal structures, time structures, orchestration methods, and pitch structures are also presented.