Music transcription consists of transforming the musical content of audio data into a symbolic representation. This study investigates a transcription system for polyphonic piano music. The input to the system is a set of piano recordings stored in WAV files; the output is the pitch of every note in the corresponding score. The proposed method focuses on temporal musical structures, namely note events and their main characteristics: the attack instant and the pitch. The aim of this work is to compare the classification accuracy achieved with three kinds of feature vector: memoryless, one-event memory, and short-term memory. Signal processing techniques based on the Constant-Q Transform (CQT) are used to create a time-frequency representation of the input signals. Musical note classification is based on Support Vector Machines (SVMs). Since large-scale tests were required, the whole process (synthesis of audio data from MIDI files, comparison of the results with the original score) has been automated. Training, validation and test sets have been generated from a large number of musical pieces in heterogeneous styles.
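To make the time-frequency step more concrete, the following is a minimal sketch of a naive constant-Q analysis of a single frame, using the standard definition in which bin center frequencies are geometrically spaced and each bin's window spans roughly Q cycles. It is not the authors' implementation: the function names, the sample rate, the lowest frequency, the number of bins, and the Hamming window are all assumptions chosen for illustration, since the abstract does not state them.

```python
import cmath
import math


def cqt_parameters(f_min, n_bins, bins_per_octave=12):
    """Return (center_frequencies, Q) for a geometrically spaced filter bank."""
    # Q is the ratio of center frequency to bandwidth, identical for every bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    freqs = [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]
    return freqs, q


def cqt_frame(samples, sample_rate, f_min, n_bins, bins_per_octave=12):
    """Naive constant-Q magnitudes for one frame starting at samples[0]."""
    freqs, q = cqt_parameters(f_min, n_bins, bins_per_octave)
    magnitudes = []
    for f_k in freqs:
        # The window shortens as frequency rises, so each bin sees about Q cycles.
        n_k = math.ceil(q * sample_rate / f_k)
        if n_k > len(samples):
            raise ValueError("frame too short for the lowest bin")
        acc = 0j
        for n in range(n_k):
            # Hamming window (an assumed choice, not taken from the paper).
            window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_k - 1))
            acc += window * samples[n] * cmath.exp(-2j * math.pi * q * n / n_k)
        magnitudes.append(abs(acc) / n_k)
    return magnitudes


def demo_peak_bin():
    """Analyse a synthetic 440 Hz sine and return the index of the strongest bin."""
    sample_rate = 8000  # assumed value, kept low so pure Python stays fast
    f_min = 220.0       # assumed lowest bin (A3); 24 bins cover two octaves
    tone = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(1024)]
    mags = cqt_frame(tone, sample_rate, f_min, n_bins=24)
    return max(range(len(mags)), key=mags.__getitem__)
```

With twelve bins per octave, each bin corresponds to one semitone, which is why this kind of representation suits pitch classification of piano notes. In a system like the one described, magnitude vectors of this sort could serve as input features for a classifier; how the memoryless, one-event memory, and short-term memory feature vectors are assembled from them is not specified in the abstract and is not sketched here.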
Costantini, G., Todisco, M., Perfetti, R., Basili, R. (2009). Short-term memory and event memory classification systems for automatic polyphonic music transcription. In CSECS'09: Proceedings of the 8th WSEAS International Conference on Circuits, Systems, Electronics, Control & Signal Processing.
Short-term memory and event memory classification systems for automatic polyphonic music transcription
COSTANTINI, GIOVANNI;BASILI, ROBERTO
2009-01-01