In this paper, we present two neural-network-based methods for the automatic transcription of polyphonic piano music. The input to these methods consists of live piano music acquired by a microphone, while the output is the pitch of all the notes in the corresponding score. The aim of this work is to compare the accuracy achieved by a feed-forward neural network, the multilayer perceptron (MLP), with that achieved by a recurrent neural network, the Elman neural network (ENN). Signal processing techniques based on the constant-Q transform (CQT) are used to create a time-frequency representation of the input signals, and the processing phases involve non-negative matrix factorization (NMF) for onset detection. Since large-scale tests were required, the whole process (synthesis of audio data from MIDI files and comparison of the results with the original score) has been automated. Training, validation, and test sets have been generated with reference to three different musical styles, represented respectively by J. S. Bach’s inventions, F. Chopin’s nocturnes, and C. Debussy’s preludes.
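The NMF stage mentioned above factorizes a non-negative time-frequency matrix (e.g. a CQT magnitude spectrogram) into spectral templates and their time-varying activations, and note onsets then appear as sharp rises in the activation rows. A minimal numpy sketch using Lee–Seung multiplicative updates for the Euclidean cost (the paper's exact NMF formulation and rank are not specified here, so `nmf` and its parameters are illustrative):

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (freq x time) as V ~ W @ H,
    where W holds spectral templates and H their activations over time,
    using Lee-Seung multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(seed)
    n_f, n_t = V.shape
    W = rng.random((n_f, rank)) + eps
    H = rng.random((rank, n_t)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

For onset detection, one would then look for frames where a row of H jumps from near zero to a large value, indicating that the corresponding spectral template has just become active.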
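The key difference between the two compared networks is that the Elman network feeds its previous hidden state back as a "context" input, so each frame-wise pitch estimate can depend on temporal context, whereas the MLP sees each frame in isolation. A minimal numpy sketch of an Elman forward pass only (layer sizes, tanh/sigmoid activations, and weight scales are assumptions; training, e.g. by backpropagation through time, is omitted):

```python
import numpy as np

class ElmanNet:
    """Minimal Elman (simple recurrent) network: the hidden state from
    the previous time step is fed back as a context input at the next."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.W_xh = rng.normal(0, scale, (n_hidden, n_in))   # input -> hidden
        self.W_hh = rng.normal(0, scale, (n_hidden, n_hidden))  # context -> hidden
        self.W_hy = rng.normal(0, scale, (n_out, n_hidden))  # hidden -> output
        self.b_h = np.zeros(n_hidden)
        self.b_y = np.zeros(n_out)

    def forward(self, X):
        """X: (T, n_in) sequence of feature frames (e.g. CQT slices).
        Returns (T, n_out) per-frame outputs, e.g. per-note probabilities."""
        h = np.zeros(self.b_h.shape[0])  # initial context is all zeros
        outputs = []
        for x in X:
            h = np.tanh(self.W_xh @ x + self.W_hh @ h + self.b_h)
            y = 1.0 / (1.0 + np.exp(-(self.W_hy @ h + self.b_y)))
            outputs.append(y)
        return np.array(outputs)
```

Because the recurrence `W_hh @ h` changes the hidden state from frame to frame, the output evolves over time even for a constant input, which is exactly the temporal memory an MLP lacks.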
Costantini, G., Todisco, M., Carota, M. (2010). Improving piano music transcription by Elman dynamic neural networks. In Sensors and Microsystems (pp. 387-390) [10.1007/978-90-481-3606-3_78].
Improving piano music transcription by Elman dynamic neural networks
COSTANTINI, GIOVANNI;
2010-01-01