Robust language learning via efficient budgeted online algorithms


In many Natural Language Processing tasks, kernel learning makes it possible to build robust and effective systems. At the same time, Online Learning Algorithms are appealing for their incremental and continuous learning capability: they can track a target problem by constantly adapting to a dynamic environment. The drawback of using kernels in online settings is that complexity, in terms of both time and memory usage, grows continuously during the learning and classification phases. In this paper, we extend a state-of-the-art Budgeted Online Learning Algorithm that efficiently constrains the overall complexity. We introduce the principles of Fairness and Weight Adjustment: the former mitigates the effect of unbalanced datasets, while the latter improves the stability of the resulting models. Using robust semantic kernel functions for Sentiment Analysis on Twitter improves the results with respect to the standard budgeted formulation. Performance is comparable with that of one of the most efficient Support Vector Machine implementations, while preserving all the advantages of online methods. The results are noteworthy considering that the task was tackled without manually coded resources (e.g. WordNet or a polarity lexicon), relying mainly on distributional analysis of unlabeled corpora. © 2013 IEEE.
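
The budget idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the algorithm it extends, so the code below is not the authors' method: it assumes a plain kernel perceptron with an RBF kernel and a simple "drop the oldest support vector" eviction rule, and it does not implement the Fairness or Weight Adjustment principles. All names (`BudgetedKernelPerceptron`, `rbf_kernel`, `budget`, `gamma`) are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length numeric vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


class BudgetedKernelPerceptron:
    """Kernel perceptron that stores at most `budget` support vectors.

    Assumption for illustration: when the budget is full, the oldest
    support vector is discarded. Other eviction policies are possible.
    """

    def __init__(self, budget, kernel=rbf_kernel):
        self.kernel = kernel
        # A deque with maxlen drops its oldest item when a new one is appended.
        self.support = deque(maxlen=budget)

    def score(self, x):
        """Signed decision value: weighted sum of kernel evaluations."""
        return sum(alpha * self.kernel(sv, x) for sv, alpha in self.support)

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def partial_fit(self, x, y):
        """Process one example with label y in {-1, +1}; update on mistakes only."""
        if y * self.score(x) <= 0:
            self.support.append((x, y))


# Toy stream of two well-separated classes, processed one example at a time.
model = BudgetedKernelPerceptron(budget=4)
stream = [((0.0, 0.0), -1), ((1.0, 1.0), 1), ((0.1, 0.2), -1), ((0.9, 1.1), 1)] * 5
for x, y in stream:
    model.partial_fit(x, y)
```

Because the number of stored support vectors never exceeds `budget`, both the memory footprint and the cost of each prediction stay bounded however long the stream runs, which is the property that an unbudgeted online kernel method lacks.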

Filice, S., Castellucci, G., Croce, D., Basili, R. (2013). Robust language learning via efficient budgeted online algorithms. In Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013 (pp.913-920). IEEE Computer Society [10.1109/ICDMW.2013.87].

Filice, S; Castellucci, G; Croce, Danilo; Basili, Roberto

2013-12-01



	Conference name
	
			2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013
		
	Conference location
	
			Dallas, TX, USA
		
	Conference year
	
			2013
		
	Conference relevance
	
			International
		
	Publication date
	
			Dec 2013
		
	DOI of the contribution
	
			https://dx.doi.org/10.1109/ICDMW.2013.87
		
	Disciplinary sector of the contribution
	
			Sector ING-INF/05 - Information Processing Systems
			Sector INF/01 - Computer Science
		
	Content language
	
			English
		
	Keywords
	
			Online learning; Sentiment analysis; Software
		
	Type
	
			Conference contribution
		
	All authors
	
			Filice, S; Castellucci, G; Croce, D; Basili, R
		
	Appears in types:
	
			02 - Conference contribution

Files in this record:

File	Size	Format
SENTIRE_2013_dm706.pdf (authorized users only; license: publisher's copyright)	382.73 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/124267
