Linear Online Learning over Structured Data with Distributed Tree Kernels

Croce, Danilo; Basili, Roberto; Zanzotto, Fabio Massimo
2013-01-01

Abstract

Online algorithms are an important class of learning machines because they are simple and computationally efficient. Their kernelized versions can handle structured data, such as trees, and achieve state-of-the-art performance. However, kernelized online learning algorithms slow down as the number of support vectors grows. The traditional way to cope with this problem is to introduce a budget that caps the number of support vectors. In this paper, we investigate Distributed Trees (DTs) as an efficient way to use structured data in online learning. DTs embed the huge feature space of tree fragments into small vectors, thus enabling the use of linear versions of kernel machines over tree-structured data. We experiment with the Passive-Aggressive (PA) algorithm, comparing its linear and kernelized versions. We employ a massive dataset of tree-structured data that originates from a natural language processing task: Boundary Detection in the context of Semantic Role Labeling over FrameNet. Results on a sample of the final data show that DTs with the linear PA algorithm and the Tree Kernel with the Budgeted PA achieve comparable results in terms of F1-measure. Finally, using the full dataset allows the former to improve on the latter in the classification task.
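As an illustration of the linear setting the abstract describes, the following is a minimal sketch of an online Passive-Aggressive learner over dense vectors. It is not the authors' implementation. The abstract does not say which PA variant is used, so the PA-I update rule is an assumption here, as are the function names `pa1_update` and `train_online`, the aggressiveness parameter `C`, and the random toy vectors that stand in for Distributed Tree embeddings.

```python
import numpy as np


def pa1_update(w, x, y, C=1.0):
    """One PA-I step for a binary label y in {-1, +1} (assumed variant)."""
    # Hinge loss of the current weights on this example.
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))
    sq_norm = float(np.dot(x, x))
    if loss == 0.0 or sq_norm == 0.0:
        return w  # passive step: margin already satisfied
    # Aggressive step: smallest change that fixes the margin, capped by C.
    tau = min(C, loss / sq_norm)
    return w + tau * y * x


def train_online(X, Y, C=1.0):
    """Single pass over the stream; memory stays O(dimension)."""
    w = np.zeros(X.shape[1])
    for x, y in zip(X, Y):
        w = pa1_update(w, x, y, C)
    return w


# Toy data: random vectors as placeholders for tree embeddings (hypothetical).
rng = np.random.default_rng(0)
true_w = rng.normal(size=16)
X = rng.normal(size=(500, 16))
Y = np.where(X @ true_w >= 0, 1, -1)

w = train_online(X, Y)
accuracy = float(np.mean(np.where(X @ w >= 0, 1, -1) == Y))
```

Because the model is a single weight vector, the cost of each update does not depend on how many examples have been seen, which is the property that makes a budget on support vectors unnecessary in the linear case.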
International Conference on Machine Learning Applications (ICMLA)
International relevance
Contribution
2013
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
English
Conference presentation
Filice, S., Croce, D., Basili, R., Zanzotto, F.M. (2013). Linear Online Learning over Structured Data with Distributed Tree Kernels. In Proceedings of International Conference on Machine Learning Applications (ICMLA) (pp.--) [10.1109/ICMLA.2013.28].
Filice, S; Croce, D; Basili, R; Zanzotto, FM

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/98436
Citations
  • PMC: ND
  • Scopus: 3
  • ISI: 2