Annesi, P., Croce, D., & Basili, R. (2014). Semantic compositionality in tree kernels. In CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management (pp. 1029-1038). Association for Computing Machinery. https://doi.org/10.1145/2661829.2661955
Semantic compositionality in tree kernels
Croce, Danilo; Basili, Roberto
2014-11-01
Abstract
Kernel-based learning has been widely applied to semantic textual inference tasks. In particular, Tree Kernels (TKs) are crucial for modeling syntactic similarity between linguistic instances in Question Answering or Information Extraction tasks. At the same time, lexical semantic information has been studied through the so-called Distributional Semantics (DS) paradigm, where lexical vectors are acquired automatically from large corpora. Methods to account for compositional linguistic structures (e.g., grammatically typed bi-grams or complex verb or noun phrases) have recently been proposed by defining algebras over lexical vectors, resulting in an extended paradigm called Distributional Compositional Semantics (DCS). Although lexical extensions have already been proposed to generalize TKs towards semantic phenomena (e.g., predicate argument structures, as in role labeling), currently studied TKs do not, in general, account for compositionality. In this paper, a novel kernel, the Compositionally Smoothed Partial Tree Kernel, is proposed to integrate DCS operators into the tree kernel evaluation, acting over both lexical leaves and non-terminal (i.e., complex compositional) nodes. Empirical results on Question Classification and Paraphrase Identification tasks show that state-of-the-art performance can be achieved without resorting to manual feature engineering, suggesting that a large set of Web and text mining tasks can be handled successfully by the proposed kernel.
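To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of the smoothed node-matching step that a compositionally smoothed tree kernel could use: lexical leaves are compared via cosine similarity of their distributional vectors, while non-terminal nodes are compared via a composed vector for their (head, modifier) pair. The toy vectors and the additive composition operator are assumptions chosen for illustration; the paper's actual DCS operators may differ.

```python
# Sketch of a smoothed node-similarity function for a tree kernel.
# Leaves: cosine of lexical vectors; non-terminals: cosine of composed vectors.
import numpy as np

# Toy distributional vectors; in practice these are acquired from large corpora.
LEXICON = {
    "buy":      np.array([0.9, 0.1, 0.3]),
    "purchase": np.array([0.8, 0.2, 0.3]),
    "car":      np.array([0.1, 0.9, 0.2]),
    "auto":     np.array([0.2, 0.8, 0.3]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def compose(head, modifier):
    """One possible DCS operator: additive composition of head and modifier vectors."""
    return LEXICON[head] + LEXICON[modifier]

def node_similarity(node1, node2):
    """Smoothed matching score between two tree nodes.

    Leaves are compared through their lexical vectors; non-terminals through
    the composition of their (head, modifier) pair. Nodes with different
    syntactic labels do not match.
    """
    if node1["label"] != node2["label"]:
        return 0.0
    if "word" in node1:  # lexical leaves
        return cosine(LEXICON[node1["word"]], LEXICON[node2["word"]])
    return cosine(compose(*node1["pair"]), compose(*node2["pair"]))

# Example: two VP nodes with no lexical overlap still receive a high score.
vp1 = {"label": "VP", "pair": ("buy", "car")}
vp2 = {"label": "VP", "pair": ("purchase", "auto")}
print(node_similarity(vp1, vp2))
```

In a full kernel, this score would replace the exact-match test on node labels inside the recursive evaluation of a Partial Tree Kernel, so that structurally similar but lexically different fragments still contribute to the overall similarity.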
File | Access | License | Size | Format
---|---|---|---|---
cikm2014_v3.4_final.pdf | Authorized users only | Publisher copyright | 266.37 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.