Semantic compositionality in tree kernels

CROCE, DANILO;BASILI, ROBERTO
2014-11-01

Abstract

Kernel-based learning has been widely applied to semantic textual inference tasks. In particular, Tree Kernels (TKs) are crucial for modeling syntactic similarity between linguistic instances in Question Answering and Information Extraction tasks. At the same time, lexical semantic information has been studied through the Distributional Semantics (DS) paradigm, in which lexical vectors are acquired automatically from large corpora. Methods that account for compositional linguistic structures (e.g. grammatically typed bi-grams or complex verb or noun phrases) have recently been proposed by defining algebras on lexical vectors. The result is an extended paradigm called Distributional Compositional Semantics (DCS). Although lexical extensions have already been proposed to generalize TKs towards semantic phenomena (e.g. predicate argument structures, as used in role labeling), the TKs studied so far do not, in general, account for compositionality. In this paper, a novel kernel called the Compositionally Smoothed Partial Tree Kernel is proposed to integrate DCS operators into the tree kernel evaluation, acting both on lexical leaves and on non-terminal (i.e. complex compositional) nodes. Empirical results on Question Classification and Paraphrase Identification tasks show that state-of-the-art performance can be achieved without resorting to manual feature engineering, suggesting that a large set of Web and text mining tasks can be handled successfully by the proposed kernel.
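As a rough illustration of what "defining algebras on lexical vectors" can mean, the sketch below composes head-modifier pairs by simple vector addition and compares the composed vectors with cosine similarity. This is only one common DCS-style operator and is not claimed to be the operator used in the paper; the vectors are made-up toy values, and the names `VECTORS`, `compose` and `compositional_similarity` are hypothetical.

```python
import math

# Toy 3-dimensional lexical vectors (invented values, for illustration only;
# real distributional vectors would be acquired from a large corpus).
VECTORS = {
    "buy": [0.9, 0.1, 0.2],
    "purchase": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
    "automobile": [0.2, 0.8, 0.2],
    "banana": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compose(head, modifier):
    """Additive composition (an assumed operator): element-wise sum."""
    return [h + m for h, m in zip(VECTORS[head], VECTORS[modifier])]


def compositional_similarity(pair_a, pair_b):
    """Similarity between two (head, modifier) pairs after composition."""
    return cosine(compose(*pair_a), compose(*pair_b))


# Two near-paraphrases versus a pair that shares only the verb.
sim_close = compositional_similarity(("buy", "car"), ("purchase", "automobile"))
sim_far = compositional_similarity(("buy", "car"), ("buy", "banana"))
print(sim_close > sim_far)
```

In a compositionally smoothed tree kernel, a score of this kind could play the role of the similarity between two non-terminal nodes, whereas a purely lexical smoothing would only compare the leaf words.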
23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
chn
2014
ACM SIGWEB
International relevance
Nov 2014
Sector ING-INF/05 - Information Processing Systems
Sector INF/01 - Computer Science
English
Information Systems and Management; Computer Science Applications; Computer Vision and Pattern Recognition; Information Systems
Conference contribution
Annesi, P., Croce, D., Basili, R. (2014). Semantic compositionality in tree kernels. In CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management (pp.1029-1038). Association for Computing Machinery, Inc [10.1145/2661829.2661955].
Annesi, P; Croce, D; Basili, R

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/124223
Citations
  • Scopus 22