One of the most important research area in Natural Language Processing concerns the modeling of semantics expressed in text. Since foundational work in Natural Language Understanding has shown that a deep semantic approach is still not feasible, current research is focused on shallow methods combining linguistic models and machine learning techniques. The latter aim at learning semantic models, like those that can detect the entailment between the meaning of two text fragments, by means of training examples described by specific features. These are rather difficult to design since there is no linguistic model that can effectively encode the lexico-syntactic level of a sentence and its corresponding semantic models. Thus, the adopted solution consists in exhaustively describing training examples by means of all possible combinations of sentence words and syntactic information. The latter, typically expressed as parse trees of text fragments, is often encoded in the learning process using graph algorithms. In this paper, we propose a class of graphs, the tripartite directed acyclic graphs (tDAGs), which can be efficiently used to design algorithms for graph kernels for semantic natural language tasks involving sentence pairs. These model the matching between two pairs of syntactic trees in terms of all possible graph fragments. Interestingly, since tDAGs encode the association between identical or similar words (i.e. variables), it can be used to represent and learn first-order rules, i.e. rules describable by first-order logic. We prove that our matching function is a valid kernel and we empirically show that, although its evaluation is still exponential in the worst case, it is extremely efficient and more accurate than the previously proposed kernels.

Zanzotto, F.m., Dell'Arciprete, L., Moschitti, A. (2011). Efficient graph Kernels for textual entailment recognition. FUNDAMENTA INFORMATICAE, 107 - ( Computer Science:Information Systems Q2 http://www.scimagojr.com/journalsearch.php?q=28474&tip=sid&clean=0 )(2-3), 199-222 [10.3233/FI-2011-400].

Efficient graph Kernels for textual entailment recognition

ZANZOTTO, FABIO MASSIMO;
2011-01-01

Abstract

One of the most important research area in Natural Language Processing concerns the modeling of semantics expressed in text. Since foundational work in Natural Language Understanding has shown that a deep semantic approach is still not feasible, current research is focused on shallow methods combining linguistic models and machine learning techniques. The latter aim at learning semantic models, like those that can detect the entailment between the meaning of two text fragments, by means of training examples described by specific features. These are rather difficult to design since there is no linguistic model that can effectively encode the lexico-syntactic level of a sentence and its corresponding semantic models. Thus, the adopted solution consists in exhaustively describing training examples by means of all possible combinations of sentence words and syntactic information. The latter, typically expressed as parse trees of text fragments, is often encoded in the learning process using graph algorithms. In this paper, we propose a class of graphs, the tripartite directed acyclic graphs (tDAGs), which can be efficiently used to design algorithms for graph kernels for semantic natural language tasks involving sentence pairs. These model the matching between two pairs of syntactic trees in terms of all possible graph fragments. Interestingly, since tDAGs encode the association between identical or similar words (i.e. variables), it can be used to represent and learn first-order rules, i.e. rules describable by first-order logic. We prove that our matching function is a valid kernel and we empirically show that, although its evaluation is still exponential in the worst case, it is extremely efficient and more accurate than the previously proposed kernels.
2011
Pubblicato
Rilevanza internazionale
Articolo
Sì, ma tipo non specificato
Settore ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI
English
Con Impact Factor ISI
Zanzotto, F.m., Dell'Arciprete, L., Moschitti, A. (2011). Efficient graph Kernels for textual entailment recognition. FUNDAMENTA INFORMATICAE, 107 - ( Computer Science:Information Systems Q2 http://www.scimagojr.com/journalsearch.php?q=28474&tip=sid&clean=0 )(2-3), 199-222 [10.3233/FI-2011-400].
Zanzotto, Fm; Dell'Arciprete, L; Moschitti, A
Articolo su rivista
File in questo prodotto:
File Dimensione Formato  
Zanzotto-Moschitti_versioneFinale_dellaRivista.pdf

solo utenti autorizzati

Descrizione: Articolo principale
Licenza: Copyright dell'editore
Dimensione 244.93 kB
Formato Adobe PDF
244.93 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2108/23296
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 8
  • ???jsp.display-item.citation.isi??? 1
social impact