
Making sense of kernel spaces in neural learning

Croce, D.; Filice, S.; Basili, R.
2019-04-01

Abstract

Kernel-based and Deep Learning methods are two of the most popular approaches in Computational Natural Language Learning. Although these models are rather different and characterized by distinct strengths and weaknesses, both have had an impressive impact on the accuracy of complex Natural Language Processing tasks. An advantage of kernel-based methods is their ability to exploit structured information induced from examples. For instance, Sequence or Tree kernels operate over structures reflecting linguistic evidence, such as syntactic information encoded in syntactic parse trees. Deep Learning approaches are very effective as they can learn non-linear decision functions; however, general models require input instances to be explicitly modeled via vectors or tensors, and operating on structured data is possible only through ad-hoc architectures. In this work, we discuss a novel architecture that efficiently combines kernel methods and neural networks, in an attempt to get the best of both paradigms. The so-called Kernel-based Deep Architecture (KDA) adopts a Nyström-based projection function to approximate any valid kernel function and convert the structures it operates on (for instance, linguistic structures such as trees) into dense linear embeddings. These can be used as input to a Deep Feed-forward Neural Network that exploits such embeddings to learn non-linear classification functions. KDA is a mathematically justified integration of expressive kernel functions and deep neural architectures, with several advantages: it (i) directly operates over complex non-tensor structures, e.g., trees, without ad-hoc manual feature engineering or architectural design, (ii) achieves a drastic reduction of the computational cost with respect to pure kernel methods, and (iii) exploits the non-linearity of deep architectures to produce accurate models. We evaluated the KDA on three rather different semantic inference tasks: Semantic Parsing, Question Classification, and Community Question Answering. Results show that the KDA achieves state-of-the-art accuracy, with a computational cost that is much lower than that required to train and test a pure kernel-based method, such as an SVM.
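As a brief illustration of the projection the abstract describes, below is a minimal sketch of the Nyström embedding step in Python. This is an assumption based on the standard Nyström method; the paper's exact landmark-selection strategy and network configuration may differ, and the function name nystrom_embed is hypothetical. Given kernel evaluations against a small set of m landmark examples, the projection yields dense vectors whose dot products approximate the original kernel:

import numpy as np

def nystrom_embed(K_mm, K_nm):
    # K_mm: (m, m) kernel matrix among the m landmark examples.
    # K_nm: (n, m) kernel values between the n inputs and the landmarks.
    # Returns (n, r) dense embeddings, r <= m, whose inner products
    # approximate the original kernel: c(x) . c(y) ~ K(x, y).
    s, U = np.linalg.eigh(K_mm)            # K_mm = U diag(s) U^T
    keep = s > 1e-10                       # drop numerically null directions
    M = U[:, keep] / np.sqrt(s[keep])      # M = U diag(s^{-1/2})
    return K_nm @ M                        # c(x) = k_x M

# Toy usage with an RBF kernel; a Tree or Sequence kernel would take the
# place of rbf below, since Nyström only needs kernel evaluations, not
# explicit feature vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))              # n = 100 inputs
L = X[:10]                                 # m = 10 landmarks
rbf = lambda A, B: np.exp(-0.5 * ((A[:, None] - B[None]) ** 2).sum(-1))
embeddings = nystrom_embed(rbf(L, L), rbf(X, L))   # shape (100, <=10)

The resulting embeddings are ordinary dense vectors, so a standard Deep Feed-forward Neural Network can be trained on them; this is what replaces the costly kernel machine at training and inference time.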
1 Apr 2019
Published
International relevance
Article
Anonymous expert reviewers
Settore INF/01 - Informatica (Computer Science)
Settore ING-INF/05 - Sistemi di Elaborazione delle Informazioni (Information Processing Systems)
English
Kernel-based learning; Neural methods; Nyström embeddings; Semantic spaces
Croce, D., Filice, S., Basili, R. (2019). Making sense of kernel spaces in neural learning. COMPUTER SPEECH AND LANGUAGE, 58, 51-75 [10.1016/j.csl.2019.03.006].
Croce, D; Filice, S; Basili, R
Journal article
Files in this record:

File: 1-s2.0-S0885230818301244-main_published.pdf
Access: authorized users only (a copy may be requested)
License: publisher's copyright
Size: 1.57 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/238128
Citations
  • PMC: ND
  • Scopus: 1
  • Web of Science (ISI): 1