Although distributional models of word meaning have been widely used in Information Retrieval, where they provide an effective scheme for representing and generalizing words in isolation, composing words into phrases or sentences remains a challenging task. Different methods have been proposed to account for syntactic structure by combining words through algebraic operators (e.g. tensor product) applied to the vectors that represent the lexical constituents. In this paper, a novel approach to semantic composition is proposed, based on space projection techniques over the basic geometric lexical representations. In the geometric perspective pursued here, syntactic bi-grams are projected into the so-called Support Subspace, which aims to emphasize the semantic features shared by the compound words and to better capture phrase-specific aspects of the lexical meanings involved. State-of-the-art results are achieved on a well-known benchmark for the phrase similarity task, and the generalization capability of the proposed operators is investigated in a cross-linguistic scenario, i.e. for English and Italian.
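
The projection idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the Support Subspace, so everything below is an assumption made for illustration: the subspace is taken to be the k dimensions where the component-wise product of the two word vectors is largest, and two bi-grams are compared by multiplying the cosine similarities of their heads and of their modifiers inside that subspace. The function names (`support_dims`, `project`, `phrase_similarity`) and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def support_dims(u, v, k):
    """Assumed selection rule: the k dimensions with the largest product u[i] * v[i],
    i.e. the features on which both words of the bi-gram carry weight."""
    scores = [a * b for a, b in zip(u, v)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def project(vec, dims):
    """Keep only the selected dimensions of a vector."""
    return [vec[i] for i in dims]


def phrase_similarity(phrase1, phrase2, k):
    """Compare two (head, modifier) bi-grams inside the subspace chosen from the
    first one. Combining the two cosines by a product is an assumption."""
    (head1, mod1), (head2, mod2) = phrase1, phrase2
    dims = support_dims(head1, mod1, k)
    head_sim = cosine(project(head1, dims), project(head2, dims))
    mod_sim = cosine(project(mod1, dims), project(mod2, dims))
    return head_sim * mod_sim


# Toy 4-dimensional vectors, invented for illustration only.
head_a, mod_a = [1.0, 0.0, 2.0, 0.0], [1.0, 3.0, 2.0, 0.0]
head_b, mod_b = [0.5, 4.0, 1.0, 0.0], [1.0, 0.0, 1.5, 2.0]

dims = support_dims(head_a, mod_a, 2)
score = phrase_similarity((head_a, mod_a), (head_b, mod_b), 2)
```

Restricting the comparison to the selected dimensions means that features relevant to only one of the two words in a bi-gram do not contribute to the score, which is one way to read the abstract's claim that the projection emphasizes shared semantic features.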

Annesi, P., Storch, V., Croce, D., Basili, R. (2012). Algebraic compositional models for semantic similarity in ranking and clustering. In CEUR Workshop Proceedings (pp.155-166).

Algebraic compositional models for semantic similarity in ranking and clustering

CROCE, DANILO; BASILI, ROBERTO
2012-11-01

3rd Italian Information Retrieval Workshop, IIR 2012
Bari, Italy
2012
Ethica System S.r.l.
National relevance
Nov 2012
Sector ING-INF/05 - Information Processing Systems
Sector INF/01 - Computer Science
English
Computer Science (all)
Conference contribution
Annesi, P; Storch, V; Croce, D; Basili, R

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/124239