Robustness has been traditionally stressed as a general desirable property of any computational model and system. The human NL interpretation device exhibits this property as the ability to deal with odd sentences. However, the difficulties in a theoretical explanation of robustness within the linguistic modelling suggested the adoption of an empirical notion. In this paper, we propose an empirical definition of robustness based on the notion of performance. Furthermore, a framework for controlling the parser robustness in the design phase is presented. The control is achieved via the adoption of two principles: the modularisation, typical of the software engineering practice, and the availability of domain adaptable components. The methodology has been adopted for the production of CHAOS, a pool of syntactic modules, which has been used in real applications. This pool of modules enables a large validation of the notion of empirical robustness, on the one side, and of the design methodology, on the other side, over different corpora and two different languages (English and Italian).
Basili, R., Zanzotto, F.m. (2002). Parsing engineering and empirical robustness. NATURAL LANGUAGE ENGINEERING, 8 - http://www.scimagojr.com/journalsearch.php?q=28380&tip=sid&clean=0, 97-120 [10.1017/S1351324902002875].
Parsing engineering and empirical robustness
BASILI, ROBERTO;ZANZOTTO, FABIO MASSIMO
2002-07-01
Abstract
Robustness has been traditionally stressed as a general desirable property of any computational model and system. The human NL interpretation device exhibits this property as the ability to deal with odd sentences. However, the difficulties in a theoretical explanation of robustness within the linguistic modelling suggested the adoption of an empirical notion. In this paper, we propose an empirical definition of robustness based on the notion of performance. Furthermore, a framework for controlling the parser robustness in the design phase is presented. The control is achieved via the adoption of two principles: the modularisation, typical of the software engineering practice, and the availability of domain adaptable components. The methodology has been adopted for the production of CHAOS, a pool of syntactic modules, which has been used in real applications. This pool of modules enables a large validation of the notion of empirical robustness, on the one side, and of the design methodology, on the other side, over different corpora and two different languages (English and Italian).File | Dimensione | Formato | |
---|---|---|---|
2002_NLE_BasiliZanzotto.pdf
accesso aperto
Licenza:
Copyright dell'editore
Dimensione
335.16 kB
Formato
Adobe PDF
|
335.16 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.