We present the results of the BioCreative II.5 evaluation, held in association with the FEBS Letters experiment in which authors created Structured Digital Abstracts to capture information about protein-protein interactions. The BioCreative II.5 challenge evaluated automatic annotations from 15 text mining teams against a gold standard created by reconciling annotations from curators, authors, and automated systems. There were three tasks: to rank articles for curation based on curatable protein-protein interactions; to identify the interacting proteins (using UniProt identifiers) in the 61 positive articles; and to identify interacting protein pairs. The evaluation test set contained 595 full-text articles, both with and without curatable protein interactions. The principal evaluation metrics were the interpolated area under the precision/recall curve (AUC iP/R) and the (balanced) F-measure. For article classification, the best AUC iP/R was 0.70; for interacting proteins, the best system achieved a macroaveraged recall of 0.73 and an AUC iP/R of 0.58 after filtering incorrect species and mapping homonymous orthologs; for interacting protein pairs, the top (filtered, mapped) recall was 0.42 and the AUC iP/R was 0.29. Ensemble systems improved performance for the interacting protein task.
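
The two metrics named in the abstract can be illustrated with a minimal sketch. This record does not give the exact definitions used in the evaluation, so the code below assumes one common formulation: the balanced F-measure is the harmonic mean of precision and recall, and the interpolated precision at a recall level is the highest precision reached at that recall or any higher recall. The function names (`f_measure`, `pr_points`, `auc_ipr`) and the toy ranking are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pr_points(ranked_labels: Sequence[bool],
              total_positives: int) -> List[Tuple[float, float]]:
    """(recall, precision) at each rank where a correct item is found."""
    points = []
    hits = 0
    for rank, is_correct in enumerate(ranked_labels, start=1):
        if is_correct:
            hits += 1
            points.append((hits / total_positives, hits / rank))
    return points


def auc_ipr(ranked_labels: Sequence[bool], total_positives: int) -> float:
    """Area under the interpolated precision/recall curve.

    Assumed formulation: interpolated precision at recall r is the
    maximum precision observed at any recall >= r; the area is the sum
    of interpolated precision times each recall increment.
    """
    points = pr_points(ranked_labels, total_positives)

    # Walk from the highest recall down, carrying the best precision seen.
    interpolated = []
    best_precision = 0.0
    for recall, precision in reversed(points):
        best_precision = max(best_precision, precision)
        interpolated.append((recall, best_precision))
    interpolated.reverse()

    area = 0.0
    previous_recall = 0.0
    for recall, precision in interpolated:
        area += precision * (recall - previous_recall)
        previous_recall = recall
    return area


if __name__ == "__main__":
    # Toy ranking: True marks a correct result, best-ranked first.
    ranking = [True, False, True, False]
    print(auc_ipr(ranking, total_positives=2))
    print(f_measure(0.5, 0.5))
```

Under this formulation, a system that never retrieves some of the correct items cannot reach full recall, so its area stays below 1 even when its top-ranked results are all correct.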

Leitner, F., Mardis, S., Krallinger, M., Cesareni, G., Hirschman, L., Valencia, A. (2010). An Overview of BioCreative II.5. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 7(3), 385-399 [10.1109/TCBB.2010.61].

An Overview of BioCreative II.5

CESARENI, GIOVANNI;
2010-01-01

2010
Published
International relevance
Article
Yes, but type not specified
Sector BIO/18 - GENETICS
English
With ISI Impact Factor
Data Mining; Computational Biology; Database Management Systems; Protein Interaction Mapping; Natural Language Processing; Databases, Factual; Information Management; Data Collection; Abstracting and Indexing as Topic
Leitner, F; Mardis, S; Krallinger, M; Cesareni, G; Hirschman, L; Valencia, A
Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/19017
Citations
  • PMC: N/A
  • Scopus: 88
  • ISI: 74