A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions

Abstract: We present the main findings and preliminary results of an ongoing project that aims to develop a collocation extraction system based on contextual morpho-syntactic properties. We explored two hybrid extraction methods: the first applies language-independent statistical techniques followed by linguistic filtering, while the second, available only for German, uses a set of lexico-syntactic patterns to extract collocation candidates. To define the extraction and filtering patterns, we studied a specific category of collocation, Verb-Noun (VN) constructions, using a model inspired by systemic functional grammar that proposes three levels of analysis: lexical, functional and semantic. From a tagged and lemmatized corpus, we identify contextual morpho-syntactic properties that help to filter the output of the statistical methods and to extract potentially interesting VN constructions (complex predicates vs. complex predicators). The extracted candidates are validated and classified manually.
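The first method in the abstract (statistical scoring followed by linguistic filtering) can be illustrated with a minimal sketch. The abstract does not name the association measure or the tagset, so everything here is an assumption made for illustration: pointwise mutual information (PMI) stands in for the statistical step, the tags `VERB` and `NOUN` stand in for the corpus tagset, and the function names `score_pairs` and `filter_verb_noun` are hypothetical. It is not the authors' implementation.

```python
import math
from collections import Counter

def score_pairs(sentences, window=3):
    """Score co-occurring (lemma, tag) pairs with PMI (an assumed stand-in statistic)."""
    unigrams = Counter()
    pairs = Counter()
    for sent in sentences:
        for i, tok in enumerate(sent):
            unigrams[tok] += 1
            # Pair each token with the tokens that follow it inside the window.
            for other in sent[i + 1 : i + 1 + window]:
                pairs[(tok, other)] += 1
    n_tokens = sum(unigrams.values())
    n_pairs = sum(pairs.values())
    scores = {}
    for (a, b), count in pairs.items():
        p_ab = count / n_pairs
        p_a = unigrams[a] / n_tokens
        p_b = unigrams[b] / n_tokens
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores

def filter_verb_noun(scores, min_score=0.0):
    """Linguistic filter: keep only pairs made of one verb and one noun."""
    kept = []
    for ((lemma_a, tag_a), (lemma_b, tag_b)), score in scores.items():
        if {tag_a, tag_b} == {"VERB", "NOUN"} and score >= min_score:
            verb, noun = (lemma_a, lemma_b) if tag_a == "VERB" else (lemma_b, lemma_a)
            kept.append((verb, noun, score))
    # Highest-scoring candidates first.
    return sorted(kept, key=lambda item: -item[2])

# Toy tagged and lemmatized corpus (invented data, for illustration only).
sentences = [
    [("they", "PRON"), ("take", "VERB"), ("a", "DET"), ("decision", "NOUN")],
    [("she", "PRON"), ("take", "VERB"), ("the", "DET"), ("decision", "NOUN")],
    [("he", "PRON"), ("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
]
candidates = filter_verb_noun(score_pairs(sentences))
```

In this sketch the statistical step is language-independent, since it only counts co-occurrences, while the filter is the only part that depends on the tagset. The resulting candidate list would still need the manual validation and classification that the abstract describes.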
Document type:
Conference paper
Calzolari, Nicoletta et al. The 6th edition of the Language Resources and Evaluation Conference (LREC 2008), May 2008, Marrakech, Morocco. European Language Resources Association (ELRA), Proceedings of the 6th Language Resources and Evaluation Conference - LREC 2008, 2008, 〈http://www.lrec-conf.org/lrec2008/〉

https://hal-univ-diderot.archives-ouvertes.fr/hal-01220400
Contributor: Christopher Gledhill
Submitted on: Monday, 26 October 2015 - 11:49:46
Last modified on: Wednesday, 14 March 2018 - 16:38:53

Identifiers

  • HAL Id : hal-01220400, version 1

Citation

Amalia Todirascu, Dan Tufis, Ulrich Heid, Christopher Gledhill, Dan Stefânescu, et al. A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions. Calzolari, Nicoletta et al. The 6th edition of the Language Resources and Evaluation Conference (LREC 2008), May 2008, Marrakech, Morocco. European Language Resources Association (ELRA), Proceedings of the 6th Language Resources and Evaluation Conference - LREC 2008, 2008, 〈http://www.lrec-conf.org/lrec2008/〉. 〈hal-01220400〉
