Aggregation methods for efficient collocation detection

Anca Dinu, Liviu Dinu, Ionut Sorodoc


Abstract
In this article we propose a rank aggregation method for the task of collocation detection. It consists of applying several well-known methods (e.g. the Dice method, chi-square test, z-test and likelihood ratio) and then aggregating the resulting collocation rankings by rank distance and Borda score. These two aggregation methods are especially well suited to the task, since the results of each individual method naturally form a ranking of collocations. Combination methods are known to usually improve results, and indeed, the proposed aggregation method performs better than each individual method taken in isolation.
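As an illustration of the aggregation step described in the abstract, the following is a minimal sketch of Borda-score aggregation over several collocation rankings. It is not taken from the paper: the function name `borda_aggregate`, the scoring convention (an item at 0-based position i in a list of length n receives n - i points), the alphabetical tie-break, and the toy rankings are all assumptions made for the example. The rank distance aggregation is not shown.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge ranked lists (best candidate first) into one ranking by Borda score.

    Assumed convention: in a list of length n, the item at 0-based
    position i earns n - i points; items absent from a list earn 0 there.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - position
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical toy rankings, standing in for the output of three
# association measures (these are NOT results from the paper).
dice_ranking = ["strong tea", "heavy rain", "make decision", "of the"]
chi2_ranking = ["heavy rain", "strong tea", "of the", "make decision"]
llr_ranking = ["heavy rain", "make decision", "strong tea", "of the"]

aggregated = borda_aggregate([dice_ranking, chi2_ranking, llr_ranking])
print(aggregated)
```

In this toy input, "heavy rain" is ranked first by two of the three lists and second by the other, so it collects the highest total and heads the aggregated ranking.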
Anthology ID:
L14-1128
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
4041–4045
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/1184_Paper.pdf
Cite (ACL):
Anca Dinu, Liviu Dinu, and Ionut Sorodoc. 2014. Aggregation methods for efficient collocation detection. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 4041–4045, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Aggregation methods for efficient collocation detection (Dinu et al., LREC 2014)