Huseyin Alecakir
2024
Groningen Team A at SemEval-2024 Task 8: Human/Machine Authorship Attribution Using a Combination of Probabilistic and Linguistic Features
Huseyin Alecakir
|
Puja Chakraborty
|
Pontus Henningsson
|
Matthijs Van Hofslot
|
Alon Scheuer
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Our approach primarily centers on feature-based systems, where a diverse array of features pertinent to the text’s linguistic attributes is extracted. Alongside those, we incorporate token-level probabilistic features which are fed into a Bidirectional Long Short-Term Memory (BiLSTM) model. Both resulting feature arrays are concatenated and fed into our final prediction model. Our method under-performed compared to the baseline, despite the fact that previous attempts by others have successfully used linguistic features for the purpose of discerning machine-generated text. We conclude that our examined subset of linguistically motivated features alongside probabilistic features was not able to contribute almost any performance at all to a hybrid classifier of human and machine texts.
2022
TurkishDelightNLP: A Neural Turkish NLP Toolkit
Huseyin Alecakir
|
Necva Bölücü
|
Burcu Can
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations
We introduce a neural Turkish NLP toolkit called TurkishDelightNLP that performs computational linguistic analyses from morphological level to semantic level that involves tasks such as stemming, morphological segmentation, morphological tagging, part-of-speech tagging, dependency parsing, and semantic parsing, as well as high-level NLP tasks such as named entity recognition. We publicly share the open-source Turkish NLP toolkit through a web interface that allows an input text to be analysed in real-time, as well as the open source implementation of the components provided in the toolkit, an API, and several annotated datasets such as word similarity test set to evaluate word embeddings and UCCA-based semantic annotation in Turkish. This will be the first open-source Turkish NLP toolkit that involves a range of NLP tasks in all levels. We believe that it will be useful for other researchers in Turkish NLP and will be also beneficial for other high-level NLP tasks in Turkish.
Search
Co-authors
- Puja Chakraborty 1
- Pontus Henningsson 1
- Matthijs Van Hofslot 1
- Alon Scheuer 1
- Necva Bölücü 1
- show all...