Syeda Jannatus Saba

Also published as: Syeda Jannatus Saba


2025

pdf bib
Evaluating Language Translation Models by Playing Telephone
Syeda Jannatus Saba | Steven Skiena
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Our ability to efficiently and accurately evaluate the quality of machine translation systems has been outrun by the effectiveness of current language models—which limits the potential for further improving these models on more challenging tasks like long-form and literary translation. We propose an unsupervised method to generate training data for translation evaluation over different document lengths and application domains by repeated rounds of translation between source and target languages. We evaluate evaluation systems trained on texts mechanically generated using both model rotation and language translation approaches, demonstrating improved performance over a popular translation evaluation system (xCOMET) on two different tasks: (i) scoring the quality of a given translation against a human reference and (ii) selecting which of two translations is generationally closer to an original source document.

2021

pdf bib
Syntax and Themes: How Context Free Grammar Rules and Semantic Word Association Influence Book Success
Henry Gorelick | Biddut Sarker Bijoy | Syeda Jannatus Saba | Sudipta Kar | Md Saiful Islam | Mohammad Ruhul Amin
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

In this paper, we attempt to improve upon the state-of-the-art in predicting a novel’s success by modeling the lexical semantic relationships of its contents. We created the largest dataset used in such a project containing lexical data from 17,962 books from Project Gutenberg. We utilized domain specific feature reduction techniques to implement the most accurate models to date for predicting book success, with our best model achieving an average accuracy of 94.0%. By analyzing the model parameters, we extracted the successful semantic relationships from books of 12 different genres. We finally mapped those semantic relations to a set of themes, as defined in Roget’s Thesaurus and discovered the themes that successful books of a given genre prioritize. At the end of the paper, we further showed that our model demonstrate similar performance for book success prediction even when Goodreads rating was used instead of download count to measure success.

pdf bib
A Study on Using Semantic Word Associations to Predict the Success of a Novel
Syeda Jannatus Saba | Biddut Sarker Bijoy | Henry Gorelick | Sabir Ismail | Md Saiful Islam | Mohammad Ruhul Amin
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics

Many new books get published every year, and only a fraction of them become popular among the readers. So the prediction of a book success can be a very useful parameter for publishers to make a reliable decision. This article presents the study of semantic word associations using the word embedding of book content for a set of Roget’s thesaurus concepts for book success prediction. In this work, we discuss the method to represent a book as a spectrum of concepts based on the association score between its content embedding and a global embedding (i.e. fastText) for a set of semantically linked word clusters. We show that the semantic word associations outperform the previous methods for book success prediction. In addition, we present that semantic word associations also provide better results than using features like the frequency of word groups in Roget’s thesaurus, LIWC (a popular tool for linguistic inquiry and word count), NRC (word association emotion lexicon), and part of speech (PoS). Our study reports that concept associations based on Roget’s Thesaurus using word embedding of individual novel resulted in the state-of-the-art performance of 0.89 average weighted F1-score for book success prediction. Finally, we present a set of dominant themes that contribute towards the popularity of a book for a specific genre.