Sinead Madden
2024
Proceedings of the 1st Workshop on Simulating Conversational Intelligence in Chat (SCI-CHAT 2024)
Yvette Graham | Qun Liu | Gerasimos Lampouras | Ignacio Iacobacci | Sinead Madden | Haider Khalid | Rameez Qureshi
Proceedings of the 1st Workshop on Simulating Conversational Intelligence in Chat (SCI-CHAT 2024)
2022
Building Machine Translation System for Software Product Descriptions Using Domain-specific Sub-corpora Extraction
Pintu Lohar | Sinead Madden | Edmond O’Connor | Maja Popović | Tanya Habruseva
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Building machine translation systems for a specific domain requires a sufficiently large, good-quality parallel corpus in that domain. However, this is a challenging task because parallel data is lacking in many domains, such as economics, science and technology, and sports. In this work, we build English-to-French translation systems for software product descriptions scraped from the LinkedIn website. Moreover, we develop a first-ever parallel test data set of product descriptions. We conduct experiments by building a baseline translation system trained on general-domain data and then domain-adapted systems using sentence-embedding-based corpus filtering and domain-specific sub-corpora extraction. All the systems are tested on our newly developed data set. Our experimental evaluation reveals that the domain-adapted model based on our proposed approaches outperforms the baseline.
Co-authors
- Pintu Lohar 1
- Edmond O’Connor 1
- Maja Popović 1
- Tanya Habruseva 1
- Yvette Graham 1