Patrick Saint-Dizier

Also published as: Patrick Saint Dizier


2017

This short paper presents a first implementation of a knowledge-driven argument mining approach. The major processing steps and language resources of the system are surveyed. An indicative evaluation outlines the challenges and directions for improvement.

2016

In most international industries, English is the main language of communication for technical documents. These documents are designed to be as unambiguous as possible for their users. For international industries based in non-English speaking countries, the professionals in charge of writing requirements are often non-native speakers of English, who rarely receive adequate training in the use of English for this task. As a result, requirements can contain a relatively large diversity of lexical and grammatical errors, which are not eliminated by the use of guidelines from controlled languages. This article investigates the distribution of errors in a corpus of requirements written in English by native speakers of French. Errors are defined on the basis of grammaticality and acceptability principles, and classified using comparable categories. Results show a high proportion of errors in the Noun Phrase, notably through modifier stacking, and errors consistent with simplification strategies. Comparisons with similar corpora in other genres reveal the specificity of the distribution of errors in requirements. This research also introduces possible applied uses, in the form of strategies for the automatic detection of errors, and in-person training provided by certification boards in requirements authoring.
Given a controversial issue, argument mining from natural language texts (newspapers and any other form of text on the Internet) is extremely challenging: domain knowledge is often required, together with appropriate forms of inference, to identify arguments. This contribution explores the types of knowledge that are required and how they can be paired with reasoning schemes, language processing and language resources to accurately mine arguments. We show via corpus analysis that the Generative Lexicon, enhanced in different manners and viewed as both a lexicon and a domain knowledge representation, is a relevant approach. In this paper, corpus annotation for argument mining is first developed; we then show how the Generative Lexicon approach must be adapted and how it can be paired with language processing patterns to extract arguments and specify their nature. Our approach to argument mining is thus knowledge-driven.
In this paper, we investigate some language acquisition facets of an auto-adaptative system that can automatically acquire most of the relevant lexical knowledge and authoring practices for an application in a given domain. This is the LELIO project: producing customized LELIE solutions. Our goal, within the framework of LELIE (a system that tags language uses that do not follow Constrained Natural Language principles), is to automate the long, costly and error-prone lexical customization of LELIE for a given application domain. Since technical texts are relatively restricted in terms of syntax and lexicon, the results obtained show that this approach is feasible and relatively reliable. By auto-adaptative, we mean that the system learns, from a sample of the application corpus, the various lexical terms and uses crucial for LELIE to work properly (e.g. verb uses, fuzzy terms, business terms, stylistic patterns). A technical-writer validation method is developed at each step of the acquisition.

2014

In this paper, we briefly present the objectives of Inference Anchoring Theory (IAT) and the formal structure which is proposed for dialogues. Then, we introduce our development corpus, and a computational model designed for the identification of discourse minimal units in the context of argumentation and the illocutionary force associated with each unit. We show the categories of resources which are needed and how they can be reused in different contexts.

2012

In this paper, we present the foundations and the properties of Dislog, a logic-based language designed to describe and implement discourse structure analysis. Dislog has the flexibility and expressiveness of a rule-based system; it offers the possibility to include knowledge and reasoning capabilities and to express a variety of well-formedness constraints proper to discourse. Dislog is embedded into the <TextCoop> platform, which offers an engine with various processing capabilities and a programming environment.
In this paper, we present an analysis method, a set of rules and lexical resources dedicated to discourse relation identification, in particular for explanation analysis. The following relations are described with prototypical rules: instructions, advice, warnings, illustration, restatement, purpose, condition, circumstance, concession, contrast and some forms of cause. Rules are developed for French and English. The approach used to describe the analysis of such relations is basically generative and also provides a conceptual view of explanation. The implementation is realized in the Dislog language on the <TextCoop> logic-based platform; Dislog also allows for the integration of knowledge and reasoning into the rules describing the structure of explanation.
In this paper, we present the first phase of the LELIE project. A tool that detects business errors in technical documents such as procedures or requirements is introduced. The objective is to improve readability and to check some elements of content, so that risks entailed by misunderstandings or typos can be prevented. Based on a cognitive ergonomics analysis, we survey a number of frequently encountered types of errors and show how they can be detected using the <TextCoop> discourse analysis platform. We show how errors can be annotated, give figures on error frequencies, and analyze how technical writers perceive our system.

2011

In this paper, we present the main characteristics of <TextCoop>, an environment based on logic grammars dedicated to the analysis of discourse structures. We study in particular the DisLog language, which fixes the structure of the rules and of the specifications that accompany them. We present the structure of the <TextCoop> engine, indicating throughout the text the state of the work, its performance, and its orientations, in particular regarding the environment, support for rule authoring, and application development.

2010

This paper describes an annotation scheme for argumentation in opinionated texts such as newspaper editorials, developed from a corpus of approximately 500 English texts from Nepali and international newspaper sources. We present the results of the analysis and evaluation of the corpus annotation; currently, the inter-annotator agreement kappa value is 0.80, which indicates substantial agreement between the annotators. We also discuss some of the linguistic resources (key factors for distinguishing facts from opinions, an opinion lexicon, an intensifier lexicon, a pre-modifier lexicon, a modal verb lexicon, a reporting verb lexicon, general opinion patterns from the corpus, etc.) developed as a result of our corpus analysis, which can be used to identify an opinion or a controversial issue, the arguments supporting an opinion, the orientation of the supporting arguments and their strength (intrinsic, relative and in terms of persuasion). These resources form the backbone of our work, especially for performing opinion analysis at the lower levels, i.e., the lexical and sentence levels. Finally, we shed light on the perspectives of this work, clearly outlining the challenges.
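The abstract interprets a kappa of 0.80 as substantial inter-annotator agreement. As a reminder of how that statistic is derived, here is a minimal sketch of Cohen's kappa over two annotators' labels; the function name and sample labels are illustrative, not taken from the paper.

```python
from collections import Counter

def cohen_kappa(ann_a, ann_b):
    """Cohen's kappa: chance-corrected agreement between two annotators
    who labeled the same items."""
    assert len(ann_a) == len(ann_b) and ann_a
    n = len(ann_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(x == y for x, y in zip(ann_a, ann_b)) / n
    # Expected agreement: chance of matching given each annotator's
    # label distribution.
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n)
                   for l in set(ann_a) | set(ann_b))
    return (observed - expected) / (1 - expected)

# Toy example: two annotators tagging segments as fact vs. opinion.
k = cohen_kappa(["fact", "fact", "fact", "opinion"],
                ["fact", "fact", "opinion", "opinion"])  # → 0.5
```

Values above roughly 0.6 are conventionally read as substantial agreement, which is the scale the abstract appeals to.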

2009

2008

This paper presents ongoing work dedicated to parsing the textual structure of procedural texts. We propose here a model of the instructional structure and criteria to identify its main components: titles, instructions, warnings and prerequisites. The main aim of this project, besides a contribution to text processing, is to be able to answer procedural questions (how-to questions), where the answer is a well-formed portion of a text, not a small set of words as for factoid questions.

2006

In this paper, we present the results of a preliminary investigation that aims at constructing a repository of the syntactic and semantic behaviors of prepositions. A preliminary frame-based format for representing their prototypical behavior is proposed, together with related inferential patterns that describe functional or paradigmatic relations between preposition senses.
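To give an intuition of what a frame-based entry for a preposition sense might look like, here is a hypothetical sketch; every field name and value below is an illustrative assumption, not the paper's actual format.

```python
# Hypothetical frame for one sense of "on" (spatial contact).
# Field names are invented for illustration; the paper's repository
# defines its own frame structure and inferential patterns.
on_spatial = {
    "preposition": "on",
    "sense": "spatial-contact",
    "syntax": {"complement": "NP"},
    "semantics": {
        "relation": "localization",
        "constraint": "landmark is a surface",
    },
    # Paradigmatic link of the kind the inferential patterns describe.
    "paradigmatic": {"contrasts_with": ("off", "spatial-separation")},
}
```

Grouping syntax, semantics and sense-to-sense relations in one frame is what lets inferential patterns operate over the repository, e.g. relating a preposition sense to its paradigmatic contrast.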

2005

2004

2003

2002

1998

1996

1994

1993

1991

1989

1988