2018
Embedding Register-Aware MT into the CAT Workflow
Corey Miller | Danielle Silverman | Vanesa Jurica | Elizabeth Richerson | Rodney Morris | Elisabeth Mallard
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)
2014
Panel: Inserting CAT tools into a government LSP environment
Tanya Helmen | Vanesa Jurica | Danielle Silverman | Elizabeth Richerson
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Users Track
2012
International Multicultural Name Matching Competition: Design, Execution, Results, and Lessons Learned
Keith J. Miller | Elizabeth Schroeder Richerson | Sarah McLeod | James Finley | Aaron Schein
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper describes different aspects of an open competition to evaluate multicultural name matching software, including the contest design, development of the test data, different phases of the competition, behavior of the participating teams, results of the competition, and lessons learned throughout. The competition, known as The MITRE Challenge™, was informally announced at LREC 2010 and was recently concluded. Contest participants used the competition website (http://mitrechallenge.mitre.org) to download the competition data set and guidelines, upload results, and view accuracy metrics for each result set submitted. Participants were allowed to submit unlimited result sets, with their top-scoring set determining their overall ranking. The competition website featured a leader board that displayed the top score for each participant, ranked according to the principal contest metric, mean average precision (MAP). MAP and other metrics were calculated in near-real time on a remote server, based on ground truth developed for the competition data set. Additional measures were taken to guard against gaming the competition metric or overfitting to the competition data set. Lessons learned while running this first MITRE Challenge will be valuable to others considering similar evaluation campaigns.
2010
Improving Personal Name Search in the TIGR System
Keith J. Miller | Sarah McLeod | Elizabeth Schroeder | Mark Arehart | Kenneth Samuel | James Finley | Vanesa Jurica | John Polk
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper describes the development and evaluation of enhancements to the specialized information retrieval capabilities of a multimodal reporting system. The system enables collection and dissemination of information through a distributed data architecture by allowing users to input free-text documents, which are indexed for subsequent search and retrieval by other users. This unstructured data entry method is essential for users of this system, but it requires an intelligent support system for processing queries against the data. The system, known as TIGR (Tactical Ground Reporting), allows keyword searching and geospatial filtering of results, but lacked the ability to efficiently index and search person names and perform approximate name matching. To improve TIGR's ability to provide accurate, comprehensive results for queries on person names, we iteratively updated existing entity extraction and name matching technologies to better align with the TIGR use case. We evaluated each version of the entity extraction and name matching components to find the optimal configuration for the TIGR context, and combined those pieces into a named entity extraction, indexing, and search module that integrates with the current TIGR system. By comparing system-level evaluations of the original and updated TIGR search processes, we show that our enhancements to personal name search significantly improved the overall information retrieval performance of the TIGR system.
2008
An Infrastructure, Tools and Methodology for Evaluation of Multicultural Name Matching Systems
Keith J. Miller | Mark Arehart | Catherine Ball | John Polk | Alan Rubenstein | Kenneth Samuel | Elizabeth Schroeder | Eva Vecchi | Chris Wolf
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This paper describes a Name Matching Evaluation Laboratory that is a joint effort across multiple projects. The lab houses our evaluation infrastructure as well as multiple name matching engines and customized analytical tools. Included is an explanation of the methodology used by the lab to carry out evaluations. This methodology is based on standard information retrieval evaluation, which requires a carefully constructed test data set. The paper describes how we created that test data set, including the ground truth used to score the systems' performance. Descriptions and snapshots of the lab's various tools are provided, as well as information on how the different tools are used throughout the evaluation process. By using this evaluation process, the lab has been able to identify strengths and weaknesses of different name matching engines. These findings have led the lab to an ongoing investigation into various techniques for combining results from multiple name matching engines to achieve optimal results, as well as into research on the more general problem of identity management and resolution.