Lawrence Cavedon
2025
Evaluating Numeracy of Language Models as a Natural Language Inference Task
Rahmad Mahendra | Damiano Spina | Lawrence Cavedon | Karin Verspoor
Findings of the Association for Computational Linguistics: NAACL 2025
While recent advancements in large language models (LLMs) have enhanced their capabilities to solve mathematical problems, other aspects of numeracy remain underexplored. In this paper, we propose a benchmark to evaluate the ability of language models to perform basic numeracy tasks. We frame numeracy as a Natural Language Inference (NLI) task to assess the models’ ability to understand both numbers and language contexts. We evaluate 49 language models (LMs), including fine-tuned LMs on NLI datasets, instruction-tuned LLMs, and specialized math-LLMs. Our findings reveal three main insights: (1) LLMs only clearly outperform smaller LMs in arithmetic tasks, indicating that mathematical reasoning cannot be generalized to other numeracy skills such as number comparison and normalization; (2) while most language models achieve fair to good accuracy for NLI entailment cases, they still struggle to predict contradiction and neutral cases; and (3) the robustness of language models’ numeracy capabilities needs improvement, particularly in understanding the semantics and pragmatics of numbers in linguistic contexts.
2024
Do Numbers Matter? Types and Prevalence of Numbers in Clinical Texts
Rahmad Mahendra | Damiano Spina | Lawrence Cavedon | Karin Verspoor
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
In this short position paper, we highlight the importance of numbers in clinical text. We first present a taxonomy of number variants. We then perform corpus analysis to analyze characteristics of number use in several clinical corpora. Based on our findings of extensive use of numbers, and limited understanding of the impact of numbers on clinical NLP tasks, we identify the need for a public benchmark that will support investigation of numerical processing tasks for the clinical domain.
2018
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
Kazunori Komatani | Diane Litman | Kai Yu | Alex Papangelis | Lawrence Cavedon | Mikio Nakano
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
2016
A Corpus of Tables in Full-Text Biomedical Research Publications
Tatyana Shmanina | Ingrid Zukerman | Ai Lee Cheam | Thomas Bochynek | Lawrence Cavedon
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
The development of text mining techniques for biomedical research literature has received increased attention in recent times. However, most of these techniques focus on prose, while much important biomedical data reside in tables. In this paper, we present a corpus created to serve as a gold standard for the development and evaluation of techniques for the automatic extraction of information from biomedical tables. We describe the guidelines used for corpus annotation and the manner in which they were developed. The high inter-annotator agreement achieved on the corpus, and the generic nature of our annotation approach, suggest that the developed guidelines can serve as a general framework for table annotation in biomedical and other scientific domains. The annotated corpus and the guidelines are available at http://www.csse.monash.edu.au/research/umnl/data/index.shtml.
2014
Challenges in Information Extraction from Tables in Biomedical Research Publications: a Dataset Analysis
Tatyana Shmanina | Lawrence Cavedon | Ingrid Zukerman
Proceedings of the Australasian Language Technology Association Workshop 2014
2013
Impact of Corpus Diversity and Complexity on NER Performance
Tatyana Shmanina | Ingrid Zukerman | Antonio Jimeno Yepes | Lawrence Cavedon | Karin Verspoor
Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013)
2012
Strategies for Mixed-Initiative Conversation Management using Question-Answer Pairs
Wilson Wong | Lawrence Cavedon | John Thangarajah | Lin Padgham
Proceedings of COLING 2012
Classifying Dialogue Acts in Multi-party Live Chats
Su Nam Kim | Lawrence Cavedon | Timothy Baldwin
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation
2011
Classifying Domain-Specific Terms Using a Dictionary
Su Nam Kim | Lawrence Cavedon
Proceedings of the Australasian Language Technology Association Workshop 2011
2010
Classifying Dialogue Acts in One-on-One Live Chats
Su Nam Kim | Lawrence Cavedon | Timothy Baldwin
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Generating Shifting Sentiment for a Conversational Agent
Simon Whitehead | Lawrence Cavedon
Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text
“Hello Emily, How Are You Today?” - Personalised Dialogue in a Toy to Engage Children.
Carole Adam | Lawrence Cavedon | Lin Padgham
Proceedings of the 2010 Workshop on Companionable Dialogue Systems
2009
Extraction of Named Entities from Tables in Gene Mutation Literature
Wern Wong | David Martinez | Lawrence Cavedon
Proceedings of the BioNLP 2009 Workshop
2007
Exploring Abbreviation Expansion for Genomic Information Retrieval
Nicola Stokes | Yi Li | Lawrence Cavedon | Justin Zobel
Proceedings of the Australasian Language Technology Workshop 2007
2006
Proceedings of the Australasian Language Technology Workshop 2006
Lawrence Cavedon | Ingrid Zukerman
Proceedings of the Australasian Language Technology Workshop 2006
2005
A Flexible Conversational Dialog System for MP3 Player
Fuliang Weng | Lawrence Cavedon | Badri Raghunathan | Danilo Mirkovic | Ben Bei | Heather Pon-Barry | Harry Bratt | Hua Cheng | Hauke Schmidt | Rohit Mishra | Brian Lathrop | Qi Zhang | Tobias Scheideck | Kui Xu | Tess Hand-Bender | Stanley Peters | Liz Shriberg | Carsten Bergmann
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations
Combining Confidence Scores with Contextual Features for Robust Multi-Device Dialogue
Lawrence Cavedon | Matthew Purver | Florin Ratiu
Proceedings of the Australasian Language Technology Workshop 2005
2004
Multi-Human Dialogue Understanding for Assisting Artifact-Producing Meetings
John Niekrasz | Alexander Gruenstein | Lawrence Cavedon
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics
2003
Co-authors
- Ingrid Zukerman 4
- Su Nam Kim 3
- Tatyana Shmanina 3
- Karin Verspoor 3
- Timothy Baldwin 2
- Oliver Lemon 2
- Rahmad Mahendra 2
- Lin Padgham 2
- Damiano Spina 2
- Carole Adam 1
- Ben Bei 1
- Carsten Bergmann 1
- Thomas Bochynek 1
- Harry Bratt 1
- Ai Lee Cheam 1
- Hua Cheng 1
- Alexander Gruenstein 1
- Tess Hand-Bender 1
- Antonio Jimeno Yepes 1
- Barbara Kelly 1
- Kazunori Komatani 1
- Brian Lathrop 1
- Yi Li 1
- Diane Litman 1
- David Martinez 1
- Danilo Mirkovic 1
- Rohit Mishra 1
- Mikio Nakano 1
- John Niekrasz 1
- Alexandros Papangelis 1
- Stanley Peters 1
- Heather Pon-Barry 1
- Matthew Purver 1
- Badri Raghunathan 1
- Florin Ratiu 1
- Tobias Scheideck 1
- Hauke Schmidt 1
- Liz Shriberg 1
- Nicola Stokes 1
- John Thangarajah 1
- Fuliang Weng 1
- Simon Whitehead 1
- Wilson Wong 1
- Wern Wong 1
- Kui Xu 1
- Kai Yu 1
- Qi Zhang 1
- Justin Zobel 1