Weihong Deng

2025

pdf bib abs
What Counts Underlying LLMs’ Moral Dilemma Judgments?
Wenya Wu | Weihong Deng
Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI)

Moral judgments in LLMs increasingly capture the attention of researchers in AI ethics domain. This study explores moral judgments of three open-source large language models (LLMs)—Qwen-1.5-14B, Llama3-8B, and DeepSeek-R1 in plausible moral dilemmas, examining their sensitivity to social exposure and collaborative decision-making. Using a dual-process framework grounded in deontology and utilitarianism, we evaluate LLMs’ responses to moral dilemmas under varying social contexts. Results reveal that all models are significantly influenced by moral norms rather than consequences, with DeepSeek-R1 exhibiting a stronger action tendency compared to Qwen-1.5-14B and Llama3-8B, which show higher inaction preferences. Social exposure and collaboration impact LLMs differently: Qwen-1.5-14B becomes less aligned with moral norms under observation, while DeepSeek-R1’s action tendency is moderated by social collaboration. These findings highlight the nuanced moral reasoning capabilities of LLMs and their varying sensitivity to social cues, providing insights into the ethical alignment of AI systems in socially embedded contexts.

2022

This survey aims to sort out the recent advances in video question answering (VideoQA) and point towards future directions. We firstly categorize the datasets into 1) normal VideoQA, multi-modal VideoQA and knowledge-based VideoQA, according to the modalities invoked in the question-answer pairs, or 2) factoid VideoQA and inference VideoQA, according to the technical challenges in comprehending the questions and deriving the correct answers. We then summarize the VideoQA techniques, including those mainly designed for Factoid QA (e.g., the early spatio-temporal attention-based methods and the recently Transformer-based ones) and those targeted at explicit relation and logic inference (e.g., neural modular networks, neural symbolic methods, and graph-structured methods). Aside from the backbone techniques, we delve into the specific models and find out some common and useful insights either for video modeling, question answering, or for cross-modal correspondence learning. Finally, we point out the research trend of studying beyond factoid VideoQA to inference VideoQA, as well as towards the robustness and interpretability. Additionally, we maintain a repository, https://github.com/VRU-NExT/VideoQA, to keep trace of the latest VideoQA papers, datasets, and their open-source implementations if available. With these efforts, we strongly hope this survey could shed light on the follow-up VideoQA research.

Co-authors

Yaoyao Zhong 1

Venues

Fix author