Jingchao Ni


2026

Time series data are integral to applications across domains such as finance, healthcare, transportation, and environmental science. While recent work has begun to explore time series question answering (QA), existing benchmarks still provide limited coverage of analytical capabilities under a standardized evaluation framework. We introduce TSAQA, a novel unified benchmark designed to broaden task coverage and evaluate diverse temporal analysis capabilities. TSAQA integrates 6 diverse tasks under a single framework, ranging from conventional analysis, including anomaly detection and classification, to advanced analysis, such as characterization, comparison, data transformation, and temporal relationship analysis. Spanning 210k samples across 13 domains, the dataset employs diverse formats, including true-or-false (TF), multiple-choice (MC), and a novel puzzling (PZ) format, to comprehensively assess time series analysis. Zero-shot evaluation shows that TSAQA remains challenging for current Large Language Models (LLMs): the best-performing commercial model, Gemini-2.5-Flash, achieves an average accuracy of 65.08. Although instruction tuning improves the performance of open-source models, even the best-performing one, LLaMA-3.1-8B, still shows significant room for improvement. We further evaluate language-capable time series foundation models (TSFMs), showing that TSAQA extends beyond general-purpose LLMs. The data are available at https://huggingface.co/datasets/TSAQA/TSAQA-Benchmark.

2025

Causal discovery is an essential foundation for decision-making across domains such as smart health, AI for drug discovery, and AIOps. Traditional statistical causal discovery methods, while well-established, predominantly rely on observational data and often overlook the semantic cues inherent in cause-and-effect relationships. The advent of Large Language Models (LLMs) has ushered in an affordable way of leveraging these semantic cues for knowledge-driven causal discovery, but the development of LLMs for causal discovery lags behind other areas, particularly in the exploration of multi-modal data. To bridge the gap, we introduce MatMCD, a multi-agent system powered by tool-augmented LLMs. MatMCD has two key agents: a Data Augmentation agent that retrieves and processes modality-augmented data, and a Causal Constraint agent that integrates multi-modal data for knowledge-driven reasoning. The proposed design of the inner workings ensures successful cooperation of the agents. Our empirical study across seven datasets suggests the significant potential of multi-modality-enhanced causal discovery.

2021

Measuring document similarity plays an important role in natural language processing tasks. Most existing document similarity approaches suffer from the information gap caused by context and vocabulary mismatches when comparing varying-length texts. In this paper, we propose an unsupervised concept representation learning approach to address these issues. Specifically, we propose a novel Concept Generation Network (CGNet) to learn concept representations from the perspective of the entire text corpus. Moreover, a concept-based document matching method is proposed to leverage advances in the recognition of local phrase features and corpus-level concept features. Extensive experiments on real-world data sets demonstrate that the new method can achieve a considerable improvement in comparing length-varying texts. In particular, our model achieved a 6.5% higher F1 score than the best baseline model on a concept-project benchmark dataset.